University of Leicester

ELSNC: A semi-supervised community detection method with integration of embedding-enhanced links and node content in attributed networks

journal contribution
posted on 2025-01-23, 11:12 authored by Jinxin Cao, Xiaoyang Zou, Weizhong Xu, Weiping Ding, Hengrong Ju, Lu Liu, Fuxiang Chen, Di Jin

In complex network analysis, community detection is becoming increasingly important. However, in real-world applications it is difficult to fuse multiple types of information to improve community-detection performance. Besides its nodes and edges, a network also carries community structure, topological structure, and network embeddings. Existing community-detection methods make only limited use of these information types in combination. In this work, we design a novel unified model called embedding-enhanced link-based semi-supervised community detection with node content (ELSNC). ELSNC integrates the topological structure, the prior information, the network embeddings, and the node content. First, we employ two non-negative matrix factorization (NMF)-based stochastic models to characterize the node-community membership and the content-community membership (by performing similarity detection between a topic model and the NMF). Second, we introduce the topological similarity of the nodes and of the network embeddings into the model as topological information. To model the topological similarity, we introduce a strong constraint (i.e., the prior information) and apply matrix completion to identify the community membership with the representation ability of the network embeddings. Finally, we present a semi-supervised community-detection method based on NMF that combines the network topology, content information, and the network embeddings. The innovation of our work can be summarized in two points: 1) as a semi-supervised community-detection method, it extends the theory of semi-supervised methods on attributed networks and proposes a unified model that integrates multiple information types; 2) the community membership obtained by the unified model simultaneously contains topological, content, prior, and embedding information, which allows the community structure to be explored more robustly in real-world scenarios. Furthermore, we perform a comprehensive evaluation of the proposed approach against state-of-the-art methods on both synthetic and real-world networks. The results show that the proposed method significantly outperforms the baseline methods.
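
To make the general idea concrete, here is a minimal sketch of NMF-based community detection that shares one membership matrix between the network topology and the node content. It is not the ELSNC model: it omits the network embeddings and the matrix-completion step, and it injects prior information in a deliberately simple way, by adding must-link pairs to the adjacency matrix as extra edges. The function name `joint_nmf_communities`, its parameters (`must_link`, `alpha`, `n_iter`, `seed`), and the toy data are all assumptions made for illustration.

```python
import numpy as np


def joint_nmf_communities(A, X, k, must_link=(), alpha=1.0, n_iter=300, seed=0):
    """Illustrative joint NMF: A ~ U B^T (topology) and X ~ U C^T (content).

    U is the shared non-negative node-community membership matrix.
    Hypothetical helper for illustration only, not the ELSNC algorithm.
    """
    A = np.asarray(A, dtype=float).copy()
    X = np.asarray(X, dtype=float)
    # Assumption: prior must-link pairs are treated as additional edges.
    for i, j in must_link:
        A[i, j] = A[j, i] = 1.0
    # Self-loops make each dense block closer to rank one.
    np.fill_diagonal(A, 1.0)

    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k)) + 0.1
    B = rng.random((n, k)) + 0.1
    C = rng.random((m, k)) + 0.1
    eps = 1e-9
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates for
        # ||A - U B^T||_F^2 + alpha * ||X - U C^T||_F^2.
        U *= (A @ B + alpha * (X @ C)) / (U @ (B.T @ B + alpha * (C.T @ C)) + eps)
        G = U.T @ U
        B *= (A.T @ U) / (B @ G + eps)
        C *= (X.T @ U) / (C @ G + eps)
        loss = (np.linalg.norm(A - U @ B.T) ** 2
                + alpha * np.linalg.norm(X - U @ C.T) ** 2)
        losses.append(loss)

    # Rescale columns before the argmax so factor scale does not bias labels.
    U_scaled = U / (U.max(axis=0, keepdims=True) + eps)
    return U_scaled.argmax(axis=1), U, losses


# Toy attributed network: two groups of four nodes joined by one bridge edge.
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[0, 3] = A[3, 0] = 0.0  # an edge missing from the observed topology
A[3, 4] = A[4, 3] = 1.0  # bridge between the two groups

# Node content: each group uses its own three attributes.
X = np.zeros((8, 6))
X[:4, :3] = 1.0
X[4:, 3:] = 1.0

labels, U, losses = joint_nmf_communities(A, X, k=2, must_link=[(0, 3)])
print(labels)
```

The weight `alpha` trades off how much the content term influences the shared membership relative to the topology term; a full model in the spirit of the abstract would add further terms for the embeddings and a stronger treatment of the prior constraints.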

Funding

The work was supported by the National Natural Science Foundation of China (No. 61976120), the Natural Science Foundation of Jiangsu Province (BK20231337), the Natural Science Key Foundation of Jiangsu Education Department (21KJA510004), and the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (21KJB520018).

History

Author affiliation

College of Science & Engineering, Computing & Mathematical Sciences

Version

  • AM (Accepted Manuscript)

Published in

Applied Soft Computing

Volume

167

Issue

Part A

Pagination

112250

Publisher

Elsevier BV

ISSN

1568-4946

Copyright date

2024

Available date

2025-09-24

Language

en

Deposited by

Dr Fuxiang Chen

Deposit date

2024-11-22
