计算机科学 ›› 2022, Vol. 49 ›› Issue (3): 121-128.doi: 10.11896/jsjkx.210200009

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于节点相似性和网络嵌入的复杂网络社区发现算法

杨旭华, 王磊, 叶蕾, 张端, 周艳波, 龙海霞   

  1. 浙江工业大学计算机科学与技术学院 杭州310023
  • 收稿日期:2021-02-01 修回日期:2021-05-11 出版日期:2022-03-15 发布日期:2022-03-15
  • 通讯作者: 龙海霞(longhaixia@zjut.edu.cn)
  • 作者简介:(xhyang@zjut.edu.cn)
  • 基金资助:
    国家自然科学基金(61773348)

Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding

YANG Xu-hua, WANG Lei, YE Lei, ZHANG Duan, ZHOU Yan-bo, LONG Hai-xia   

  1. College of Computer Science and Technology,Zhejiang University of Technology,Hangzhou 310023,China
  • Received:2021-02-01 Revised:2021-05-11 Online:2022-03-15 Published:2022-03-15
  • About author:YANG Xu-hua,born in 1971,Ph.D,professor,is a senior member of China Computer Federation.His main research interests include machine lear-ning and network science.
    LONG Hai-xia,born in 1987,Ph.D,lecturer.Her main research interests include machine learning,medical image processing and complex network analysis.
  • Supported by:
    National Natural Science Foundation of China(61773348).

摘要: 社区发现算法对分析复杂网络的拓扑和层次结构、预测复杂网络的演化趋势等具有十分重要的意义。传统的社区发现算法划分精度不高,忽略了网络嵌入的重要性。针对这样的问题,提出了基于节点相似性和网络嵌入Node2Vec方法的无参数社区发现算法。首先,使用网络嵌入Node2Vec方法将网络节点映射成欧氏空间中低维向量表示的数据点,计算低维向量表示的数据点之间的余弦相似性,根据相应节点间的最大相似性构建偏好网络,得到初始社区划分,把每个初始社区的最大度节点作为备选节点;然后根据网络平均度和平均最短路径找出备选节点中的中心节点;最后将中心节点对应的数据点及其数量作为初始质心和聚类数,用K-Means算法对低维向量表示的数据点进行聚类,从而对相应的网络节点完成社区划分。该算法为无参数社区划分方法,可以自主地从网络中提取参数,无须根据网络的不同设定不同的超参数,从而可以自动地快速识别复杂网络的社区结构。在8个真实网络和人工网络上,将其与其他5个知名社区发现算法相比较,数值仿真实验表明所提算法具有很好的社区发现效果。

关键词: K-Means聚类, 节点相似性, 偏好网络, 网络嵌入, 无参数社区发现

Abstract: The community detection algorithm is very important for analyzing the topology and hierarchical structure of complex networks and predicting the evolution trend of complex networks.Traditional community detection algorithm does not have high accuracy and ignores the importance of network embedding.Aiming at such problems,a parameter-free community detection algorithm based on node similarity and network embedding Node2Vec method is proposed.First,we use the network embedding Node2Vec method to map network nodes into data points represented by low-dimensional vectors in Euclidean space,calculate the cosine similarity between the data points represented by the low-dimensional vector,construct a preference network according to the maximum similarity between the corresponding nodes,obtain the initial community detection,and use the maximum degree node of each initial community as a candidate node.Then we find the central node among the candidate nodes according to the average degree of the network and the average shortest path.Finally,the data points and their numbers corresponding to the central node are used as the initial centroid and cluster number,and the data represented by the low-dimensional vector are calculated by K-Means algorithm.The points are clustered,and the corresponding network nodes are divided into communities.This algorithm is a method of community division without parameters,which can independently extract parameters from the network without setting different hyper-parameters according to different networks,so that it can automatically and quickly identify the community structure of complex networks.In 8 real networks and artificial networks above,by comparing with other 5 well-known community discovery algorithms,numerical simulation experiments show that the proposed algorithm has good community discovery effect.

Key words: K-Means clustering, Network embedding, Node similarity, Parameter-free community detection, Preference network

中图分类号: 

  • TP391
[1]SAGANOWSKI S.Predicting Community Evolution in SocialNetworks[C]//the 2015 IEEE/ACM International Confe-rence.ACM,2015:3053-3096.
[2]ACMAN M,DORP L V,SANTINI J M,et al.Large-scale net-work analysis captures biological features of bacterial plasmids[J].Nature Communications,2020,11(1):46.
[3]CHOUCHANI N,ABED M.Online social network analysis:de-tection of communities of interest[J].Journal of Intelligent Information Systems,2020,54(2):1-17.
[4]NEWMAN M,GIRVAN M.Finding and evaluating community structure in networks[J].Physica A,2004,69(2):26113.
[5]KRAMER J,BOONE L,CLIFFORD T,et al.Analysis of Medical Data Using Community Detection on Inferred Networks[J].IEEE Journal of Biomedical and Health Informatics,2020,24(11):3136-3143.
[6]KOUNI I B E,KAROUI W,ROMDHANEL B.Node Impor-tance based Label Propagation Algorithm for overlapping community detection in networks[J].Expert Systems with Applications,2019,162:113020.
[7]ZIVICH P N,SMITH N R,FRERICHSL M,et al.A Guide for Choosing Community Detection Algorithms in Social Network Studies:The Question Alignment Approach[J].American Journal of Preventive Medicine,2020,59(4):597-605.
[8]ROSVALL M, BERGSTROM C T.Maps of random walks on complex networks reveal community structure[J].Proceedings of the National Academy of Sciences of the United States of America,2008,105(4):1118-1127.
[9]ZHOU T,LV L Y,ZHANG Y C.Predicting missing links via local information[J].European Physical Journal B,2009,71(4):623-630.
[10]TASGIN M,BINGOL H O.Community detection using prefe-rence networks[J].Physica A:Statistical Mechanics and its Applications,2018,495:126-136.
[11]LIU Z,MA Y.A Divide and Agglomerate Algorithm for Community Detection in Social Networks[J].Information Sciences,2019,482:321-333.
[12]ZHAO X,ZHANG Z H,ZHANG C W,et al.RGNE:A coarse-grained network embedded overlapping community discovery method[J].Computer Research and Development,2020,57(6):1302-1311.
[13]YE Z L,ZHAO H X,ZHANG K,et al.Network representation learning based on neighbor node and relationship model optimization[J].Computer Research and Development,2019,56(12):2562-2577.
[14]ZHAO X,LI X,ZHANG Z H,et al.Community discovery algorithm combining community embedding and node embedding[J].Computer Science,2020,47(10):121-125.
[15]XIE Y,GONG M,WANG S,et al.Community Discovery in Networks with Deep Sparse Filtering[J].Pattern Recognition,2018,81:50-59.
[16]HU F,LIU J,LI L,et al.Community detection in complex networks using Node2vec with spectral clustering[J].Physica A:Statistical Mechanics and its Applications,2020,45(2):13247.
[17]DING Y,WEI H,PAN Z S,et al.Overview of network representation learning algorithms[J].Computer Science,2020,47(9):52-59.
[18]PEROZZI B,AL-RFOU R,SKIENA S.Deepwalk:Online lear-ning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.2014:701-710.
[19]WANG D,CUI P,ZHU W.Structural deep network embedding[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2016:1225-1234.
[20]KIPF T N,WELLING M.Semi-supervised classification withgraph convolutional networks[C]//5th International Conference on Learning Representations.ICLR,2017.
[21]YE J.Improved cosine similarity measures of simplified neutrosophic sets for medical diagnoses[J].Artificial Intelligence in Medicine,2015,63(3):171-179.
[22]ADAMIC L A,ADAR E.Friends and neighbors on the Web[J].Social Networks,2003,25(3):211-230.
[23]RAVASZ E,SOMERA A L,MONGRU D A,et al.Hierarchical Organization of Modularity in Metabolic Networks[J].Science,2002,297(5586):1551-1555.
[24]LEICHT E A,HOLME P,NEWMAN M E J.Vertex similarity in networks[J].Physical Review E Statistical Nonlinear & Soft Matter Physics,2006,73(2):026120.
[25]GODICHON-BAGGIONI A,MAUGIS-RABUSSEAU C,RAU A.Clustering transformed compositional data using K-means,with applications in gene expression and bicycle sharing system data[J].Journal of Applied Statistics,2017,46(1):47-65.
[26]NEWMAN M E J.Finding community structure in networksusing the eigenvectors of matrices[J].Physical Review E,2006,74(3):036104.
[27]CHATTOPADHYAY S,BASU T,DAS A K,et al.A similarity based generalized modularity measure towards effective community discovery in complex networks[J].Physica A:Statistical Mechanics and its Applications,2019,527:121338.
[28]HESAMIPOUR S,BALAFAR M A.A new method for detecting communities and their centers using the Adamic/Adar Index and game theory[J].Physica A:Statistical Mechanics and its Applications,2019,535:122354.
[29]AHAJJAM S,EL HADDAD M,BADIR H.A new scalableleader-community detection approach for community detection in social networks[J].Social Networks,2018,54:41-49.
[30]MA T,LIU Q,CAO J,et al.LGIEM:Global and local node influence based community detection[J].Future Generation Computer Systems,2020,105:533-546.
[31]LIU S S,XIA Z Y.A two-stage BFS local community detection algorithm based on node transfer similarity and Local Clustering Coefficient-Science Direct[J].Physica A:Statistical Mechanics and its Applications,2019(537):122717.
[32]LUO W J,LU N.Local community detection by the nearestnodes with greater centrality[J].Information Sciences,2020,517:377-392.
[33]YOU X,MA Y,LIU Z.A three-stage algorithm on community detection in social networks[J].Knowledge-Based Systems,2020,187(Jan.):104822.1-104822.12.
[34]CHEN D,SU H.Framework based on communicability to mea-sure the similarity of nodes in complex networks[J].Information Sciences,2020,524:241-253.
[35]GUIMERÀ R,DANON L,DÍAZ-GUILERA A,et al.Self-similar community structure in a network of human interactions[J].Physical Review E,2004,68(6 Pt 2):065103.
[36]XU R B,CHE Y,WANG X M,et al.Stacked autoencoder-based community detection method via an ensemble clustering framework[J].Information Sciences,2020,526:151-165.
[1] 陈世聪, 袁得嵛, 黄淑华, 杨明.
基于结构深度网络嵌入模型的节点标签分类算法
Node Label Classification Algorithm Based on Structural Depth Network Embedding Model
计算机科学, 2022, 49(3): 105-112. https://doi.org/10.11896/jsjkx.201000177
[2] 郭磊, 马廷淮.
基于好友亲密度的用户匹配
Friend Closeness Based User Matching
计算机科学, 2022, 49(3): 113-120. https://doi.org/10.11896/jsjkx.210200137
[3] 郑苏苏, 关东海, 袁伟伟.
融合不完整多视图的异质信息网络嵌入方法
Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion
计算机科学, 2021, 48(9): 68-76. https://doi.org/10.11896/jsjkx.210500203
[4] 胡昕彤, 沙朝锋, 刘艳君.
基于随机投影和主成分分析的网络嵌入后处理算法
Post-processing Network Embedding Algorithm with Random Projection and Principal Component Analysis
计算机科学, 2021, 48(5): 124-129. https://doi.org/10.11896/jsjkx.200500058
[5] 杨旭华, 王晨.
基于网络嵌入与局部合力的复杂网络社区划分算法
Community Detection Algorithm in Complex Network Based on Network Embedding and Local Resultant Force
计算机科学, 2021, 48(4): 229-236. https://doi.org/10.11896/jsjkx.200200102
[6] 张健雄, 宋坤, 何鹏, 李兵.
基于图神经网络的软件系统中关键类的识别
Identification of Key Classes in Software Systems Based on Graph Neural Networks
计算机科学, 2021, 48(12): 149-158. https://doi.org/10.11896/jsjkx.210100200
[7] 徐新黎, 肖云月, 龙海霞, 杨旭华, 毛剑飞.
基于矩阵分解的属性网络嵌入和社区发现算法
Attributed Network Embedding Based on Matrix Factorization and Community Detection
计算机科学, 2021, 48(12): 204-211. https://doi.org/10.11896/jsjkx.210300060
[8] 金雨芳, 吴祥, 董辉, 俞立, 张文安.
基于改进YOLO v4的安全帽佩戴检测算法
Improved YOLO v4 Algorithm for Safety Helmet Wearing Detection
计算机科学, 2021, 48(11): 268-275. https://doi.org/10.11896/jsjkx.200900098
[9] 丁钰, 魏浩, 潘志松, 刘鑫.
网络表示学习算法综述
Survey of Network Representation Learning
计算机科学, 2020, 47(9): 52-59. https://doi.org/10.11896/jsjkx.190300004
[10] 吴勇, 王斌君, 翟一鸣, 仝鑫.
共引增强有向网络嵌入研究
Study on Co-citation Enhancing Directed Network Embedding
计算机科学, 2020, 47(12): 279-284. https://doi.org/10.11896/jsjkx.191000199
[11] 赵霞, 李娴, 张泽华, 张晨威.
结合社区嵌入和节点嵌入的社区发现算法
Community Detection Algorithm Combing Community Embedding and Node Embedding
计算机科学, 2020, 47(10): 121-125. https://doi.org/10.11896/jsjkx.191000099
[12] 冶忠林, 赵海兴, 张科, 朱宇.
基于多视图集成的网络表示学习算法
Network Representation Learning Based on Multi-view Ensemble Algorithm
计算机科学, 2019, 46(1): 117-125. https://doi.org/10.11896/j.issn.1002-137X.2019.01.018
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!