计算机科学 ›› 2022, Vol. 49 ›› Issue (10): 103-110.doi: 10.11896/jsjkx.220400145

• 数据库&大数据&数据科学 • 上一篇    下一篇

一种基于局部随机游走的标签传播算法

刘扬1, 郑文萍1,2, 张川1, 王文剑1,2   

  1. 1 山西大学计算机与信息技术学院 太原 030006
    2 山西大学计算智能与中文信息处理教育部重点实验室 太原 030006
  • 收稿日期:2022-04-14 修回日期:2022-06-24 出版日期:2022-10-15 发布日期:2022-10-13
  • 通讯作者: 王文剑(wjwang@sxu.edu.cn)
  • 作者简介:(15835207489@163.com)
  • 基金资助:
    国家自然科学基金(62072292,62076154,U21A20513,U1805263);山西省1331工程项目

Local Random Walk Based Label Propagation Algorithm

LIU Yang1, ZHENG Wen-ping1,2, ZHANG Chuan1, WANG Wen-jian1,2   

  1. 1 School of Computer & Information Technology,Shanxi University,Taiyuan 030006,China
    2 Key Laboratory of Computational Intelligence & Chinese Information Processing of Ministry of Education,Shanxi University,Taiyuan 030006,China
  • Received:2022-04-14 Revised:2022-06-24 Online:2022-10-15 Published:2022-10-13
  • About author:LIU Yang,born in 1998,postgraduate,is a student member of China Computer Federation.Her main research interests include machine learning and network data analysis.
    WANG Wen-jian,born in 1968,Ph.D,professor,Ph.D supervisor,is a distinguished member of China Computer Federation.Her main research interests include machine learning,data mining and computational intelligence.
  • Supported by:
    National Natural Science Foundation of China(62072292,62076154,U21A20513,U1805263) and 1331 Enginee-ring Project of Shanxi Province,China.

摘要: 社区结构是复杂网络的重要特征之一,识别网络中不同功能的社区对理解复杂网络特性具有重要作用。基于标签传播的社区发现算法通常以节点的直接邻居作为邻域更新标签,可能无法准确发现社区结构或导致得到的社区划分结果不稳定。针对此问题,提出了一种基于局部随机游走的标签传播算法(Local Random Walk Based Label Propagation Algorithm,LRW-LPA),利用节点的k步邻域内局部重要性指标选择重要性最低的节点作为起始节点,进行带重启的局部随机游走以确定起始节点的局部邻域;选择此局部邻域范围内出现次数最多且影响值最大的标签来更新起始节点标签。LRW-LPA采用带重启的局部随机游走过程能更准确地确定节点的合适邻域范围,提高了算法的稳定性。与LPA,BGLL,Infomap,Leiden,Walktrap等经典社区发现算法在12个真实网络和12个人工构造网络上的比较实验表明,LRW-LPA算法在标准互信息(NMI)、调整兰德系数(ARI)和模块度(Q)等方面表现良好。

关键词: 复杂网络, 聚类, 社区发现, 标签传播, 局部随机游走

Abstract: Community structure is one of the important characteristics of complex networks.Identifying communities of different functions in a network plays an important role for revealing important characteristics of complex networks.The community discovery algorithm based on label propagation uses the community label of direct neighbors of a node to updates its label,which might obtain inaccurate community structures.Furthermore,the results of multiple runs of the algorithm might be unstable.To solve this problem,a local random walk based label propagation algorithm(LRW-LPA) is proposed.First,the local importance of each node in its k-step neighborhood is calculated.Then,the node with the lowest local importance is selected as the starting node to perform the local random walk process.When walking out of the specified neighborhood,the random walker will return to the starting node and start the random walk again.Finally,the algorithm selects the label with the most occurrences in the local neighborhood to update the label of the starting node,and selects the label of the node with highest importance when there are multiple labels with the most occurrences.Due to LRW-LPA can determine an appropriate neighborhood of a node by adopting the local random walk process with restart,the stability of the algorithm improves greatly.Compared with LPA,BGLL,Infomap,Leiden,Walktrap and other classical algorithms on 12 real networks and 12 synthetic networks,it shows that the proposed LRW-LPA algorithm performs well in terms of normal mutual index(NMI),adjusted rand index(ARI) and modularity(Q).

Key words: Complex network, Clustering, Community detection, Label propagation, Local random walk

中图分类号: 

  • TP391
[1]YANG X H,WANG L,YE L,et al.Complex network community detection algorithm based on node similarity and network embedding[J].Computer Science,2022,49(3):121-128.
[2]NEWMAN M E J,GIRVAN M.Finding and evaluating community structure in networks[J].Physical Review E,2004,69(2):026113.
[3]BLONDEL V D,GUILLAUME J L,LAMBIOTTER,et al.Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics:Theory and Experiment,2008,2008(10):10008.
[4]TRAAG V A,WALTMAN L,VAN ECK N J.From Louvain toLeiden:guaranteeing well-connected communities[J].Scientific Reports,2019,9(1):5233.
[5]FORTUNATO S,BARTHELEMY M.Resolution limit in community detection[J].Proceedings of the National Academy of Sciences,2007,104(1):36-41.
[6]RAGHAVAN U N,ALBERT R,KUMARA S.Near linear timealgorithm to detect community structures in large-scale networks[J].Physical Review E,2007,76(3):036106.
[7]GARZA S E,SCHAEFFER S E.Community detection with the label propagation algorithm:a survey[J].Physica A:Statistical Mechanics and its Applications,2019,534(15):122058.
[8]BARBER M J,CLARK J W.Detecting network communities by propagating labels under constraints[J].Physical Review E,2009,80(2):026129.
[9]YAN X,FANR M,YONG Z,et al.A node influence based label propagation algorithm for community detection in networks[J].The Scientific World Journal,2014,2014:627581.
[10]SUN H,HUANG J,ZHONG X,et al.Label propagation with α-degree neighborhood impact for network community detection[J].Computational Intelligence and Neuroscience,2014,2014:130689.
[11]FORTUNATO S,HRIC D.Community detection in networks:a user guide[J].Physics Reports,2016,659(11):1-44.
[12]ROSVALL M,BERGSTORM C T.Maps of random walks on complex networks reveal community structure[J].Proceedings of the National Academy of Sciences of the United States of America,2008,105(4):1118-1123.
[13]PONS P,LATAPY M.Computing communities in large net-works using random walks[J].Computer and Information Sciences,2005,3733:248-293.
[14]DANON L,DUCH J,GUILERAD A,et al.Comparing community structure identification [J].Journal of Statistical Mecha-nics,2005,2005(9):09008.
[15]RAND W M.Objective criteria for the evaluation of clustering methods[J].Journal of the American Statistical Association,1971,66(336):846-850.
[16]ZACHARY W W.An information flow model for conflict and fission in small groups[J].Journal of Anthropological Research,1977,33(4):452-473.
[17]LUSSEAU D,NEWMAN M E J.Identifying the role that animals play in their social networks[J].Proceedings of the Royal Society B Biological Sciences,2004,271(Suppl_6):S477-S481.
[18]BAI L,CHENG X Q,LIANG J Y,et al.Fast graph clustering with a new description model for community detection[J].Information Sciences,2017,388-389:37-47.
[19]GIRVAN M,NEWMAN M E J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences of the United States of America,2002,99(12):7821-7826.
[20]QIAN Y H,LI Y B,ZHANG M,et al.Quantifying edge significance on maintaining global connectivity[J].Scientific Reports,2017,7(1):45380-45392.
[21]CHO A,SHIN J,HWANG S,et al.WormNet v3:a network-assisted hypothesis-generating server for caenorhabditis elegans[J].Nucleic Acids Research,2014,42(W1):W76-W82.
[22]LESKOVEC J,KLEINBERG J,FALOUTSOS C.Graph evolu-tion:densification and shrinking diameters[J].ACM Transactions on Knowledge Discovery from Data,2007,1(1):1217299-1217301.
[23]NEWMAN M E J.The structure of scientific collaboration networks[J].Proceedings of the National Academy of Sciences of the United States of America,2001,98(2):404-409.
[24]CHO E,MYERS S A,LESKOVEC J.Friendship andMobility:User Movement in Location-Based Social Networks[C]//Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2011:1082-1090.
[25]CHE C H.Research on community detection algorithm based on node similarity[D].Taiyuan:Shanxi University,2019.
[1] 柴慧敏, 张勇, 方敏.
基于特征相似度聚类的空中目标分群方法
Aerial Target Grouping Method Based on Feature Similarity Clustering
计算机科学, 2022, 49(9): 70-75. https://doi.org/10.11896/jsjkx.210800203
[2] 郑文萍, 刘美麟, 杨贵.
一种基于节点稳定性和邻域相似性的社区发现算法
Community Detection Algorithm Based on Node Stability and Neighbor Similarity
计算机科学, 2022, 49(9): 83-91. https://doi.org/10.11896/jsjkx.220400146
[3] 鲁晨阳, 邓苏, 马武彬, 吴亚辉, 周浩浩.
基于分层抽样优化的面向异构客户端的联邦学习
Federated Learning Based on Stratified Sampling Optimization for Heterogeneous Clients
计算机科学, 2022, 49(9): 183-193. https://doi.org/10.11896/jsjkx.220500263
[4] 刘丽, 李仁发.
医疗CPS协作网络控制策略优化
Control Strategy Optimization of Medical CPS Cooperative Network
计算机科学, 2022, 49(6A): 39-43. https://doi.org/10.11896/jsjkx.210300230
[5] 杨波, 李远彪.
数据科学与大数据技术课程体系的复杂网络分析
Complex Network Analysis on Curriculum System of Data Science and Big Data Technology
计算机科学, 2022, 49(6A): 680-685. https://doi.org/10.11896/jsjkx.210800123
[6] 鲁晨阳, 邓苏, 马武彬, 吴亚辉, 周浩浩.
基于DBSCAN聚类的集群联邦学习方法
Clustered Federated Learning Methods Based on DBSCAN Clustering
计算机科学, 2022, 49(6A): 232-237. https://doi.org/10.11896/jsjkx.211100059
[7] 郁舒昊, 周辉, 叶春杨, 王太正.
SDFA:基于多特征融合的船舶轨迹聚类方法研究
SDFA:Study on Ship Trajectory Clustering Method Based on Multi-feature Fusion
计算机科学, 2022, 49(6A): 256-260. https://doi.org/10.11896/jsjkx.211100253
[8] 何亦琛, 毛宜军, 谢贤芬, 古万荣.
基于点割集图分割的矩阵变换与分解的推荐算法
Matrix Transformation and Factorization Based on Graph Partitioning by Vertex Separator for Recommendation
计算机科学, 2022, 49(6A): 272-279. https://doi.org/10.11896/jsjkx.210600159
[9] 毛森林, 夏镇, 耿新宇, 陈剑辉, 蒋宏霞.
基于密度敏感距离和模糊划分的改进FCM算法
FCM Algorithm Based on Density Sensitive Distance and Fuzzy Partition
计算机科学, 2022, 49(6A): 285-290. https://doi.org/10.11896/jsjkx.210700042
[10] 陈景年.
一种适于多分类问题的支持向量机加速方法
Acceleration of SVM for Multi-class Classification
计算机科学, 2022, 49(6A): 297-300. https://doi.org/10.11896/jsjkx.210400149
[11] 何茜, 贺可太, 王金山, 林绅文, 杨菁林, 冯玉超.
比特币实体交易模式分析
Analysis of Bitcoin Entity Transaction Patterns
计算机科学, 2022, 49(6A): 502-507. https://doi.org/10.11896/jsjkx.210600178
[12] 陈佳舟, 赵熠波, 徐阳辉, 马骥, 金灵枫, 秦绪佳.
三维城市场景中的小物体检测
Small Object Detection in 3D Urban Scenes
计算机科学, 2022, 49(6): 238-244. https://doi.org/10.11896/jsjkx.210400174
[13] 邢云冰, 龙广玉, 胡春雨, 忽丽莎.
基于SVM的类别增量人体活动识别方法
Human Activity Recognition Method Based on Class Increment SVM
计算机科学, 2022, 49(5): 78-83. https://doi.org/10.11896/jsjkx.210400024
[14] 朱哲清, 耿海军, 钱宇华.
面向化学结构的线段聚类算法
Line-Segment Clustering Algorithm for Chemical Structure
计算机科学, 2022, 49(5): 113-119. https://doi.org/10.11896/jsjkx.210700131
[15] 张宇姣, 黄锐, 张福泉, 隋栋, 张虎.
基于菌群优化的近邻传播聚类算法研究
Study on Affinity Propagation Clustering Algorithm Based on Bacterial Flora Optimization
计算机科学, 2022, 49(5): 165-169. https://doi.org/10.11896/jsjkx.210800218
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!