计算机科学 ›› 2022, Vol. 49 ›› Issue (12): 155-162.doi: 10.11896/jsjkx.220500190

• 数据库&大数据&数据科学 • 上一篇    下一篇

一种基于局部路径信息的重叠社区发现算法

郑文萍1,2,3, 王宁1, 杨贵1   

  1. 1 山西大学计算机与信息技术学院 太原030006
    2 计算智能与中文信息处理教育部重点实验室(山西大学) 太原030006
    3 智能信息处理研究所(山西大学) 太原030006
  • 收稿日期:2022-05-19 修回日期:2022-08-27 发布日期:2022-12-14
  • 通讯作者: 郑文萍 (wpzheng@sxu.edu.cn)
  • 基金资助:
    国家自然科学基金(62072292);山西省1331工程项目

Overlapping Community Detection Algorithm Based on Local Path Information

ZHENG Wen-ping1,2,3, WANG Ning1, YANG Gui1   

  1. 1 School of Computer and Information Technology,Shanxi University,Taiyuan 030006,China
    2 Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education,Shanxi University,Taiyuan 030006,China
    3 Institute of Intelligent Information Processing,Shanxi University,Taiyuan 030006,China
  • Received:2022-05-19 Revised:2022-08-27 Published:2022-12-14
  • About author:ZHENG Wen-ping,born in 1979,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation.Her main research interests include graph theory algorithms and bioinformatics.
  • Supported by:
    National Natural Science Foundation of China(62072292) and 1331 Engineering Project of Shanxi Province,China.

摘要: 重叠社区发现是复杂网络分析的主要任务之一。针对现有的基于局部扩展和优化的重叠社区发现方法受初始种子节点选择影响较大、适应度函数无法度量节点间多样的连接方式等问题,提出了一种基于局部路径信息的重叠社区发现算法(Local Path Information-based Overlapping Community Detection Algorithm,LPIO)。首先选取局部极大度点作为初始种子节点,并根据社区内节点邻域标签一致性更新社区的种子节点集,避免初始种子节点对算法性能的影响;然后为度量稀疏网络中节点间多样的连接方式,给出了基于局部路径信息的社区适应度函数,扩展种子节点集得到社区结构;最后计算未聚类节点与社区种子集之间的点不重复路径数量,得到未聚类节点与已有社区间的距离,为未聚类节点分配社区。在4个有标签网络和8个无标签网络上,与7个经典重叠社区发现算法进行对比,实验结果表明,所提算法在重叠标准互信息(ONMI)、F1分数、扩展模块度(EQ)等方面表现良好。

关键词: 重叠社区发现, 局部扩展和优化, 社区适应度, 局部路径信息

Abstract: The detection of overlapping communities is one of the main tasks of complex network analysis.The performance of most existing methods based on local expansion and optimization are greatly affected by the selection of initial seed nodes and the community structure significance measurement.Aiming at these problems,an overlapping community detection algorithm is proposed based on local path information(LPIO).First,the local maximum degree points are selected as initial seeds,which will be updated according to the label consistency of node’s neighborhood in the community to reduce the influence of the selection of initial seeds.To measure the various connection patterns between nodes in networks,a community fitness function is defined based on local path information to obtain community structures from seed nodes.Finally,unclustered nodes are assigned to proper communities according to the number of non-repetitive paths between the unclustered nodes and the community seed sets.Comparative experiments on 4 labeled networks and 8 unlabeled networks with 7 classic overlapping community detection algorithms show that the proposed algorithm performs well on overlapping standard mutual information(ONMI),F1 score,and extended modularity(EQ).

Key words: Overlapping community detection, Local expansion and optimization, Community fitness, Local path information

中图分类号: 

  • TP391
[1]CHENG F,WANG C,ZHANG X,et al.A local-neighborhood information based overlapping community detection algorithm for large-scale complex networks[J].IEEE/ACM Transactions on Networking,2020,29(2):543-556.
[2]PALLA G,DERÉNYI I,FARKAS I,et al.Uncovering the overlapping community structure of complex networks in nature and society[J].Nature,2005,435(7043):814-818.
[3]JABBOUR S,MHADHBI N,RADAOUI B,et al.Detectinghighly overlapping community structure by model-based maximal clique expansion[C]//2018 IEEE International Conference on Big Data(Big Data).IEEE,2018:1031-1036.
[4]GREGORY S.Finding overlapping communities in networks by label propagation[J].New Journal of Physics,2010,12(10):103018.
[5]XIE J,SZYMANSKI B K,LIU X.SLPA:Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process[C]//2011 IEEE 11th International Conference on Data Mining Workshops.IEEE,2011:344-349.
[6]LU M,ZHANG Z,QU Z,et al.LPANNI:Overlapping community detection using label propagation in large-scale complex networks[J].IEEE Transactions on Knowledge and Data Enginee-ring,2018,31(9):1736-1749.
[7]LUO W,ZHANG D,NI L,et al.Multiscale local community detection in social networks[J].IEEE Transactions on Knowledge and Data Engineering,2021,33(3):1102-1112.
[8]LANCICHINETTI A,FORTUNATO S,KERTÉSZ J.Detecting the overlapping and hierarchical community structure in complex networks[J].New journal of physics,2009,11(3):033015.
[9]LEE C,REID F,MCDAID A,et al.Detecting highly overlapping community structure by greedy clique expansion[C]//Procee-dings of the 4th International Workshop on Social Network Mi-ning and Analysis(SNA-KDD).2010:33-42.
[10]WANG X,LIU G,LI J.Overlapping community detection based on structural centrality in complex networks[J].IEEE Access,2017,5:25258-25269.
[11]REZVANI M,LIANG W,LIU C,et al.Efficient detection of overlapping communities using asymmetric triangle cuts[J].IEEE Transactions on Knowledge and Data Engineering,2018,30(11):2093-2105.
[12]CHOUMANE A,AWADA A,HARKOUS A.Core expansion:A new community detection algorithm based on neighborhood overlap[J].Social Network Analysis and Mining,2020,10(1):1-11.
[13]RADICCHI F,CASTELLANO C,CECCONI F,et al.Definingand identifying communities in networks[J].Proceedings of the National Academy of Sciences,2004,101(9):2658-2663.
[14]KANNAN R,VEMPALA S,VETTA A.On clusterings:Good,bad and spectral[J].Journal of the ACM(JACM),2004,51(3):497-515.
[15]NEWMAN M E J.Modularity and community structure in networks[J].Proceedings of the National Academy of Sciences,2006,103(23):8577-8582.
[16]CARNIVALI G S,VIEIRA A B,ZIVIANI A,et al.CoVeC:coarse-grained vertex clustering for efficient community detection in sparse complex networks[J].Information Sciences,2020,522:180-192.
[17]FORTUNATO S,BARTHELEMY M.Resolution limit in community detection[J].Proceedings of the National Academy of Sciences,2007,104(1):36-41.
[18]RAGHAVAN U N,ALBERT R,KUMARA S.Near linear time algorithm to detect community structures in large-scale networks[J].Physical Review E,2007,76(3):036106.
[18]COSCIA M,ROSSETTI G,GIANNOTTI F,et al.Demon:a local-first discovery method for overlapping communities[C]//Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2012:615-623.
[20]SOUNDARAJAN S,HOPCROFT J E.Use of local group information to identify communities in networks[J].ACM Transactions on Knowledge Discovery from Data(TKDD),2015,9(3):1-27.
[21]CHAKRABORTY T,DALMIA A,MUKHERJEE A,et al.Metrics for community analysis:A survey[J].ACM Computing Surveys(CSUR),2017,50(4):1-37.
[1] 段小虎, 曹付元.
基于节点局部相似性的两阶段密度峰值重叠社区发现方法
Node Local Similarity Based Two-stage Density Peaks Algorithm for Overlapping Community Detection
计算机科学, 2022, 49(12): 170-177. https://doi.org/10.11896/jsjkx.211000025
[2] 陈湘涛, 赵美杰, 杨梅.
基于子图结构的局部社区发现算法
Overlapping Community Detection Algorithm Based on Subgraph Structure
计算机科学, 2021, 48(9): 244-250. https://doi.org/10.11896/jsjkx.201100010
[3] 宁懿昕, 谢辉, 姜火文.
图神经网络社区发现研究综述
Survey of Graph Neural Network in Community Detection
计算机科学, 2021, 48(11A): 11-16. https://doi.org/10.11896/jsjkx.210500151
[4] 张琴, 陈红梅, 封云飞.
一种基于粗糙集和密度峰值的重叠社区发现方法
Overlapping Community Detection Method Based on Rough Sets and Density Peaks
计算机科学, 2020, 47(5): 72-78. https://doi.org/10.11896/jsjkx.190400160
[5] 张琴, 陈红梅, 封云飞.
基于粗糙集和距离动态模型的重叠社区发现方法
Overlapping Community Detection Method Based on Rough Sets and Distance Dynamic Model
计算机科学, 2020, 47(10): 75-82. https://doi.org/10.11896/jsjkx.190800002
[6] 刘春, 张国良.
一种基于重叠社区发现的软件特征提取方法
Software Feature Extraction Method Based on Overlapping Community Detection
计算机科学, 2019, 46(12): 201-207. https://doi.org/10.11896/jsjkx.181001856
[7] 张忠正,李建武.
一种基于局部拓展的并行重叠社区发现算法
Parallelizable Overlapping Community Detection Algorithm Based on Local Expansion
计算机科学, 2016, 43(9): 61-65. https://doi.org/10.11896/j.issn.1002-137X.2016.09.011
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!