计算机科学 ›› 2020, Vol. 47 ›› Issue (10): 75-82.doi: 10.11896/jsjkx.190800002

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于粗糙集和距离动态模型的重叠社区发现方法

张琴, 陈红梅, 封云飞   

  1. 西南交通大学信息科学与技术学院 成都611756
    西南交通大学云计算与智能技术高校重点实验室 成都611756
  • 收稿日期:2019-07-31 修回日期:2019-11-11 出版日期:2020-10-15 发布日期:2020-10-16
  • 通讯作者: 陈红梅(hmchen@swjtu.edu.cn)
  • 作者简介:qinzhang@my.swjtu.edu.cn
  • 基金资助:
    国家自然科学基金(61572406,61976182);四川省国际科技创新合作重点项目(2019YFH0097)

Overlapping Community Detection Method Based on Rough Sets and Distance Dynamic Model

ZHANG Qin, CHEN Hong-mei, FENG Yun-fei   

  1. School of Information Science and Technology,Southwest Jiaotong University,Chengdu 611756,China
    Key Laboratory of Cloud Computing and Intelligent Technology,Southwest Jiaotong University,Chengdu 611756,China
  • Received:2019-07-31 Revised:2019-11-11 Online:2020-10-15 Published:2020-10-16
  • About author:ZHANG Qin,born in 1995,postgra-duate,is a member of China Computer Federation.Her main research interests include database technology and data mining.
    CHEN Hong-mei,born in 1971,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation.Her main research interests include granular calculation,rough sets and intelligent information processing.
  • Supported by:
    National Natural Science Foundation of China (61572406,61976182) and Key Program for International Science and Technology Innovation Cooperation of Sichuan Province,China (2019YFH0097)

摘要: 现实世界可被看作由许多不同的复杂系统组成。为了建模分析复杂系统中个体间隐藏的规律及功能,将复杂系统抽象为由节点和边组成的复杂网络。挖掘复杂网络中的社区结构在内容推荐、行为预测和疾病扩散等方面具有重要的理论意义和实际价值。随着复杂系统内个体的不断变化,多个社区间出现了重叠节点,有效且准确地挖掘社区中的重叠节点具有一定的挑战性。为了有效发现社区中的重叠节点,提出了一种基于粗糙集和距离动态模型的重叠社区发现方法(Overlapping Community Detection based on Rough sets and Distance Dynamics model,OCDRDD)。该方法首先根据网络的拓扑结构,结合节点度中心性和距离选出K个核心节点;然后按照定义的距离比率关系初始化社区的近似集和边界域,结合距离动态模型,迭代变化边界域节点与下近似集节点间相连的边的距离,且在每次迭代过程中将符合定义的距离比率关系的边界域节点划分到社区下近似集中,以缩小边界域节点(即缩小边界域的范围),直到找到最佳重叠社区结构;最后根据定义的两条规则处理“伪”重叠节点。在真实网络数据集和LFR Benchmark人工网络数据集上,以NMI和具有重叠性的模块度EQ作为评价指标,将OCDRDD方法与近几年具有代表性的社区发现方法进行实验测试比较,发现OCDRDD方法整体优于其他算法,结果表明该算法具有有效性和可行性。

关键词: 边界域, 粗糙集, 距离动态模型, 重叠社区发现

Abstract: The real world is considered to be composed of many different complex systems.In order to model and analyze the hidden rules and functions among individuals in complex systems,the complex system may be abstracted as a complex network composed of nodes and edges.Mining community structures in complex networks has important theoretical significance and practical value in content recommendation,behavior prediction and disease spread.With the continuous changes of individuals in complex systems,overlapping nodes appear among multiple communities.How to effectively and accurately mine the overlapping nodes in communities has brought some challenges.In order to effectively detect the overlapping nodes in the community,an overlapping community detection method based on rough sets and distance dynamic model (OCDRDD) is proposed in this paper.First of all,according to the topology of the network,it selects K core nodes by combining node degree centrality and distance,then initializes the approximation sets and the boundary region of the community according to the distance ratio relationship.Combined with the distance dynamic model,the distances between boundary region nodes and the lower approximation set nodes are changed iteratively.During each iteration,boundary region nodes that conform to the defined distance ratio relationship are classified into the lower approximation of the community,and the boundary region nodes are reduced until the optimal overlapping community structure is found.Finally,the “pseudo” overlapping nodes are processed according to the two rules defined in this paper.NMI and overlapping module degree EQ are taken as evaluation indexes on real network datasets and LFR Benchmark artificial network datasets.The OCDRDD method is compared with other typical community detection methods in recent years both on real network datasets and LFR Benchmark artificial network datasets.The experimental results show that OCDRDD method is better than other community detection algorithms on the whole.The results show that the proposed algorithm is effective and feasible.

Key words: Boundary region, Distance dynamic model, Overlapping community detection, Rough set

中图分类号: 

  • TP391
[1]COSCIA M,GIANNOTTI F,PEDRESCHI D.A classificationfor community discovery methods in complex networks[J].Statistical Analysis & Data Mining the Asa Data Science Journal,2011,4(5):512-546.
[2]KELLEY S,GOLDBERG M,MAGDON-ISMAIL M,et al.Defining and discovering communities in social networks[M] // Thai M T,Pardalos P M.Handbook of Optimization in Complex Network.New York:Springer,2012:139-168.
[3]CUI W,XIAO Y,WANG H,et al.Local search of communities in large graphs[C]//ACM SIGMOD International Conference on Management.2014:991-1002.
[4]XIE J,KELLEY S,SZYMANSKI B K.Overlapping community detection in networks:the State of the Art and Comparative Study[J].ACM Computing Surveys,2011,45(4):1-35.
[5]BAI X,YANG P,SHI X,et al.An overlapping community detection algorithm based on density peaks[J].Neurocomputing,2016,226:7-15.
[6]ZHOU X,LIU Y,WANG J,et al.A density based link clustering algorithm for overlapping community detection in networks[J].Physica A Statistical Mechanics & Its Applications,2017,486:65-78.
[7]SUN H,LIU J,HUANG J,et al.A link-based label propagation algorithm for overlapping community detection in networks[J].Computational Intelligence,2016,33(2):308-331.
[8]WU H,GAO L,DONG J,et al.Detecting overlapping protein complexes by rough-fuzzy clustering in protein-protein interaction networks[J].Plos One,2013,9(3):e91856.
[9]SUN Z,WANG B,SHENG J,et al.Overlapping community detection based on information dynamics[J].IEEE Access,2018,6:70919-70934.
[10]SHAO J,HAN Z,YANG Q,et al.Community detection based on distance dynamics[C]//Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2015:1075-1084.
[11]CHEN L,ZHANG J,CAI L J.Overlapping community detection based on link graph using distance dynamics[J].International Journal of Modern Physics B,2017,32(3):1850015.
[12]PAWLAK Z.Rough sets[J].InternationalJournal of Computer &Information Sciences,1982,11(5):341-356.
[13]ZHANG Y,WU B,LIU Y,et al.A novel community detection method based on rough set K-means[J].Journal of Electronics &Information Technology,2017,39(4):770-777.
[14]DWYER T.Visual analysis of network centralities[C]//Proceedings of Asia-Pacific Symposium on Information Visualization (APVIS 2006).2006:189-197.
[15]HENNIG C,HAUSDORF B.Design of dissimilarity measures:a new dissimilarity between species distribution areas[C]//Data Science and Classification.Berlin:Springer,2006:29-37.
[16]HUANG L,WANG G,WANG Y,et al.A link density clustering algorithm based on automatically selecting density peaks for overlapping community detection[J].International Journal of Modern Physics B,2016,30(24):15.
[17]DING J,HE X,YUAN J,et al.Community detection by propagating the label of center[J].Physica A:Statistical Mechanics and its Applications,2018,503:675-686.
[18]NEWMAN M.Network data[EB/OL].[2013-04-19].http://www personal.umich.edu/~mejn/netdata/.
[19]LANCICHINETTI A,FORTUNATO S.Community detectionalgorithms:A comparative analysis[J].Physical Review E,2009,80(5):056117.
[20]ZACHARY W.Information-flow model for conflict and fission in small groups[J].Journal of Anthropological Research,1977,33:452-473.
[21]LUSSEAU D,SCHNEIDER K,BOISSEAU O,et al.The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations[J].Behavioral Ecology & Sociobiology,2003,54(4):396-405.
[22]KREBS V.Books about US Politics[EB/OL].http://www.orgnet.com/.
[23]GIRVAN M,NEWMAN M E J.Community structure in social and biological networks[EB/OL].http://arxiv.org/abs/cond-mat/0112110/.
[24]ADAMIC L A,GLANCE N.The political blogosphere and the 2004 US Election[C]//Proceedings of the WWW-2005 Workshop on the Weblogging Ecosystem.2005.
[25]KNUTH D E.The stanford graphBase:A platform for combinatorial computing[M].Addison-Wesley,1993.
[26]ZHU M,MENG F R,ZHOU Y.Density-based link clusteringalgorithm for overlapping community detection[J].Journal of Computer Research and Development.,2013,50(12):2520-2530.
[27]LANCICHINETTI A,FORTUNATO S,KERTÉSZ J.Detec-ting the overlapping and hierarchical community structure in complex networks[J].New Journal of Physics,2009,11(3):033015.
[28]NICOSIA V,MANGIONI G,CARCHIOLO V,et al.Extending modularity definition for directed graphs with overlapping communities[OL].https://wenku.baidu.com/view/f1432a2d0066f5335a81219b.html.
[1] 程富豪, 徐泰华, 陈建军, 宋晶晶, 杨习贝.
基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法
Strongly Connected Components Mining Algorithm Based on k-step Search of Vertex Granule and Rough Set Theory
计算机科学, 2022, 49(8): 97-107. https://doi.org/10.11896/jsjkx.210700202
[2] 许思雨, 秦克云.
基于剩余格的模糊粗糙集的拓扑性质
Topological Properties of Fuzzy Rough Sets Based on Residuated Lattices
计算机科学, 2022, 49(6A): 140-143. https://doi.org/10.11896/jsjkx.210200123
[3] 方连花, 林玉梅, 吴伟志.
随机多尺度序决策系统的最优尺度选择
Optimal Scale Selection in Random Multi-scale Ordered Decision Systems
计算机科学, 2022, 49(6): 172-179. https://doi.org/10.11896/jsjkx.220200067
[4] 陈于思, 艾志华, 张清华.
基于三角不等式判定和局部策略的高效邻域覆盖模型
Efficient Neighborhood Covering Model Based on Triangle Inequality Checkand Local Strategy
计算机科学, 2022, 49(5): 152-158. https://doi.org/10.11896/jsjkx.210300302
[5] 孙林, 黄苗苗, 徐久成.
基于邻域粗糙集和Relief的弱标记特征选择方法
Weak Label Feature Selection Method Based on Neighborhood Rough Sets and Relief
计算机科学, 2022, 49(4): 152-160. https://doi.org/10.11896/jsjkx.210300094
[6] 王子茵, 李磊军, 米据生, 李美争, 解滨.
基于误分代价的变精度模糊粗糙集属性约简
Attribute Reduction of Variable Precision Fuzzy Rough Set Based on Misclassification Cost
计算机科学, 2022, 49(4): 161-167. https://doi.org/10.11896/jsjkx.210500211
[7] 王志成, 高灿, 邢金明.
一种基于正域的三支近似约简
Three-way Approximate Reduction Based on Positive Region
计算机科学, 2022, 49(4): 168-173. https://doi.org/10.11896/jsjkx.210500067
[8] 薛占熬, 侯昊东, 孙冰心, 姚守倩.
带标记的不完备双论域模糊概率粗糙集中近似集动态更新方法
Label-based Approach for Dynamic Updating Approximations in Incomplete Fuzzy Probabilistic Rough Sets over Two Universes
计算机科学, 2022, 49(3): 255-262. https://doi.org/10.11896/jsjkx.201200042
[9] 陈湘涛, 赵美杰, 杨梅.
基于子图结构的局部社区发现算法
Overlapping Community Detection Algorithm Based on Subgraph Structure
计算机科学, 2021, 48(9): 244-250. https://doi.org/10.11896/jsjkx.201100010
[10] 李艳, 范斌, 郭劼, 林梓源, 赵曌.
基于k-原型聚类和粗糙集的属性约简方法
Attribute Reduction Method Based on k-prototypes Clustering and Rough Sets
计算机科学, 2021, 48(6A): 342-348. https://doi.org/10.11896/jsjkx.201000053
[11] 宁懿昕, 谢辉, 姜火文.
图神经网络社区发现研究综述
Survey of Graph Neural Network in Community Detection
计算机科学, 2021, 48(11A): 11-16. https://doi.org/10.11896/jsjkx.210500151
[12] 薛占熬, 孙冰心, 侯昊东, 荆萌萌.
基于多粒度粗糙直觉犹豫模糊集的最优粒度选择方法
Optimal Granulation Selection Method Based on Multi-granulation Rough Intuitionistic Hesitant Fuzzy Sets
计算机科学, 2021, 48(10): 98-106. https://doi.org/10.11896/jsjkx.200800074
[13] 薛占熬, 张敏, 赵丽平, 李永祥.
集对优势关系下多粒度决策粗糙集的可变三支决策模型
Variable Three-way Decision Model of Multi-granulation Decision Rough Sets Under Set-pair Dominance Relation
计算机科学, 2021, 48(1): 157-166. https://doi.org/10.11896/jsjkx.191200175
[14] 桑彬彬, 杨留中, 陈红梅, 王生武.
优势关系粗糙集增量属性约简算法
Incremental Attribute Reduction Algorithm in Dominance-based Rough Set
计算机科学, 2020, 47(8): 137-143. https://doi.org/10.11896/jsjkx.190700188
[15] 陈玉金, 徐吉辉, 史佳辉, 刘宇.
基于直觉犹豫模糊集的三支决策模型及其应用
Three-way Decision Models Based on Intuitionistic Hesitant Fuzzy Sets and Its Applications
计算机科学, 2020, 47(8): 144-150. https://doi.org/10.11896/jsjkx.190800041
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!