计算机科学 ›› 2020, Vol. 47 ›› Issue (10): 121-125.doi: 10.11896/jsjkx.191000099

• 数据库&大数据&数据科学 • 上一篇    下一篇

结合社区嵌入和节点嵌入的社区发现算法

赵霞1, 李娴1, 张泽华1, 张晨威2   

  1. 1 太原理工大学信息与计算机学院 山西 晋中030600
    2 伊利诺伊大学芝加哥分校计算机科学学院 芝加哥60607
  • 收稿日期:2019-10-15 修回日期:2019-12-16 出版日期:2020-10-15 发布日期:2020-10-16
  • 通讯作者: 张泽华(zehua_zhang@163.com)
  • 作者简介:zhaoxiazzzz@163.com
  • 基金资助:
    国家自然科学基金项目(61503273,61702356);太原理工大学青年创新团队项目(2014TD056)

Community Detection Algorithm Combing Community Embedding and Node Embedding

ZHAO Xia1, LI Xian1, ZHANG Ze-hua1, ZHANG Chen-wei2   

  1. 1 College of Information and Computer,Taiyuan University of Technology,Jinzhong,Shanxi 030600,China
    2 School of Computer Science,University of Illinois at Chicago,Chicago 60607,USA
  • Received:2019-10-15 Revised:2019-12-16 Online:2020-10-15 Published:2020-10-16
  • About author:ZHAO Xia,born in 1994,postgraduate.Her main research interests includeuncertainty theory and social network applications.
    ZHANG Ze-hua,born in 1981,Ph.D,master supervisor,is a member of China Computer Federation.His main research interests include granular computing,uncertain reasoning and know-ledge discovery on graph.
  • Supported by:
    ational Nature Science Foundation of China (61503273,61702356) and Youth Innovation Team Project of Taiyuan University of Technology (2014TD056)

摘要: 社区作为社交网络的重要属性,对理解网络功能和预测演化有着重要作用。通过网络嵌入将网络节点转化成低维稠密的特征向量,并将其应用于社区发现等机器学习任务,是近年来的研究热点。传统的网络嵌入方法仅针对节点嵌入,忽略了社区嵌入的重要性。针对这样的问题,提出了将社区嵌入和改进的节点嵌入相结合的方法CNE,从而获得融合结构信息和属性信息的节点表示。节点嵌入将节点表示为低维向量,类似地,社区嵌入把社区表示为低维空间中的高斯分布,二者将多种节点相似性相结合,互相促进,从而获得更为准确的社区发现结果。在公开数据集上将所提算法与传统的社区发现算法和网络嵌入方法进行比较,实验结果表明提出的CNE方法具有更高的精度。

关键词: 社交网络, 社区发现, 社区嵌入, 网络嵌入

Abstract: As an important property of social networks,community plays an important role in understanding network functions and predicting evolution.It is a research hotspot in recent years to transform network nodes into low-dimensional dense feature vectors through network embedding and apply them to machine learning tasks such as community detection.The traditional network embedding method only focuses on node embedding and ignores the importance of community embedding.Aiming at such a problem,CNE,a method combining Community embedding and improved Node Embedding,is proposed to obtain node representation combining structure information and attribute information.Node embedding represents nodes as low-dimensional vectors.Similarly,community embedding represents communities as Gaussian distributions in low-dimensional spaces.They combine multiple node similarities to promote more accurate community detection results.The experimental results show that,compared with the traditional community detection algorithm and network embedding method on public datasets,the proposed CNE method has higher precision.

Key words: Community detection, Community embedding, Network embedding, Social network

中图分类号: 

  • TP181
[1]XIE J,KELLEY S,SZYMANSKI B K.Overlapping community detection in networks:The state-of-the-art and comparative study[J].Acm Computing Surveys (csur),2013,45(4):43.
[2]Newman M E J.Fast algorithm for detecting community structure in networks[J].Physical Review E,2004,69(6):066133.
[3]SHI P,HE K,BINDEL D,et al.Local lanczos spectral approximation for community detection[C]//Joint European Conference on Machine Learning and Knowledge Discovery in Databases.Berlin:Springer,2017:651-667.
[4]ZHANG X Q,AN X D,CAO F Y.Detecting Community from Bipartite Network Based on Spectral Clustering[J].Computer Science,2019,46(4):216-221.
[5]HAN Y,TANG J.Probabilistic community and role model forsocial networks[C]//Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2015:407-416.
[6]CUI P,WANG X,PEI J,et al.A survey on network embedding[J].IEEE Transactions on Knowledge and Data Engineering,2018,31(5):833-852.
[7]ZHENG V W,CAVALLARI S,CAI H,et al.From Node Embedding To Community Embedding[J].arXiv:1610.09950,2016.
[8]BISHOP C M.Pattern Recognition and Machine Learning (Information Science and Statistics)[M].New York:Springer-Verlag,2007.
[9]MIKOLOV T,CHEN K,CORRADO G S,et al.Efficient Estimation of Word Representations in Vector Space[J].arXiv:1301.3781,2013.
[10]CHEN H,PEROZZI B,AL-RFOU R,et al.A tutorial on network embeddings[J].arXiv:1808.02590,2018.
[11]PEROZZI B,AL-RFOU R,SKIENA S.Deepwalk:Online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2014:701-710.
[12]GROVER A,LESKOVEC J.Node2vec:Scalable feature learning for networks[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2016:855-864.
[13]TANG J,QU M,WANG M,et al.Line:Large-scale information network embedding[C]//Proceedings of the 24th International Conference on World Wide Web.New York:ACM,2015:1067-1077.
[14]ROWEIS S T,SAUL L K.Nonlinear dimensionality reduction by locally linear embedding[J].Science,2000,290(5500):2323-2326.
[15]TENENBAUM J B,DE SILVA V,LANGFORD J C.A global geometric framework for nonlinear dimensionality reduction[J].Science,2000,290(5500):2319-2323.
[16]WANG D X,CUI P,ZHU W W.Structural Deep Network Embedding[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2016:1225-1234.
[17]CAO S,LU W,XU Q.Grarep:Learning graph representations with global structural information[C]//Proceedings of the 24th ACM International on Conference on Information and Knowledge Management.New York:ACM,2015:891-900.
[18]GOLUB G H,VAN LOAN C F.Matrix computations[M].Baltimore:JHU press,2012.
[19]DEMPSTER A P,LAIRD N M,RUBIN D B.Maximum likelihood from incomplete data via the EM algorithm[J].Journal of the Royal Statistical Society:Series B (Methodological),1977,39(1):1-22.
[20]MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality[C]//Advances in Neural Information Processing Systems.New York:ACM,2013:3111-3119.
[21]WHITLEY D,STARKWEATHER T,Bogart C.Genetic algorithms and neural networks:Optimizing connections and connectivity[J].Parallel Computing,1990,14(3):347-361.
[22]BLONDEL V,GUILLAUME J L,LAMBIOTTE R,et al.Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics:Theory and Experiment,2008,2008(10):P10008.
[1] 王剑, 彭雨琦, 赵宇斐, 杨健.
基于深度学习的社交网络舆情信息抽取方法综述
Survey of Social Network Public Opinion Information Extraction Based on Deep Learning
计算机科学, 2022, 49(8): 279-293. https://doi.org/10.11896/jsjkx.220300099
[2] 何亦琛, 毛宜军, 谢贤芬, 古万荣.
基于点割集图分割的矩阵变换与分解的推荐算法
Matrix Transformation and Factorization Based on Graph Partitioning by Vertex Separator for Recommendation
计算机科学, 2022, 49(6A): 272-279. https://doi.org/10.11896/jsjkx.210600159
[3] 魏鹏, 马玉亮, 袁野, 吴安彪.
用户行为驱动的时序影响力最大化问题研究
Study on Temporal Influence Maximization Driven by User Behavior
计算机科学, 2022, 49(6): 119-126. https://doi.org/10.11896/jsjkx.210700145
[4] 余皑欣, 冯秀芳, 孙静宇.
结合物品相似性的社交信任推荐算法
Social Trust Recommendation Algorithm Combining Item Similarity
计算机科学, 2022, 49(5): 144-151. https://doi.org/10.11896/jsjkx.210300217
[5] 王本钰, 顾益军, 彭舒凡, 郑棣文.
融合动态距离和随机竞争学习的社区发现算法
Community Detection Algorithm Based on Dynamic Distance and Stochastic Competitive Learning
计算机科学, 2022, 49(5): 170-178. https://doi.org/10.11896/jsjkx.210300206
[6] 唐春阳, 肖玉芝, 赵海兴, 冶忠林, 张娜.
面向双层网络的EWCC社区发现算法
EWCC Community Discovery Algorithm for Two-Layer Network
计算机科学, 2022, 49(4): 49-55. https://doi.org/10.11896/jsjkx.210800275
[7] 畅雅雯, 杨波, 高玥琳, 黄靖云.
基于SEIR的微信公众号信息传播建模与分析
Modeling and Analysis of WeChat Official Account Information Dissemination Based on SEIR
计算机科学, 2022, 49(4): 56-66. https://doi.org/10.11896/jsjkx.210900169
[8] 左园林, 龚月姣, 陈伟能.
成本受限条件下的社交网络影响最大化方法
Budget-aware Influence Maximization in Social Networks
计算机科学, 2022, 49(4): 100-109. https://doi.org/10.11896/jsjkx.210300228
[9] 陈世聪, 袁得嵛, 黄淑华, 杨明.
基于结构深度网络嵌入模型的节点标签分类算法
Node Label Classification Algorithm Based on Structural Depth Network Embedding Model
计算机科学, 2022, 49(3): 105-112. https://doi.org/10.11896/jsjkx.201000177
[10] 郭磊, 马廷淮.
基于好友亲密度的用户匹配
Friend Closeness Based User Matching
计算机科学, 2022, 49(3): 113-120. https://doi.org/10.11896/jsjkx.210200137
[11] 杨旭华, 王磊, 叶蕾, 张端, 周艳波, 龙海霞.
基于节点相似性和网络嵌入的复杂网络社区发现算法
Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding
计算机科学, 2022, 49(3): 121-128. https://doi.org/10.11896/jsjkx.210200009
[12] 郑苏苏, 关东海, 袁伟伟.
融合不完整多视图的异质信息网络嵌入方法
Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion
计算机科学, 2021, 48(9): 68-76. https://doi.org/10.11896/jsjkx.210500203
[13] 陈湘涛, 赵美杰, 杨梅.
基于子图结构的局部社区发现算法
Overlapping Community Detection Algorithm Based on Subgraph Structure
计算机科学, 2021, 48(9): 244-250. https://doi.org/10.11896/jsjkx.201100010
[14] 王剑, 王玉翠, 黄梦杰.
社交网络中的虚假信息:定义、检测及控制
False Information in Social Networks:Definition,Detection and Control
计算机科学, 2021, 48(8): 263-277. https://doi.org/10.11896/jsjkx.210300053
[15] 谭琪, 张凤荔, 王婷, 王瑞锦, 周世杰.
融入结构度中心性的社交网络用户影响力评估算法
Social Network User Influence Evaluation Algorithm Integrating Structure Centrality
计算机科学, 2021, 48(7): 124-129. https://doi.org/10.11896/jsjkx.200600096
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!