计算机科学 ›› 2021, Vol. 48 ›› Issue (12): 204-211.doi: 10.11896/jsjkx.210300060

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于矩阵分解的属性网络嵌入和社区发现算法

徐新黎, 肖云月, 龙海霞, 杨旭华, 毛剑飞   

  1. 浙江工业大学计算机科学与技术学院 杭州310023
  • 收稿日期:2021-03-05 修回日期:2021-06-08 出版日期:2021-12-15 发布日期:2021-11-26
  • 通讯作者: 毛剑飞(mjf@zjut.edu.cn)
  • 作者简介:xxl@zjut.edu.cn
  • 基金资助:
    国家自然科学基金项目(61773348);浙江省公益科技计划项目(LGG20F020017);浙江省自然科学基金项目(LQ18F030015)

Attributed Network Embedding Based on Matrix Factorization and Community Detection

XU Xin-li, XIAO Yun-yue, LONG Hai-xia, YANG Xu-hua, MAO Jian-fei   

  1. College of Computer Science and Technology,Zhejiang University of Technology,Hangzhou 310023,China
  • Received:2021-03-05 Revised:2021-06-08 Online:2021-12-15 Published:2021-11-26
  • About author:XU Xin-li,born in 1977,Ph.D,associate professor,is a member of China Computer Federation.Her main research interests include intelligent computing,network embedding and medical image computing.
    MAO Jian-fei,born in 1976,Ph.D,associate professor,is a member of China Computer Federation.His main research interests include network visual media and mobile media technology,computer,vision and embedded system.
  • Supported by:
    National Natural Science Foundation of China(61773348),Zhejiang Public Welfare Science and Technology Plan Project(LGG20F020017) and Natural Science Foundation of Zhejiang Province(LQ18F030015).

摘要: 属性网络不但包含节点之间复杂的拓扑结构,还包含拥有丰富属性信息的节点,其可以比传统网络更有效地建模现代信息系统,属性网络的社区划分对于分析复杂系统的层次结构、控制信息在网络中的传播和预测网络用户的群体行为等方面具有重要的研究价值。为了更好地利用拓扑结构信息和属性信息进行社区发现,提出了一种基于矩阵分解的属性网络嵌入和社区发现算法(CDEMF)。首先提出基于矩阵分解的属性网络嵌入方法,基于网络局部链接信息计算相邻节点的相似性,将其与属性接近度联合建模,通过矩阵分解的分布式算法得到每个节点对应的低维嵌入向量,即把网络节点映射为低维向量表示的数据点集合。接着提出基于曲率和模块度的社区划分方法,自动确定数据点集合中蕴含的社区数量,并通过对数据点集合聚类完成属性网络社区划分。在真实网络数据集上,将CDEMF方法与其他8种知名算法进行比较,实验结果表明CDEMF具有良好的性能。

关键词: 矩阵分解, 曲率, 社区发现, 属性网络嵌入, 自动聚类

Abstract: An attributed network contains not only the complex topological structure but also the nodes with rich attribute information.It can be used to more effectively model modern information systems than traditional networks.Community detection of the attributed network has important research value in hierarchical analysis of complex systems,control of information propagation in the network,and prediction of group behavior of network users.In order to make better use of topology information and attribute information for community discovery,an attributed network embedding based on matrix factorization and community detection(CDEMF) are proposed.First,an attributed network embedding method based on matrix factorization is proposed to model the attributed proximity and the similarity of adjacent nodes calculated in term of the local link information of the network,where the low-dimensional embedding vector corresponding to each node can be obtained by a distributed algorithm of matrix decomposition,that is,the network nodes can be mapped into a collection of data points represented by low-dimensional vectors.Then the community detection method based on curvature and modularity is developed to achieve attributed network community division by clustering the data point set,which can automatically determine the number of communities contained in the data point set.CDEMF is compared with the other 8 kinds of well-known approaches on public real network datasets.The experimental results demonstrate the effectiveness and superiority of CDEMF.

Key words: Attributed network embedding, Automatic clustering, Community detection, Curvature, Matrix factorization

中图分类号: 

  • TP391
[1]BLONDEL V D,GUILLAUME J L,LAMBIOTTE R,et al.Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics:Theory and Experiment,2008(10):P10008.
[2]RAGHAVAN U N,ALBERT R,KUMARA S.Near linear time algorithm to detect community structures in large-scale networks[J].Physical Review E,2007,76(3):036106.
[3]ROZEMBERCZKI B,DAVIES R,SARKAR R,et al.GEMSEC:Graph embedding with self clustering[C]//Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.2019:65-72.
[4]SUN F Y,QU M,HOFFMANN J,et al.vGRAPH:A generative model for joint community detection and node representation learning[C]//Advances in Neural Information Processing Systems.2019:514-524.
[5]YANG J,MCAULEY J,LESKOVEC J.Community detection in networks with node attributes[C]//2013 IEEE 13th International Conference on Data Mining.IEEE,2013:1151-1156.
[6]JIN D,LIU Z Y,HE R F,et al.A Robust and Strong Explanation Community Detection Method for Attributed Networks[J].Chinese Journal of Computers,2018,41(7):1476-1489.
[7]HUANG X,LI J,HU X.Accelerated attributed network embedding[C]//Proceedings of the 2017 SIAM International Confe-rence on Data Mining.Society for Industrial and Applied Mathematics,2017:633-641.
[8]HUANG X,LI J,ZOU N,et al.A general embedding framework for heterogeneous information learning in large-scale networks[J].ACM Transactions on Knowledge Discovery from Data(TKDD),2018,12(6):1-24.
[9]LI J,HU X,WU L,et al.Robust unsupervised feature selection on networked data[C]//Proceedings of the 2016 SIAM International Conference on Data Mining.Society for Industrial and Applied Mathematics,2016:387-395.
[10]PEROZZI B,AL-RFOU R,SKIENA S.Deepwalk:Online lear- ning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining.2014:701-710.
[11]TANG J,QU M,WANG M,et al.Line:Large-scale information network embedding[C]//Proceedings of the 24th International Conference on World Wide Web.2015:1067-1077.
[12]WANG H,YANG Y,LIU B.GMC:Graph-based multi-view clustering[J].IEEE Transactions on Knowledge and Data Engineering,2019,32(6):1116-1129.
[13]SUN H,HE F,HUANG J,et al.Network embedding for community detection in attributed networks[J].ACM Transactions on Knowledge Discovery from Data(TKDD),2020,14(3):1-25.
[14]YE Z L,ZHAO H X,ZHANG K,et al.Network representation learning algorithm based on multi-view integration[J].Compu-ter Science,2019,46(1):124-132.
[15]ZHANG B,YU Z,ZHANG W.Community-Centric Graph Convolutional Network for Unsupervised Community Detection[C]//IJCAI.2020.
[16]ESTER M,KRIEGEL H P,SANDER J,et al.A density-based algorithm for discovering clusters in large spatial databases with noise[C]//KDD-96 Proceedings.AAAI,1996:226-231.
[17]RODRIGUEZ A,LAIO A.Clustering by fast search and find of density peaks[J].Science,2014,344(6191):1492-1496.
[18]CHEN J F,ZHANG M,HE Q.NJW spectral clustering algorithm based on heuristics to determine the number of classes[J].Computer Science,2018,45(2):474-479.
[19]TIBSHIRANI R,WALTHER G,HASTIE T.Estimating the number of clusters in a data set via the gap statistic[J].Journal of the Royal Statistical Society:Series B(Statistical Methodology),2001,63(2):411-423.
[20]SUGAR C A,JAMES G M.Finding the number of clusters in a dataset:An information-theoretic approach[J].Journal of the American Statistical Association,2003,98(463):750-763.
[21]ZHANG Y,MAHDZIUK J,QUEK C H,et al.Curvature-based method for determining the number of clusters[J].Information Sciences,2017,415-416:414-428.
[22]YANG Y,LIU H,GUAN Z,et al.CoHomo:A cluster-attribute correlation aware graph clustering framework[J].Neurocomputing,2020,412:327-338.
[23]KUANG D,DING C,PARK H.Symmetric nonnegative matrix factorization for graph clustering[C]//Proceedings of the 2012 SIAM international conference on data mining.Society for Industrial and Applied Mathematics,2012:106-117.
[24]CALINSKI T,HARABASZ J.A dendrite method for cluster analysis[J].Communications in Statistics-theory and Methods,1974,3(1):1-27.
[25]KRZANOWSKI W J,LAI Y T.A criterion for determining the number of groups in a data set using sum-of-squares clustering[J].Biometrics,1988,44(1):23-34.
[26]NEWMAN M E J.Fast algorithm for detecting community structure in networks[J].Physical Review E,2004,69(6):066133.
[27]NEWMAN M E J.Modularity and community structure in networks[J].Proceedings of the National Academy of Sciences,2006,103(23):8577-8582.
[28]HUANG X,LI J,HU X.Label informed attributed network embedding[C]//Proceedings of the Tenth ACM International Conference on Web Search and Data Mining.2017:731-739.
[29]HUANG Z,ZHONG X,WANG Q,et al.Detecting community in attributed networks by dynamically exploring node attributes and topological structure[J].Knowledge-Based Systems,2020,196:105760.
[30]PAN Y,HU G,QIU J,et al.FLGAI:a unified network embedding framework integrating multi-scale network structures and node attribute information[J].Applied Intelligence,2020,50(11):3976-3989.
[31]GAO Y,GONG M,XIE Y,et al.Community-oriented attributed network embedding[J].Knowledge-Based Systems,2020,193:105418.
[32]YOU X,MA Y,LIU Z.A three-stage algorithm on community detection in social networks[J].Knowledge-Based Systems,2020,187:104822.
[1] 杨文坤, 原晓佩, 陈小锋, 郭睿.
三维激光雷达点云空间多特征分割
Spatial Multi-feature Segmentation of 3D Lidar Point Cloud
计算机科学, 2022, 49(8): 143-149. https://doi.org/10.11896/jsjkx.210300275
[2] 何亦琛, 毛宜军, 谢贤芬, 古万荣.
基于点割集图分割的矩阵变换与分解的推荐算法
Matrix Transformation and Factorization Based on Graph Partitioning by Vertex Separator for Recommendation
计算机科学, 2022, 49(6A): 272-279. https://doi.org/10.11896/jsjkx.210600159
[3] 王本钰, 顾益军, 彭舒凡, 郑棣文.
融合动态距离和随机竞争学习的社区发现算法
Community Detection Algorithm Based on Dynamic Distance and Stochastic Competitive Learning
计算机科学, 2022, 49(5): 170-178. https://doi.org/10.11896/jsjkx.210300206
[4] 唐春阳, 肖玉芝, 赵海兴, 冶忠林, 张娜.
面向双层网络的EWCC社区发现算法
EWCC Community Discovery Algorithm for Two-Layer Network
计算机科学, 2022, 49(4): 49-55. https://doi.org/10.11896/jsjkx.210800275
[5] 杨旭华, 王磊, 叶蕾, 张端, 周艳波, 龙海霞.
基于节点相似性和网络嵌入的复杂网络社区发现算法
Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding
计算机科学, 2022, 49(3): 121-128. https://doi.org/10.11896/jsjkx.210200009
[6] 董晓梅, 王蕊, 邹欣开.
面向推荐应用的差分隐私方案综述
Survey on Privacy Protection Solutions for Recommended Applications
计算机科学, 2021, 48(9): 21-35. https://doi.org/10.11896/jsjkx.201100083
[7] 官铮, 邓扬琳, 聂仁灿.
光谱重建约束非负矩阵分解的高光谱与全色图像融合
Non-negative Matrix Factorization Based on Spectral Reconstruction Constraint for Hyperspectral and Panchromatic Image Fusion
计算机科学, 2021, 48(9): 153-159. https://doi.org/10.11896/jsjkx.200900054
[8] 陈湘涛, 赵美杰, 杨梅.
基于子图结构的局部社区发现算法
Overlapping Community Detection Algorithm Based on Subgraph Structure
计算机科学, 2021, 48(9): 244-250. https://doi.org/10.11896/jsjkx.201100010
[9] 詹皖江, 洪植林, 方路平, 吴哲夫, 吕跃华.
基于对抗性学习的协同过滤推荐算法
Collaborative Filtering Recommendation Algorithm Based on Adversarial Learning
计算机科学, 2021, 48(7): 172-177. https://doi.org/10.11896/jsjkx.200600077
[10] 邵超, 宋淑米.
基于信任关系下用户兴趣偏好的协同过滤推荐算法
Collaborative Filtering Recommendation Algorithm Based on User Preference Under Trust Relationship
计算机科学, 2021, 48(6A): 240-245. https://doi.org/10.11896/jsjkx.200700113
[11] 段菲, 王慧敏, 张超.
面向数据表示的Cauchy非负矩阵分解
Cauchy Non-negative Matrix Factorization for Data Representation
计算机科学, 2021, 48(6): 96-102. https://doi.org/10.11896/jsjkx.200700195
[12] 郝志峰, 廖祥财, 温雯, 蔡瑞初.
基于多上下文信息的协同过滤推荐算法
Collaborative Filtering Recommendation Algorithm Based on Multi-context Information
计算机科学, 2021, 48(3): 168-173. https://doi.org/10.11896/jsjkx.200700101
[13] 宁懿昕, 谢辉, 姜火文.
图神经网络社区发现研究综述
Survey of Graph Neural Network in Community Detection
计算机科学, 2021, 48(11A): 11-16. https://doi.org/10.11896/jsjkx.210500151
[14] 张清琪, 刘漫丹.
复杂网络社区发现的多目标五行环优化算法
Multi-objective Five-elements Cycle Optimization Algorithm for Complex Network Community Discovery
计算机科学, 2020, 47(8): 284-290. https://doi.org/10.11896/jsjkx.190700082
[15] 李雨蓉, 刘杰, 刘亚林, 龚春叶, 王勇.
面向语音分离的深层转导式非负矩阵分解并行算法
Parallel Algorithm of Deep Transductive Non-negative Matrix Factorization for Speech Separation
计算机科学, 2020, 47(8): 49-55. https://doi.org/10.11896/jsjkx.190900202
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!