计算机科学 ›› 2019, Vol. 46 ›› Issue (3): 253-259.doi: 10.11896/j.issn.1002-137X.2019.03.038

• 人工智能 • 上一篇    下一篇

基于社区特征的平衡模块度最大化社交链接预测模型

伍杰华1,2,沈静1,周蓓1   

  1. (广东工贸职业技术学院计算机工程系 广州 510510)1
    (华南理工大学计算机科学与工程学院 广州 510641)2
  • 收稿日期:2018-02-26 修回日期:2018-05-04 出版日期:2019-03-15 发布日期:2019-03-22
  • 通讯作者: 伍杰华(1982-),男,博士生,副教授,主要研究方向为数据挖掘、社会网络分析、机器学习,E-mail:kodakwu@126.com
  • 作者简介:沈静(1980-),女,硕士,讲师,主要研究方向为数据挖掘、信息处理;周蓓(1980-),女,硕士,讲师,主要研究方向为数据挖掘。
  • 基金资助:
    广东省优秀青年教师培养计划项目(YQ2015177),广东省科技计划项目(2017ZC0348),广东高校重大科研项目与成果培育计划项目(2017GKTSCX009)资助

Community Features Based Balanced Modularity Maximization Social Link Prediction Model

WU Jie-hua1,2,SHEN Jing1,ZHOU Bei1   

  1. (Department of Computer Science,Guangdong Polytechnic of Industry and Commerce,Guangzhou 510510,China)1
    (College of Computer Science and Engineering,South China University of Technology,Guangzhou 510641,China)2
  • Received:2018-02-26 Revised:2018-05-04 Online:2019-03-15 Published:2019-03-22

摘要: 链接预测和社区发现是社交网络分析领域的两大研究方向。如何挖掘社区结构帮助提高链接预测效果具有十分重要的意义。在模块度最大化模型的基础上,提出一种基于社区结构特征提取与选择的链接预测方法。首先,在网络进化模型中引入基于社区结构的相似度指标建立局部特征,并利用影响力节点识别方法构建全局特征;然后,采用最小冗余最大相关度的特征选择算法度量特征之间的相互影响,并筛选出最有表示力的候选特征;最后,将基于经过上述步骤处理后的特征融入模块度最大化链接预测模型中。该算法在人工和真实两类数据集上与相关算法做了对比实验,结果证实了该算法的高效性,也表明了基于社区结构的特征提取与选择步骤的必要性。

关键词: 链接预测, 模块度, 社交网络, 社区特征, 特征选择

Abstract: Link prediction and community detection are the two major research directions in the field of social network analysis.It is of great significance to explore the community structure and improve the link prediction effect.Based on the modularity maximization link prediction model,this paper proposed a link prediction method based on community structure feature extraction and selection.Firstly,the community structure based similarity index and influence node identification method are introduced into the network evolution model to obtain and link the local and global features respectively.Then,the feature selection algorithm with minimum redundancy and maximum correlation is used to measure the mutual influence,andthe most expressive candidate features are filtered out.Finally,based on the above steps,the features are incorporated into the modularity maximization link prediction model.The algorithm was compared with related algorithms on both artificial and real datasets.The results verify the high efficiency of the algorithm and the necessity of feature extraction and selection based on community structure.

Key words: Community feature, Feature selection, Link prediction, Modularity, Social network

中图分类号: 

  • TP391
[1]LAN M W,LI C P,WANG S Q,et al.Survey of Sign Prediction Algorithms in Signed Social Networks[J].Journal of Computer Research and Development,2015,52(2):410-422.(in Chinese)
蓝梦微,李翠平,王绍卿,等.符号社会网络中正负关系预测算法研究综述[J].计算机研究与发展,2015,52(2):410-422.
[2]WASSERMAN S,FAUST K.Social network analysis:Methods and applications[M].New York:Cambridge University Press,1994.
[3]PHILIP S Y,HAN J,FALOUTSOS C.Link mining:Models,algorithms,and applications[M].Berlin:Springer,2010.
[4]LIBENNOWELL D,KLEINBERG J.The linkprediction problem for social networks[J].Journal of the Association for Information Science and Technology,2007,58(7):1019-1031.
[5]ZHONG E,FAN W,ZHU Y,et al.Modeling the dynamics of composite social networks[C]∥Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2013:937-945.
[6]PEROZZI B,AL-RFOU R,SKIENA S.Deepwalk:Online lear-
ning of social representations[C]∥Proceedings of the 20th ACM SIGKDD International Conference on KnowledgeDisco-very and Data Mining.ACM,2014:701-710.
[7]DONG Y,TANG J,WU S,et al.Link Prediction and Recom-
mendation across Heterogeneous Social Networks[C]∥2012 IEEE 12th International Conference on Data Mining (ICDM).IEEE,2012:181-190.
[8]SYMEONIDIS P,IAKOVIDOU N,MANTAS N,et al.From bio-
logical to social networks:Link prediction based on multi-way spectral clustering[J].Data & Knowledge Engineering,2013,87(1):226-242.
[9]SUN Y,BARBER R,GUPTA M,et al.Co-author Relationship Prediction in Heterogeneous Bibliographic Networks[C]∥International Conference on Advances in Social Networks Analysis and Mining,Asonam.DBLP,2011:121-128.
[10]Lv L,ZHOU T.Link prediction in complex networks:A survey[J].Physica A:Statistical Mechanics and Its Applications,2011,390(6):1150-1170.
[11]WU J H.TAN Model For Ties Prediction in Social Networks[J].Journal of Computer Applications,2013,33(11):3134-3137.(in Chinese)
伍杰华.基于树状朴素贝叶斯模型的社会网络关系预测[J].计算机应用,2013,33(11):3134-3137.
[12]LIU W,Lv L.Link prediction based on local random walk[J].EPL (Europhysics Letters),2010,89(5):58007.
[13]LI R H,YU J X,LIU J.Link prediction:the power of maximal entropy random walk[C]∥Proceedings of the 20th ACM International Conference on Information and Knowledge Management.ACM,2011:1147-1156.
[14]LU L,PAN L,ZHOU T,et al.Toward link predictability of
complex networks[J].Proceedings of the National Academy of Sciences,2015,112(8):2325-2330.
[15]NEWMAN M E.Communities,modules and large-scale structure in networks[J].Nature Physics,2012,8(1),25-31.
[16]YAN B,GREGORY S.Finding missing edges in networks based on their community structure[J].Physical Review E,2012,85(5):056112.
[17]SOUNDARAJAN S,HOPCROFT J.Using community information to improve the precision of link prediction methods[C]∥Proceedings of the 21st International Conference on World Wide Web.ACM,2012:607-608.
[18]CANNISTRACI C V,ALANIS-LOBATO G,RAVASI T.From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks[J].Scienti-fic Reports.2013,3(4):1613.
[19]KEMP C,TENENBAUM J B,GRIFFITHS T L,et al.Learning systems of concepts with an infinite relational model[C]∥National Conference on Artificial Intelligence.AAAI Press,2006:381-388.
[20]PALLA K,KNOWLES D A,GHAHRAMANI Z.An infinite latent attribute model for network data[C]∥Proceedings of the 29th International Coference on International Conference on Machine Learning.Omnipress,2012:395-402.
[21]WU J,ZHANG G,REN Y.A balanced modularity maximization link prediction model in social networks[J].Information Processing & Management,2017,53(1):295-307.
[22]L L,CHEN D,REN X L,et al.Vital nodes identification in complex networks[J].Physics Reports,2016,650(1):1-63.
[23]GU Y R,ZHU Z Y.Node Ranking in Complex Networks Based on LeaderRank and Modes Similaritya[J].Journal of University of Electronic Science and Technology of China,2017,46(2):441-448.(in Chinese)
顾亦然,朱梓嫣.基于LeaderRank和节点相似度的复杂网络重要节点排序算法[J].电子科技大学学报,2017,46(2):441-448.
[24]CHEN D B,GAO H,L L,et al.Identifying influential nodes in large-scale directed networks:the role of clustering[J].PloS one,2013,8(10):e77455.
[25]GUYON I,ELISSEEFF A.An introduction to variable and feature selection[J].Journal of Machine Learning Research,2003,3(1):1157-1182.
[26]PENG H,LONG F,DING C.Feature Selection Based on Mutual Information:Criteria of Max-Dependency,Max-Relevance,and Min-Redundancy[J].IEEE Transactions on Pattern Analysis & Machine Intelligence,2005,27(8):1226-1238.
[27]BARABASI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286(5439):509-512.
[28]DOROGOVTSEV S N,MENDES J F.Evolution of networks[J].Advances in Physics,2002,51(4):1079-1187.
[29]SALES-PARDO M,GUIMERA R,MOREIRA A A,et al.Extracting the hierarchical organization of complex systems[J].Proceedings of the National Academy of Sciences,2007,104(39):15224-15229.
[30]GIRVAN M,NEWMAN M E J.Community structure in social and biological networks [J].Proceedings of the National Academy of Sciences,2002,99(12):7821-7826.
[31]BLONDEL V D,GUILLAUME J L,LAMBIOTTE R,et al.
Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics Theory & Experiment,2008,2008(10):155-168.
[1] 宋杰, 梁美玉, 薛哲, 杜军平, 寇菲菲.
基于无监督集群级的科技论文异质图节点表示学习方法
Scientific Paper Heterogeneous Graph Node Representation Learning Method Based onUnsupervised Clustering Level
计算机科学, 2022, 49(9): 64-69. https://doi.org/10.11896/jsjkx.220500196
[2] 黄丽, 朱焱, 李春平.
基于异构网络表征学习的作者学术行为预测
Author’s Academic Behavior Prediction Based on Heterogeneous Network Representation Learning
计算机科学, 2022, 49(9): 76-82. https://doi.org/10.11896/jsjkx.210900078
[3] 王剑, 彭雨琦, 赵宇斐, 杨健.
基于深度学习的社交网络舆情信息抽取方法综述
Survey of Social Network Public Opinion Information Extraction Based on Deep Learning
计算机科学, 2022, 49(8): 279-293. https://doi.org/10.11896/jsjkx.220300099
[4] 李斌, 万源.
基于相似度矩阵学习和矩阵校正的无监督多视角特征选择
Unsupervised Multi-view Feature Selection Based on Similarity Matrix Learning and Matrix Alignment
计算机科学, 2022, 49(8): 86-96. https://doi.org/10.11896/jsjkx.210700124
[5] 胡艳羽, 赵龙, 董祥军.
一种用于癌症分类的两阶段深度特征选择提取算法
Two-stage Deep Feature Selection Extraction Algorithm for Cancer Classification
计算机科学, 2022, 49(7): 73-78. https://doi.org/10.11896/jsjkx.210500092
[6] 康雁, 王海宁, 陶柳, 杨海潇, 杨学昆, 王飞, 李浩.
混合改进的花授粉算法与灰狼算法用于特征选择
Hybrid Improved Flower Pollination Algorithm and Gray Wolf Algorithm for Feature Selection
计算机科学, 2022, 49(6A): 125-132. https://doi.org/10.11896/jsjkx.210600135
[7] 魏鹏, 马玉亮, 袁野, 吴安彪.
用户行为驱动的时序影响力最大化问题研究
Study on Temporal Influence Maximization Driven by User Behavior
计算机科学, 2022, 49(6): 119-126. https://doi.org/10.11896/jsjkx.210700145
[8] 余皑欣, 冯秀芳, 孙静宇.
结合物品相似性的社交信任推荐算法
Social Trust Recommendation Algorithm Combining Item Similarity
计算机科学, 2022, 49(5): 144-151. https://doi.org/10.11896/jsjkx.210300217
[9] 畅雅雯, 杨波, 高玥琳, 黄靖云.
基于SEIR的微信公众号信息传播建模与分析
Modeling and Analysis of WeChat Official Account Information Dissemination Based on SEIR
计算机科学, 2022, 49(4): 56-66. https://doi.org/10.11896/jsjkx.210900169
[10] 左园林, 龚月姣, 陈伟能.
成本受限条件下的社交网络影响最大化方法
Budget-aware Influence Maximization in Social Networks
计算机科学, 2022, 49(4): 100-109. https://doi.org/10.11896/jsjkx.210300228
[11] 储安琪, 丁志军.
基于灰狼优化算法的信用评估样本均衡化与特征选择同步处理
Application of Gray Wolf Optimization Algorithm on Synchronous Processing of Sample Equalization and Feature Selection in Credit Evaluation
计算机科学, 2022, 49(4): 134-139. https://doi.org/10.11896/jsjkx.210300075
[12] 孙林, 黄苗苗, 徐久成.
基于邻域粗糙集和Relief的弱标记特征选择方法
Weak Label Feature Selection Method Based on Neighborhood Rough Sets and Relief
计算机科学, 2022, 49(4): 152-160. https://doi.org/10.11896/jsjkx.210300094
[13] 郭磊, 马廷淮.
基于好友亲密度的用户匹配
Friend Closeness Based User Matching
计算机科学, 2022, 49(3): 113-120. https://doi.org/10.11896/jsjkx.210200137
[14] 李宗然, 陈秀宏, 陆赟, 邵政毅.
鲁棒联合稀疏不相关回归
Robust Joint Sparse Uncorrelated Regression
计算机科学, 2022, 49(2): 191-197. https://doi.org/10.11896/jsjkx.210300034
[15] 张叶, 李志华, 王长杰.
基于核密度估计的轻量级物联网异常流量检测方法
Kernel Density Estimation-based Lightweight IoT Anomaly Traffic Detection Method
计算机科学, 2021, 48(9): 337-344. https://doi.org/10.11896/jsjkx.200600108
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!