计算机科学 ›› 2019, Vol. 46 ›› Issue (12): 20-25.doi: 10.11896/jsjkx.190700057

• 大数据与数据科学 • 上一篇    下一篇

基于节点连接模式相关性的链接预测方法

单娜, 李龙杰, 刘昱阳, 陈晓云   

  1. (兰州大学信息科学与工程学院 兰州730000)
  • 收稿日期:2019-04-06 出版日期:2019-12-15 发布日期:2019-12-17
  • 通讯作者: 李龙杰(1981-),男,博士,工程师,硕士生导师,CCF会员,主要研究方向为数据挖掘、机器学习、网络信息挖掘等,E-mail:ljli@lzu.edu.cn。
  • 作者简介:单娜(1996-),女,硕士生,主要研究方向为网络信息挖掘;刘昱阳(1996-),男,硕士生,主要研究方向为网络信息挖掘;陈晓云(1954-),女,硕士,教授,博士生导师,CCF高级会员,主要研究方向为数据挖掘、大数据分析、网络信息挖掘等。
  • 基金资助:
    本文受国家自然科学基金(61602225),兰州大学中央高校基本科研业务费专项资金(lzujbky-2019-90)资助。

Link Prediction Based on Correlation of Nodes’ Connecting Patterns

SHAN Na, LI Long-jie, LIU Yu-yang, CHEN Xiao-yun   

  1. (School of Information Science and Engineering,Lanzhou University,Lanzhou,730000,China)
  • Received:2019-04-06 Online:2019-12-15 Published:2019-12-17

摘要: 作为复杂网络分析中的一个研究热点,链接预测在许多领域中都有重要的应用价值,得到了广泛的关注。使用网络中的已知结构信息来计算未连接的节点对之间的相似性,进而评估其存在链接的可能性是目前最常用的方法。不同网络具有不同的结构特征,节点之间的特征对链接的形成具有重要影响。为了提高链接预测的性能,文中定义了节点的连接模式,并基于节点连接模式的相关性(Correlation of Nodes’ Connecting Patterns,CNCP)设计了一个新的链接预测模型。该模型将CNCP与基本相似性指标相结合,通过综合节点的相似性与节点连接模式的相关性进行链接预测。文中将CNCP与CN(Common Neighbors),RA(Resource Allocation),AA(Adamic-Adar)及PA(Preferential Attachment)4个相似性指标相结合,提出了CNCP-CN,CNCP-RA,CNCP-AA和CNCP-PA 4个新的链接预测指标。在6个真实数据集上的实验结果表明,所提方法在AUC和Precision 2个评价标准上的性能优于对比方法。

关键词: 复杂网络, 节点连接模式相关性, 链接预测, 相似性指标

Abstract: As a research hotspot in complex network analysis,link prediction has a wide range of applications in many fields,and hence has captured much attention of researchers.Similarity-based methods,which compute the similarity scores between unconnected node pairs based on the known network structures and estimate their connection likelihood according to the similarity scores,are commonly used.In general,different kinds of networks have diverse structural characteristics,and hence the correlation of characteristics between nodes has an important influence on the formation of links.To enhance the performance of link prediction,this paper defined the connecting pattern of a node,and proposed a new link prediction model based on the Correlation of Nodes’ Connecting Patterns (CNCP).By combining CNCP with a similarity-based method,this model can take both similarity and correlation between nodes into account.In this paper,four CNCP-based methods,i.e.,CNCP-CN,CNCP-RA,CNCP-AA and CNCP-PA,are derived from the model,in which similarity indexes are CN(Common Neighbors),RA(Resource Allocation),AA(Adamic-Adar) and PA(Preferential Attachment),respectively.The experimental results on six networks show that the proposed methods are superior to the compared ones under the criteria of AUC and Precision.

Key words: Complex networks, Correlation of nodes’ connecting patterns, Link prediction, Similarity index

中图分类号: 

  • TP391
[1]LV L,ZHOU T.Link prediction in complex networks:A survey[J].Physica A:statistical mechanics and its applications,2011,390(6):1150-1170.
[2]CANNISTRACI C V,ALANIS-LOBATO G,RAVASI T.From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks[J].Scienti-fic Reports,2013,3:1613.
[3]SHERKAT E,RAHGOZAR M,ASADPOUR M.Structural link prediction based on ant colony approach in social networks[J].Physica A:Statistical Mechanics and its Applications,2015,419:80-94.
[4]LI F,HE J,HUANG G,et al.Retracted:A clustering-based link prediction method in social networks[J].Procedia Computer Science,2014,29:432-442.
[5]ZHOU T,LV L,ZHANG Y C.Predicting missing links via local information[J].The European Physical Journal B,2009,71(4):623-630.
[6]CLAUSET A,MOORE C,NEWMAN M E J.Hierarchical structure and the prediction of missing links in networks[J].Nature,2008,453(7191):98.
[7]SARUKKAI R R.Link prediction and path analysis using Markov chains[J].Computer Networks,2000,33(1-6):377-386.
[8]SALTON G,MCGILL M J.Introduction to Modern Information Retrieval [M].Auckland:McGraw-Hill,1983.
[9]SØRENSEN T A.A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons[J].Biologiske Skrifter,1948,5(4):1-34.
[10]NEWMAN M E J.Clustering and preferential attachment in growing networks.Physical Review E,2001,64(2):025102.
[11]LV L,JIN C H,ZHOU T.Similarity index based on local paths for link prediction of complex networks[J].Physical Review E,2009,80(4):046122.
[12]KATZ L.A new status index derived from sociometric analysis[J].Psychometrika,1953,18(1):39-43.
[13]LEICHT E A,HOLME P,NEWMAN M E J.Vertex similarity in networks[J].Physical Review E,2006,73(2):026120.
[14]LIU W,LV L.Link prediction based on local random walk[J].EPL (Europhysics Letters),2010,89(5):58007.
[15]WANG T,WANG H,WANG X.CD-Based indices for link prediction in complex network[J].PloS One,2016,11(1):e0146727.
[16]ADAMIC L A,ADAR E.Friends and neighbors on the web[J].Social Networks,2003,25(3):211-230.
[17]BARABÁSI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286(5439):509-512.
[18]GIRVAN M,NEWMAN M M E J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences,2002,99(12):7821-7826.
[19]KUNEGIS J.Jazz musicians network dataset[OL].http:// konect.uni-koblenz.de/networks/arenas-jazz.
[20]BAIRD D,LUCZKOVICH J,CHRISTIAN R R.Assessment of spatial and temporal variability in ecosystem attributes of the St Marks National Wildlife Refuge,Apalachee Bay,Florida[J].Estuarine,Coastal and Shelf Science,1998,47(3):329-349.
[21]ULANOWICZ R E,BONDAVALLI C,EGNOTOVICH M S. Technical Report:CBL 98-123 (1998)[R/OL].http://www.cbl.umces.edu/atlss/FBay701.html.
[22]LUSSEAU D,SCHNEIDER K,BOISSEAU O J,et al.The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations[J].Behavioral Ecology and Sociobiology,2003,54(4):396-405.
[23]GUIMERA R,DANON L,DIAZ-GUILERA A,et al.Self-similar community structure in a network of human interactions[J].Physical Review E,2003,68(6):065103.
[24]LATORA V,MARCHIORI M.Efficient behavior of small- world networks[J].Physical Review Letters,2001,87(19):198701.
[25] LICHTENWALTER R N,LUSSIER J T,CHAWLA N V.New perspectives and methods in link prediction[C]//Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2010:243-252.
[26]NEWMAN M E J.Assortative mixing in networks[J].Physical Review Letters,2002,89(20):208701.
[27]YANG J,ZHANG X D.Predicting missing links in complex networks based on common neighbors and distance[J].Scientific Reports,2016,6:38208.
[1] 宋杰, 梁美玉, 薛哲, 杜军平, 寇菲菲.
基于无监督集群级的科技论文异质图节点表示学习方法
Scientific Paper Heterogeneous Graph Node Representation Learning Method Based onUnsupervised Clustering Level
计算机科学, 2022, 49(9): 64-69. https://doi.org/10.11896/jsjkx.220500196
[2] 黄丽, 朱焱, 李春平.
基于异构网络表征学习的作者学术行为预测
Author’s Academic Behavior Prediction Based on Heterogeneous Network Representation Learning
计算机科学, 2022, 49(9): 76-82. https://doi.org/10.11896/jsjkx.210900078
[3] 郑文萍, 刘美麟, 杨贵.
一种基于节点稳定性和邻域相似性的社区发现算法
Community Detection Algorithm Based on Node Stability and Neighbor Similarity
计算机科学, 2022, 49(9): 83-91. https://doi.org/10.11896/jsjkx.220400146
[4] 何茜, 贺可太, 王金山, 林绅文, 杨菁林, 冯玉超.
比特币实体交易模式分析
Analysis of Bitcoin Entity Transaction Patterns
计算机科学, 2022, 49(6A): 502-507. https://doi.org/10.11896/jsjkx.210600178
[5] 杨波, 李远彪.
数据科学与大数据技术课程体系的复杂网络分析
Complex Network Analysis on Curriculum System of Data Science and Big Data Technology
计算机科学, 2022, 49(6A): 680-685. https://doi.org/10.11896/jsjkx.210800123
[6] 王本钰, 顾益军, 彭舒凡, 郑棣文.
融合动态距离和随机竞争学习的社区发现算法
Community Detection Algorithm Based on Dynamic Distance and Stochastic Competitive Learning
计算机科学, 2022, 49(5): 170-178. https://doi.org/10.11896/jsjkx.210300206
[7] 陈世聪, 袁得嵛, 黄淑华, 杨明.
基于结构深度网络嵌入模型的节点标签分类算法
Node Label Classification Algorithm Based on Structural Depth Network Embedding Model
计算机科学, 2022, 49(3): 105-112. https://doi.org/10.11896/jsjkx.201000177
[8] 赵学磊, 季新生, 刘树新, 李英乐, 李海涛.
基于路径连接强度的有向网络链路预测方法
Link Prediction Method for Directed Networks Based on Path Connection Strength
计算机科学, 2022, 49(2): 216-222. https://doi.org/10.11896/jsjkx.210100107
[9] 李家文, 郭炳晖, 杨小博, 郑志明.
基于信息传播的致病基因识别研究
Disease Genes Recognition Based on Information Propagation
计算机科学, 2022, 49(1): 264-270. https://doi.org/10.11896/jsjkx.201100129
[10] 穆俊芳, 郑文萍, 王杰, 梁吉业.
基于重连机制的复杂网络鲁棒性分析
Robustness Analysis of Complex Network Based on Rewiring Mechanism
计算机科学, 2021, 48(7): 130-136. https://doi.org/10.11896/jsjkx.201000108
[11] 胡军, 王雨桐, 何欣蔚, 武晖栋, 李慧嘉.
基于复杂网络的全球航空网络结构分析与应用
Analysis and Application of Global Aviation Network Structure Based on Complex Network
计算机科学, 2021, 48(6A): 321-325. https://doi.org/10.11896/jsjkx.200900112
[12] 王学光, 张爱新, 窦炳琳.
复杂网络上的非线性负载容量模型
Non-linear Load Capacity Model of Complex Networks
计算机科学, 2021, 48(6): 282-287. https://doi.org/10.11896/jsjkx.200700040
[13] 胡昕彤, 沙朝锋, 刘艳君.
基于随机投影和主成分分析的网络嵌入后处理算法
Post-processing Network Embedding Algorithm with Random Projection and Principal Component Analysis
计算机科学, 2021, 48(5): 124-129. https://doi.org/10.11896/jsjkx.200500058
[14] 马媛媛, 韩华, 瞿倩倩.
基于节点亲密度的重要性评估算法
Importance Evaluation Algorithm Based on Node Intimate Degree
计算机科学, 2021, 48(5): 140-146. https://doi.org/10.11896/jsjkx.200300184
[15] 殷子樵, 郭炳晖, 马双鸽, 米志龙, 孙怡帆, 郑志明.
群智体系网络结构的自治调节:从生物调控网络结构谈起
Autonomous Structural Adjustment of Crowd Intelligence Network: Begin from Structure of Biological Regulatory Network
计算机科学, 2021, 48(5): 184-189. https://doi.org/10.11896/jsjkx.210200161
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!