计算机科学 ›› 2019, Vol. 46 ›› Issue (12): 20-25.doi: 10.11896/jsjkx.190700057

• 大数据与数据科学 • 上一篇    下一篇

基于节点连接模式相关性的链接预测方法

单娜, 李龙杰, 刘昱阳, 陈晓云   

  1. (兰州大学信息科学与工程学院 兰州730000)
  • 收稿日期:2019-04-06 出版日期:2019-12-15 发布日期:2019-12-17
  • 通讯作者: 李龙杰(1981-),男,博士,工程师,硕士生导师,CCF会员,主要研究方向为数据挖掘、机器学习、网络信息挖掘等,E-mail:ljli@lzu.edu.cn。
  • 作者简介:单娜(1996-),女,硕士生,主要研究方向为网络信息挖掘;刘昱阳(1996-),男,硕士生,主要研究方向为网络信息挖掘;陈晓云(1954-),女,硕士,教授,博士生导师,CCF高级会员,主要研究方向为数据挖掘、大数据分析、网络信息挖掘等。
  • 基金资助:
    本文受国家自然科学基金(61602225),兰州大学中央高校基本科研业务费专项资金(lzujbky-2019-90)资助。

Link Prediction Based on Correlation of Nodes’ Connecting Patterns

SHAN Na, LI Long-jie, LIU Yu-yang, CHEN Xiao-yun   

  1. (School of Information Science and Engineering,Lanzhou University,Lanzhou,730000,China)
  • Received:2019-04-06 Online:2019-12-15 Published:2019-12-17

摘要: 作为复杂网络分析中的一个研究热点,链接预测在许多领域中都有重要的应用价值,得到了广泛的关注。使用网络中的已知结构信息来计算未连接的节点对之间的相似性,进而评估其存在链接的可能性是目前最常用的方法。不同网络具有不同的结构特征,节点之间的特征对链接的形成具有重要影响。为了提高链接预测的性能,文中定义了节点的连接模式,并基于节点连接模式的相关性(Correlation of Nodes’ Connecting Patterns,CNCP)设计了一个新的链接预测模型。该模型将CNCP与基本相似性指标相结合,通过综合节点的相似性与节点连接模式的相关性进行链接预测。文中将CNCP与CN(Common Neighbors),RA(Resource Allocation),AA(Adamic-Adar)及PA(Preferential Attachment)4个相似性指标相结合,提出了CNCP-CN,CNCP-RA,CNCP-AA和CNCP-PA 4个新的链接预测指标。在6个真实数据集上的实验结果表明,所提方法在AUC和Precision 2个评价标准上的性能优于对比方法。

关键词: 复杂网络, 链接预测, 节点连接模式相关性, 相似性指标

Abstract: As a research hotspot in complex network analysis,link prediction has a wide range of applications in many fields,and hence has captured much attention of researchers.Similarity-based methods,which compute the similarity scores between unconnected node pairs based on the known network structures and estimate their connection likelihood according to the similarity scores,are commonly used.In general,different kinds of networks have diverse structural characteristics,and hence the correlation of characteristics between nodes has an important influence on the formation of links.To enhance the performance of link prediction,this paper defined the connecting pattern of a node,and proposed a new link prediction model based on the Correlation of Nodes’ Connecting Patterns (CNCP).By combining CNCP with a similarity-based method,this model can take both similarity and correlation between nodes into account.In this paper,four CNCP-based methods,i.e.,CNCP-CN,CNCP-RA,CNCP-AA and CNCP-PA,are derived from the model,in which similarity indexes are CN(Common Neighbors),RA(Resource Allocation),AA(Adamic-Adar) and PA(Preferential Attachment),respectively.The experimental results on six networks show that the proposed methods are superior to the compared ones under the criteria of AUC and Precision.

Key words: Complex networks, Link prediction, Correlation of nodes’ connecting patterns, Similarity index

中图分类号: 

  • TP391
[1] LV L,ZHOU T.Link prediction in complex networks:A survey[J].Physica A:statistical mechanics and its applications,2011,390(6):1150-1170.
[2] CANNISTRACI C V,ALANIS-LOBATO G,RAVASI T.From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks[J].Scienti-fic Reports,2013,3:1613.
[3] SHERKAT E,RAHGOZAR M,ASADPOUR M.Structural link prediction based on ant colony approach in social networks[J].Physica A:Statistical Mechanics and its Applications,2015,419:80-94.
[4] LI F,HE J,HUANG G,et al.Retracted:A clustering-based link prediction method in social networks[J].Procedia Computer Science,2014,29:432-442.
[5] ZHOU T,LV L,ZHANG Y C.Predicting missing links via local information[J].The European Physical Journal B,2009,71(4):623-630.
[6] CLAUSET A,MOORE C,NEWMAN M E J.Hierarchical structure and the prediction of missing links in networks[J].Nature,2008,453(7191):98.
[7] SARUKKAI R R.Link prediction and path analysis using Markov chains[J].Computer Networks,2000,33(1-6):377-386.
[8] SALTON G,MCGILL M J.Introduction to Modern Information Retrieval [M].Auckland:McGraw-Hill,1983.
[9] SØRENSEN T A.A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons[J].Biologiske Skrifter,1948,5(4):1-34.
[10] NEWMAN M E J.Clustering and preferential attachment in growing networks.Physical Review E,2001,64(2):025102.
[11] LV L,JIN C H,ZHOU T.Similarity index based on local paths for link prediction of complex networks[J].Physical Review E,2009,80(4):046122.
[12] KATZ L.A new status index derived from sociometric analysis[J].Psychometrika,1953,18(1):39-43.
[13] LEICHT E A,HOLME P,NEWMAN M E J.Vertex similarity in networks[J].Physical Review E,2006,73(2):026120.
[14] LIU W,LV L.Link prediction based on local random walk[J].EPL (Europhysics Letters),2010,89(5):58007.
[15] WANG T,WANG H,WANG X.CD-Based indices for link prediction in complex network[J].PloS One,2016,11(1):e0146727.
[16] ADAMIC L A,ADAR E.Friends and neighbors on the web[J].Social Networks,2003,25(3):211-230.
[17] BARABÁSI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286(5439):509-512.
[18] GIRVAN M,NEWMAN M M E J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences,2002,99(12):7821-7826.
[19] KUNEGIS J.Jazz musicians network dataset[OL].http:// konect.uni-koblenz.de/networks/arenas-jazz.
[20] BAIRD D,LUCZKOVICH J,CHRISTIAN R R.Assessment of spatial and temporal variability in ecosystem attributes of the St Marks National Wildlife Refuge,Apalachee Bay,Florida[J].Estuarine,Coastal and Shelf Science,1998,47(3):329-349.
[21] ULANOWICZ R E,BONDAVALLI C,EGNOTOVICH M S. Technical Report:CBL 98-123 (1998)[R/OL].http://www.cbl.umces.edu/atlss/FBay701.html.
[22] LUSSEAU D,SCHNEIDER K,BOISSEAU O J,et al.The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations[J].Behavioral Ecology and Sociobiology,2003,54(4):396-405.
[23] GUIMERA R,DANON L,DIAZ-GUILERA A,et al.Self-similar community structure in a network of human interactions[J].Physical Review E,2003,68(6):065103.
[24] LATORA V,MARCHIORI M.Efficient behavior of small- world networks[J].Physical Review Letters,2001,87(19):198701.
[25] LICHTENWALTER R N,LUSSIER J T,CHAWLA N V.New perspectives and methods in link prediction[C]//Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2010:243-252.
[26] NEWMAN M E J.Assortative mixing in networks[J].Physical Review Letters,2002,89(20):208701.
[27] YANG J,ZHANG X D.Predicting missing links in complex networks based on common neighbors and distance[J].Scientific Reports,2016,6:38208.
[1] 杨超, 刘志. 基于TASEP模型的复杂网络级联故障研究[J]. 计算机科学, 2020, 47(9): 265-269.
[2] 张梦月, 胡军, 严冠, 李慧嘉. 基于可见性图网络的中国专利申请关注度分析[J]. 计算机科学, 2020, 47(8): 189-194.
[3] 张清琪, 刘漫丹. 复杂网络社区发现的多目标五行环优化算法[J]. 计算机科学, 2020, 47(8): 284-290.
[4] 王慧, 乐孜纯, 龚轩, 武玉坤, 左浩. 基于特征分类的链路预测方法综述[J]. 计算机科学, 2020, 47(8): 302-312.
[5] 董明刚, 弓佳明, 敬超. 基于谱聚类的多目标进化社区发现算法研究[J]. 计算机科学, 2020, 47(6A): 461-466.
[6] 袁榕, 宋玉蓉, 孟繁荣. 一种基于加权网络拓扑权重的链路预测方法[J]. 计算机科学, 2020, 47(5): 265-270.
[7] 马扬, 程光权, 梁星星, 李妍, 杨雨灵, 刘忠. 有向加权网络中的改进SDNE算法[J]. 计算机科学, 2020, 47(4): 233-237.
[8] 刘苗苗,扈庆翠,郭景峰,陈晶. 符号网络链接预测算法研究综述[J]. 计算机科学, 2020, 47(2): 21-30.
[9] 张虎, 周晶晶, 高海慧, 王鑫. 融合节点结构和内容的网络表示学习方法[J]. 计算机科学, 2020, 47(12): 119-124.
[10] 李忠文, 丁烨, 花忠云, 李君一, 廖清. 结合三元组重要性的知识图谱补全模型[J]. 计算机科学, 2020, 47(11): 231-236.
[11] 阮子瑞,阮中远,沈国江. 基于交通路网的TASEP模型的扩展研究[J]. 计算机科学, 2020, 47(1): 265-269.
[12] 赵磊, 周金和. 基于复杂网络内容场的ICN能效优化策略[J]. 计算机科学, 2019, 46(9): 137-142.
[13] 陈晓军, 向阳. STransH:一种改进的基于翻译模型的知识表示模型[J]. 计算机科学, 2019, 46(9): 184-189.
[14] 陈航宇, 李慧嘉. 中国航空复杂网络的结构特征与应用分析[J]. 计算机科学, 2019, 46(6A): 300-304.
[15] 刘晓东, 魏海平, 曹宇. 考虑网络拓扑结构变化的SIRS模型的建立与稳定性分析[J]. 计算机科学, 2019, 46(6A): 375-379.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 雷丽晖,王静. 可能性测度下的LTL模型检测并行化研究[J]. 计算机科学, 2018, 45(4): 71 -75 .
[2] 孙启,金燕,何琨,徐凌轩. 用于求解混合车辆路径问题的混合进化算法[J]. 计算机科学, 2018, 45(4): 76 -82 .
[3] 张佳男,肖鸣宇. 带权混合支配问题的近似算法研究[J]. 计算机科学, 2018, 45(4): 83 -88 .
[4] 伍建辉,黄中祥,李武,吴健辉,彭鑫,张生. 城市道路建设时序决策的鲁棒优化[J]. 计算机科学, 2018, 45(4): 89 -93 .
[5] 史雯隽,武继刚,罗裕春. 针对移动云计算任务迁移的快速高效调度算法[J]. 计算机科学, 2018, 45(4): 94 -99 .
[6] 周燕萍,业巧林. 基于L1-范数距离的最小二乘对支持向量机[J]. 计算机科学, 2018, 45(4): 100 -105 .
[7] 刘博艺,唐湘滟,程杰仁. 基于多生长时期模板匹配的玉米螟识别方法[J]. 计算机科学, 2018, 45(4): 106 -111 .
[8] 耿海军,施新刚,王之梁,尹霞,尹少平. 基于有向无环图的互联网域内节能路由算法[J]. 计算机科学, 2018, 45(4): 112 -116 .
[9] 崔琼,李建华,王宏,南明莉. 基于节点修复的网络化指挥信息系统弹性分析模型[J]. 计算机科学, 2018, 45(4): 117 -121 .
[10] 杨羽琦,章国安,金喜龙. 车载自组织网络中基于车辆密度的双簇头路由协议[J]. 计算机科学, 2018, 45(4): 126 -130 .