TNTlink Prediction Model Based on Feature Learning

doi:10.11896/jsjkx.190700020

Abstract

Abstract: In co-authorship networks, link prediction can identify links that are missing from the current network as well as links that are likely to form or dissolve. Inferring from the observed network information whether two authors will collaborate in the near future is of great significance for mining and analyzing network evolution and for reshaping network models. Link prediction is an important research direction in computer science and physics and has been studied in depth; the main approaches are based on Markov chains, machine learning and unsupervised learning. However, most of this work uses only a single kind of feature, namely network topology features or attribute features, and very few studies combine features across disciplines. This paper designs and develops the TNTlink model, which combines network topology features, basic features and additional features with domain knowledge from physics and computer science, and uses a deep neural network to integrate these features into a deep learning framework, achieving good results on the link prediction problem. Five datasets (ca-AstroPh, ca-CondMat, ca-GrQc, ca-HepPh and ca-HepTh), containing 69032 nodes and 450617 edges, are used. Binary similarity and fuzzy cosine similarity are used to compute and identify features from the captured information. If nodes show more similarity in these features (for example, similar nodes, the same keywords, or a close relationship between them), the two nodes are more likely to form a link. Besides node features, the influence of node importance on link formation is also considered, and a new link prediction index, MI, is proposed to distinguish strong influence from weak influence and to model the influence of important nodes. The proposed model is compared with mainstream classifiers on the five datasets, and the results show that MI and TNTlink effectively improve the AUC of link prediction.

Key words: Additional features, Basic features, Deep learning, Fuzzy cosine similarity, Link prediction, Topological features
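The abstract names similarity-based features and AUC evaluation without defining them here, so the following is only a minimal sketch under stated assumptions. It uses plain cosine similarity as a stand-in (the paper's fuzzy cosine similarity and its MI index are not specified in this abstract), a simple exact-match binary similarity, and the standard pairwise definition of AUC. All function names and the example vectors are hypothetical.

```python
import math
from itertools import product


def cosine_similarity(u, v):
    """Plain cosine similarity of two equal-length numeric vectors.

    Assumption: a stand-in for the paper's fuzzy cosine similarity,
    whose exact definition is not given in the abstract.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def binary_similarity(a, b):
    """Return 1.0 if two categorical attributes match exactly, else 0.0."""
    return 1.0 if a == b else 0.0


def auc(pos_scores, neg_scores):
    """AUC as the probability that a true link outscores a non-link.

    Every (positive, negative) pair is compared; ties count as 0.5.
    """
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical keyword-count vectors for two authors.
    author_a = [2, 0, 1]
    author_b = [1, 0, 1]
    print(cosine_similarity(author_a, author_b))
    print(binary_similarity("physics", "physics"))
    # Hypothetical scores for held-out true links and sampled non-links.
    print(auc([0.9, 0.8], [0.1, 0.8]))
```

In a feature-learning setup such as the one described, scores like these would be inputs to a classifier rather than final predictions; the pairwise AUC above is quadratic in the number of samples and is meant only for illustration.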

CLC number: TP391

WANG Hui, LE Zi-chun, GONG Xuan, ZUO Hao, WU Yu-kun. TNTlink Prediction Model Based on Feature Learning[J]. Computer Science, 2020, 47(12): 245-251. https://doi.org/10.11896/jsjkx.190700020


