基于特征分类的链路预测方法综述

doi:10.11896/jsjkx.190700136

摘要/Abstract

摘要： 复杂网络链路预测作为网络科学研究中一个重要的研究方向, 受到了越来越多来自各个学科领域专家的关注, 它可以利用现有的网络信息, 如节点和边缘的特征, 来预测未来可能形成的关系、网络中缺失的信息以及新的或正在消失的信息, 识别虚假交互, 评估网络演化机制, 进行网络重构等。当前链路预测的文献主要来自工程学、计算机科学与物理学的专家, 它们各自为政, 缺少合作, 结合多学科进行链路预测的综述论文少之又少。因此, 文中从计算机科学和物理学的视角全面回顾、分析和讨论基于特征分类的链路预测算法的研究进展, 介绍了该领域专家们提出的多种特征提取技术, 首次把分层的思想引入链路预测算法分类中, 将分类模型分为3层, 即元数据层、特征分类层和特征抽取层。该分类模型包括“2个大块7个方面”, 即把常用的链路预测算法分为2个大块(特征提取方法和特征学习方法)和7个方面(基于相似性的方法、基于似然分析的方法、基于概率模型的方法、矩阵分解方法、基于随机游走的方法、基于神经网络的方法和基于自定义损失函数的方法)。该分类方法覆盖了各学科中许多经典的和最新的链路预测技术, 包括当前最流行的图神经网络链路预测技术GNN(Graph Neural Network), GCN(Graph Convolutional Network), RNN(Recurrent Neural Network)和RL(Reinforcement Learning)。文中研究了这些算法的模型复杂性和预测性能的差异, 并对当前链路预测技术未来所面临的挑战进行了讨论。

关键词: 特征分类, 复杂网络, 机器学习, 链路预测, 图神经网络

Abstract: As an important research direction of network science research, link prediction of complex network has been more and more attention from experts from various disciplines, and it can make use of existing network information, such as the features of the nodes and edges, to predict possible future relationships, missing information in the network, and new or disappearing information, identify the false interactions, evaluate network evolution mechanism and conduct network reconfiguration, etc.Current articles on link prediction mainly come from the expert in engineering, computer science and physics.They are independent and lack cooperation, and there are few review articles combining multidisciplinary link prediction.The goal of this article is from the perspective of computer science and physics comprehensive to review, analyze, and discusse the research progress of link prediction algorithm based on feature classification, introduces a variety of feature extraction techniques in this field.This paper firstly introduces the idea of layering into the classification of link prediction algorithm, which divides the classification model into three layers, the metadata layer, features classification layer and feature extraction layer.The classification model includes two parts and seven aspects, that is, the prediction algorithm is divided into two parts, features extraction method and features learning method.The seven aspects are the similarity based methods, probabilistic methods, likelihood based methods, matrix factorization based method, random walk based method, neural network based method and custom loss function based method.This classification method covers many classical and latest link prediction techniques in various disciplines, including GNN, GCN, RNN and RL, which are the most popular link prediction techniques in graph neural networks.The differences of these algorithms in model complexity and prediction performance are studied, and the challenges of current link prediction technology in the future are discussed.

Key words: Complex network, Feature classification, Graph neural network, Link prediction, Machine learning

中图分类号:

TP391

王慧, 乐孜纯, 龚轩, 武玉坤, 左浩. 基于特征分类的链路预测方法综述[J]. 计算机科学, 2020, 47(8): 302-312. https://doi.org/10.11896/jsjkx.190700136

WANG Hui, LE Zi-chun, GONG Xuan, WU Yu-kun, ZUO Hao. Review of Link Prediction Methods Based on Feature Classification[J]. Computer Science, 2020, 47(8): 302-312. https://doi.org/10.11896/jsjkx.190700136

参考文献

[1]RAMESH R S.Link prediction and path analysis using markov chains[J].Computer Networks, 2000, 60(33):377-386.
[2]LIBEN-NOWEL D, KLEINBERGN, JON K.The link-prediction problem forsocial networks[J].Journal of The American Society For InformationScience and Technology, 2007, 58(7):1019-1031.
[3]LV L L.Link prediction on complex networks[J].Journal ofUniversity of Electronic Science and Technology of China, 2010, 39(5):651-661.
[4]HASAN M A, ZAKI M.A survey of link prediction in socialnetworks[J].Social Network Data Analytics, 2011, 40(5):243-275.
[5]LV L L, REN X L, ZHOU T.Network link prediction:concepts and frontiers [J].Communication of China Computer Society, 2016, 12(4):12-21.
[6]VALVERDE R J, LOPES A A.Exploiting behaviors of communities of twitter users for link prediction[J].Soc. Netw. Anal. Min., 2013, 20(3):1063-1074.
[7]LIU H, HU Z, HADDADI H.Hidden link prediction based on node centrality and weak ties[J].Europhys Lett., 2013, 101(1):65-75.
[8]YANG S H, LONG B, SMOLA A, et al.Like like alike:jointfriendship and interest propagation in social networks[C]∥Proceedings of the 20th International Conference on World Wide Web.ACM, 2011:537-546.
[9]CHO E, MYERS S A, LESKOVEC J.Friendship and mobility:user movement in location-based social networks[C]∥Procee-dings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM.2011:1082-1090.
[10]LEO K.A new status index derived from sociometric analysis[J].Psychometrika, 1953, 50(18):39-43.
[11]ELIZABETH A L, PETTER H, MARK E N.Vertex similarityin networks[J].Physical Review E, 2006, 73(2):121-130.
[12]JEH G, WIDOM J.SimRank:a measure of structural-contextsimilarity[C]∥Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02).ACM, 2002:538-543.
[13]KLOUMANN I M, UGANDER J, KLEINBERG J.Block mo-dels and personalized PageRank[J].The National Academy of Sciences, 2017, 114(1):33-38.
[14]SHANG M S, L L, ZENG W, et al.Relevance is more significant than correlation:Information filtering on sparse data[J].Europhys Lett., 2009, 88(6):68008p1-68008p4.
[15]NEWMAN M E J.Clustering and preferential attachment ingrowing networks[J].Phys. Rev. E., 2001, 64(10):251-264.
[16]SALTON G, MCGILLM.Introduction to Modern InformationRetrieval[J].McGraw-Hill, 1983, 32(6):528-536.
[17]ADAMIC L A, ADAR E.Friend and neighbors on the web[J].Soc Networks, 2003, 25(6):211-230.
[18]LIU Y Y, ZHAO C L, WANG X J.The degree-related clustering coefficient and its application to link prediction[J].Physica A, 2016, 454(15):24-33.
[19]ZHOU T, L L, ZHANG Y C.Predicting missing links via local information[J].Eur. Phys. J. B., 2009, 80(71):623-630.
[20]RAVASZ E, SOMERA A L, MONGRU D, et al.Hierarchicalorganization of modularity in metabolic networks[J].Science, 2002, 297(12):1551-1558.
[21]LV L, JIN C H, ZHOU T.Similarity index based on local paths for link prediction of complex network[J].Phys. Reve., 2009, 80(4):211-223.
[22]WANG P, XU B W, WU Y R.Link predictionin social net-works:the state-of-the-art[J].Science China Information Scien-ces, 2015, 58(1):1-38.
[23]HU R J, TANG J X, YUAN Y N.Link prediction in complex networks based on the interactions among paths[J].Physical A, 2019, 510(4):52-67.
[24]ALEXIS P, PANAGIOTIS S, YANNIS M.Fast and accuratelink prediction in social networking systems[J].Systems and Software, 2012, 60(9):2119-2132.
[25]ZHU X, TIAN H, CAI S, et al.Predicting missing links via significant paths[J].EPL, 2014, 108(4):18008-18014.
[26]AARON C, CRISTOPHER M, MARK E J.Hierarchicalstructure and theprediction of missing links in networks[J].Nature, 2008, 453(71):98-101.
[27]GUIMER R, SALES P M.Missing and spurious interactionsand the reconstruction of complex networks[J].PNAS, 2009, 106(52):22073-22078.
[28]BEN T, PIETER A, WONG M F.Relationalmarkov networks[J].Introduction to Statistical Relational Learning, 2007, 40(13):175-200.
[29]LV L Y.ZHOU T.Link prediction in complex networks:A survey[J].Physica A:Statistical Mechanics and Its Applications, 2011, 390(6):1150-1170.
[30]MENON A K, ELKAN C.Link prediction via matrix factorization in Machine Learning and Knowledge Discovery in Databases[C]∥Springer.2011:437-452.
[31]ADITYA K M, CHARLES E.Link Prediction via Matrix Factorization[C]∥Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases.ACM, 2011:437-452.
[32]KUNEGIS J, LOMMATZSCH A.Learning spectral graphtransformations for link prediction[C]∥ Proceedings of the 26th Annual International Conference on Machine Learning.ACM, 2009:561-568.
[33]CHEN Z, ZHANG W.A Marginalized Denoising Method forLink Prediction in Relational Data[C]∥ Proceedings of the 2014 SIAM International Conference on Data Mining.ACM, 2015:64-72.
[34]DAI L Y, KONG X Z, YUAN S S, et al.Integrative graph regularized matrix factorization for drug-pathway associations analysis[J].Computational Biology and Chemistry, 2019, 78(7):474-480.
[35]REN Y W, ZHOU J L, WANG J.Quality-Relevant Fault Monitoring Based on Locally Linear Embedding Orthogonal Projection to Latent Structure[J].Industrial & Engineering Chemistry Research, 2019, 58(3):1262-1272.
[36]OU M D, CUI P, PEI J, et al.Asymmetrictransitivity preserving graph embedding[C]∥Proceedings of the 22nd ACMSIGKDD International Conference on Knowledge Discovery and Data Mining.ACM, 2017:1105-1114.
[37]BELKIN M, NIYOGI P.Laplacian eigenmaps and spectral techniques for embedding and clustering[C]∥Advances in Neural Information Processing Systems.2002:585-591.
[38]CAO S, LU W, XU Q.Grarep:Learning graph representations with global structural information[C]∥Proceedings of the 24th ACM International on Conference on Information and Know-ledge Management.ACM, 2015:891-900.
[39]WANG H W, WANG J L, XIE X, et al.GraphGAN:Graph representation learningwith generative adversarial nets[J].IEEE Transaction on Knowledge and Data Engineering, 2017, 30(22):11-19.
[40]CAI L J, XU Y B, HE T Q, et al.A New Algorithm of DeepWalk Based On Probability[J].Journal of Physics:Conference Series, 2019, 1069(1):130-135.
[41]GROVER A, LESKOVEC J.Node2Vec:Scalable Feature learning for Network[J].HHS Public Access, 2016, 14(8):855-864.
[42]WANG D, CUI P, ZHU W.Structural deep network embedding[C]∥Proceedings of the 22nd International Conference on Knowledge Discovery and Data Mining.ACM, 2016:1225-1234.
[43]FRENSNO D, CARLOS W L, SARAI M C, et al.DNGR-1 in dendritic cells limits tissue damage by dampening neutrophil recruitment[J].Science, 2019, 6412(362):351-356.
[44]TOMAS N K, MAX W, Variational Graph Auto-Encoders[J].Springer, 2016, 28(3):61-63.
[45]LUCA F, MATHIAS N, MASSIMILIANO P.Learning Discrete
Structures for Graph Neural Networks[J].ICML.2019, 30(11):960-973.
[46]YOU J X, REX Y, XIANG R.Graphrnn:Generating realisticgraphs with deepauto-regressive models[C]∥International Conference on Machine Learning.ACM, 2018:5694-5703.
[47]MONTI F, BRONSTEIN M, BRESSON X.Geometric matrixcompletionwith recurrent multi-graph neural networks[J].Advances in NeuralInformation Processing Systems, 2017, 30(4):3697-3707.
[48]BUI L Q, PHAN A V, NGUYEN Y L H.DGCNN:A convolutional neural network over large-scale labeled graphs[J].Neural Networks:The Official Journal of the International Neural Network Society, 2017, 108(11):533-543.
[49]CAO D N, KIPF T.Molgan:An implicit generative model forsmall molecular graphs[J].ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models, 2018, 97(30):181-192.
[50]TANG J, QU M, WANG M, et al.Line:Large-scale information network embedding[C]∥Ternational Conference on Word Wide Web.ACM, 2015:1067-1077.
[51]KIM Y H, YOO J, KIM S J, et al.Comparison of Scattering Cross-Sections by MCNP5 and TRANSX/TWODANT Codes in Sodium-Cooled Fast Reactor[J].Transactions of the American Nuclear Society, 2018, 106(1):719-722.
[52]WANG H, LE Z C, GONG X, et al.Link predicton of complex network is analyzed from the perspective of informatics[J].Journal of Chinese Computer Systems, 2020, 41(2):316-326.

相关文章 15

[1]	周芳泉, 成卫青. 基于全局增强图神经网络的序列推荐 Sequence Recommendation Based on Global Enhanced Graph Neural Network 计算机科学, 2022, 49(9): 55-63. https://doi.org/10.11896/jsjkx.210700085
[2]	郑文萍, 刘美麟, 杨贵. 一种基于节点稳定性和邻域相似性的社区发现算法 Community Detection Algorithm Based on Node Stability and Neighbor Similarity 计算机科学, 2022, 49(9): 83-91. https://doi.org/10.11896/jsjkx.220400146
[3]	冷典典, 杜鹏, 陈建廷, 向阳. 面向自动化集装箱码头的AGV行驶时间估计 Automated Container Terminal Oriented Travel Time Estimation of AGV 计算机科学, 2022, 49(9): 208-214. https://doi.org/10.11896/jsjkx.210700028
[4]	宁晗阳, 马苗, 杨波, 刘士昌. 密码学智能化研究进展与分析 Research Progress and Analysis on Intelligent Cryptology 计算机科学, 2022, 49(9): 288-296. https://doi.org/10.11896/jsjkx.220300053
[5]	闫佳丹, 贾彩燕. 基于双图神经网络信息融合的文本分类方法 Text Classification Method Based on Information Fusion of Dual-graph Neural Network 计算机科学, 2022, 49(8): 230-236. https://doi.org/10.11896/jsjkx.210600042
[6]	何强, 尹震宇, 黄敏, 王兴伟, 王源田, 崔硕, 赵勇. 基于大数据的进化网络影响力分析研究综述 Survey of Influence Analysis of Evolutionary Network Based on Big Data 计算机科学, 2022, 49(8): 1-11. https://doi.org/10.11896/jsjkx.210700240
[7]	李瑶, 李涛, 李埼钒, 梁家瑞, Ibegbu Nnamdi JULIAN, 陈俊杰, 郭浩. 基于多尺度的稀疏脑功能超网络构建及多特征融合分类研究 Construction and Multi-feature Fusion Classification Research Based on Multi-scale Sparse Brain Functional Hyper-network 计算机科学, 2022, 49(8): 257-266. https://doi.org/10.11896/jsjkx.210600094
[8]	张光华, 高天娇, 陈振国, 于乃文. 基于N-Gram静态分析技术的恶意软件分类研究 Study on Malware Classification Based on N-Gram Static Analysis Technology 计算机科学, 2022, 49(8): 336-343. https://doi.org/10.11896/jsjkx.210900203
[9]	陈明鑫, 张钧波, 李天瑞. 联邦学习攻防研究综述 Survey on Attacks and Defenses in Federated Learning 计算机科学, 2022, 49(7): 310-323. https://doi.org/10.11896/jsjkx.211000079
[10]	齐秀秀, 王佳昊, 李文雄, 周帆. 基于概率元学习的矩阵补全预测融合算法 Fusion Algorithm for Matrix Completion Prediction Based on Probabilistic Meta-learning 计算机科学, 2022, 49(7): 18-24. https://doi.org/10.11896/jsjkx.210600126
[11]	杨炳新, 郭艳蓉, 郝世杰, 洪日昌. 基于数据增广和模型集成策略的图神经网络在抑郁症识别上的应用 Application of Graph Neural Network Based on Data Augmentation and Model Ensemble in Depression Recognition 计算机科学, 2022, 49(7): 57-63. https://doi.org/10.11896/jsjkx.210800070
[12]	李亚茹, 张宇来, 王佳晨. 面向超参数估计的贝叶斯优化方法综述 Survey on Bayesian Optimization Methods for Hyper-parameter Tuning 计算机科学, 2022, 49(6A): 86-92. https://doi.org/10.11896/jsjkx.210300208
[13]	赵璐, 袁立明, 郝琨. 多示例学习算法综述 Review of Multi-instance Learning Algorithms 计算机科学, 2022, 49(6A): 93-99. https://doi.org/10.11896/jsjkx.210500047
[14]	肖治鸿, 韩晔彤, 邹永攀. 基于多源数据和逻辑推理的行为识别技术研究 Study on Activity Recognition Based on Multi-source Data and Logical Reasoning 计算机科学, 2022, 49(6A): 397-406. https://doi.org/10.11896/jsjkx.210300270
[15]	黄璞, 沈阳阳, 杜旭然, 杨章静. 基于局部约束特征线表示的人脸识别 Face Recognition Based on Locality Constrained Feature Line Representation 计算机科学, 2022, 49(6A): 429-433. https://doi.org/10.11896/jsjkx.210300169

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed