Computer Science, 2020, Vol. 47, Issue (12): 119-124. doi: 10.11896/jsjkx.190900027

• Database & Big Data & Data Science •

Network Representation Learning Method on Fusing Node Structure and Content

ZHANG Hu, ZHOU Jing-jing, GAO Hai-hui, WANG Xin

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2019-09-09 Revised: 2020-04-03 Online: 2020-12-15 Published: 2020-12-17
  • Corresponding author: ZHANG Hu (zhanghu@sxu.edu.cn)
  • About author: ZHANG Hu, born in 1979, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include natural language processing and representation learning.
  • Supported by:
    National Social Science Fund of China (18BYY074), National Natural Science Foundation of China (61936012, 61806117) and Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (201802012).

Abstract: With the rapid development of neural network technology, network representation learning for complex networks has attracted increasing attention. It aims to learn low-dimensional latent representations of the nodes in a network and to apply the learned representations effectively to various graph-based analysis tasks. Typical shallow random-walk network representation learning models are mainly based on either node structure similarity or node content similarity. However, these methods cannot effectively capture structural and content similarity at the same time, so they perform poorly on network data in which structural equivalence and content equivalence are mixed. To this end, this paper explores the fused features of node structure similarity and node content similarity, and proposes SN2vec, a representation method based on joint learning with unsupervised shallow neural networks. To validate the proposed model, multi-label classification and dimensionality-reduction visualization tasks are conducted on the Brazilian air-traffic, American air-traffic, and Wikipedia datasets, in which node structure and content equivalence are mixed. The results show that SN2vec achieves a better Micro-F1 on the multi-label classification task than existing shallow random-walk network representation methods, and that it can effectively learn nodes whose latent structural representations are consistent.

Key words: Complex network, Network representation learning, Random walk, Shallow neural network
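
The abstract reports results as Micro-F1 on a multi-label classification task. As a reminder of what that metric measures, the following is a minimal sketch of micro-averaged F1, which pools true positives, false positives, and false negatives over all (node, label) decisions before computing F1. The function name `micro_f1` and the toy label sets are hypothetical illustrations; they are not taken from the paper or its datasets.

```python
from typing import Sequence, Set


def micro_f1(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 over all (sample, label) decisions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(y_true, y_pred):
        tp += len(true_labels & pred_labels)  # predicted and correct
        fp += len(pred_labels - true_labels)  # predicted but not in the truth
        fn += len(true_labels - pred_labels)  # in the truth but not predicted
    denom = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there is nothing to score.
    return 2 * tp / denom if denom else 0.0


# Toy labels (hypothetical, not from the paper's datasets).
truth = [{"a"}, {"a", "b"}, {"c"}]
preds = [{"a"}, {"b"}, {"b", "c"}]
print(micro_f1(truth, preds))
```

Because the counts are pooled before averaging, frequent labels weigh more heavily in Micro-F1 than rare ones, which is worth keeping in mind when comparing methods on datasets with imbalanced labels.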

CLC Number:

  • TP391