Computer Science ›› 2020, Vol. 47 ›› Issue (12): 119-124. doi: 10.11896/jsjkx.190900027

• Database & Big Data & Data Science •

  • Corresponding author: ZHANG Hu (zhanghu@sxu.edu.cn)

Network Representation Learning Method on Fusing Node Structure and Content

ZHANG Hu, ZHOU Jing-jing, GAO Hai-hui, WANG Xin   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2019-09-09 Revised: 2020-04-03 Online: 2020-12-15 Published: 2020-12-17
  • About author: ZHANG Hu, born in 1979, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include natural language processing and representation learning.
  • Supported by:
    National Social Science Fund of China (18BYY074), National Natural Science Foundation of China (61936012, 61806117) and Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (201802012).

Abstract: With the rapid development of neural network technology, network representation learning methods for complex networks have attracted increasing attention. They aim to learn low-dimensional latent representations of the nodes in a network and to apply the learned feature representations effectively to various graph-based analysis tasks. Typical shallow random-walk network representation learning models are based mainly on either node structure similarity or node content similarity. They cannot effectively capture both kinds of similarity at the same time, and therefore perform poorly on network data in which structural equivalence and content equivalence are mixed. To this end, this paper explores fused features of node structure similarity and node content similarity, and proposes SN2vec, a representation method based on the joint learning of unsupervised shallow neural networks. To validate the proposed model, multi-label classification and dimensionality-reduction visualization experiments are conducted on the Brazilian air-traffic, American air-traffic, and Wikipedia datasets, in which structural and content equivalence are mixed. The results show that SN2vec achieves a higher Micro-F1 score in multi-label classification than existing shallow random-walk network representation methods, and that it can learn nodes with consistent latent structural representations well.

Key words: Network representation learning, Random walk, Complex network, Shallow neural network

CLC Number: 

  • TP391