Computer Science ›› 2021, Vol. 48 ›› Issue (12): 212-218. doi: 10.11896/jsjkx.201000015

• Database & Big Data & Data Science •

• Corresponding author: FU Kun (fukun@hebut.edu.cn)

Deep Network Representation Learning Method on Incomplete Information Networks

FU Kun1, ZHAO Xiao-meng1, FU Zi-tong2, GAO Jin-hui1, MA Hao-ran3   

  1 School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
    2 School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China
    3 Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China
  • Received: 2020-09-30 Revised: 2021-03-02 Published: 2021-11-26
  • About author: FU Kun, born in 1979, Ph.D., associate professor. Her main research interests include social network analysis and network representation learning.
  • Supported by:
    Young Scientists Fund of the National Natural Science Foundation of China (61806072).


Abstract: The goal of network representation learning (NRL) is to embed network nodes into a low-dimensional vector space, providing effective feature representations for downstream tasks. Because information collection is difficult in real-world scenarios, large-scale networks often contain missing links between nodes. However, most existing NRL models are designed under the assumption that the network is complete, so their performance degrades easily on incomplete networks. To solve this problem, a deep network representation learning (DNRL) method for incomplete information networks is proposed. First, a transfer probability matrix is used to dynamically fuse structural information and attribute information, compensating for the excessive loss caused by incomplete structural information. Then, a deep generative model with strong feature extraction capability, the variational autoencoder, is used to learn low-dimensional representations of nodes and to capture the latent, highly non-linear features of network data. Experimental results on three real-world attributed networks show that, compared with commonly used NRL models, the proposed model clearly improves node classification under different degrees of link missing, and its visualization results more clearly reveal the cluster relationships among nodes.
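The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give DNRL's exact weighting scheme, so everything below is an illustrative assumption rather than the authors' published formulation: the function name `fused_transition_matrix`, the `alpha` parameter, and the degree-based dynamic weight are all hypothetical stand-ins for the paper's dynamic fusion of structural and attribute transition probabilities.

```python
import numpy as np

def fused_transition_matrix(A, X, alpha=0.8):
    """Hypothetical sketch of dynamically fusing structural and
    attribute information into one transfer probability matrix.
    A: adjacency matrix (n x n), X: node attribute matrix (n x d).
    Returns a row-stochastic transition matrix."""
    n = A.shape[0]
    # structural transition probabilities; isolated nodes (e.g. caused
    # by missing links) fall back to a uniform distribution
    deg = A.sum(axis=1, keepdims=True)
    P_s = np.where(deg > 0, A / np.maximum(deg, 1), 1.0 / n)
    # attribute-based transition: non-negative cosine similarity,
    # normalised so each row is a probability distribution
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    S = np.clip(Xn @ Xn.T, 0, None)
    P_a = S / np.maximum(S.sum(axis=1, keepdims=True), 1e-12)
    # assumed dynamic weighting: the sparser a node's neighbourhood,
    # the more weight its attribute similarities receive
    w = alpha * (deg / max(deg.max(), 1))
    return w * P_s + (1 - w) * P_a

# toy network where the 0-2 link is missing but attributes match
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.array([[1, 0], [1, 1], [1, 0]], dtype=float)
P = fused_transition_matrix(A, X)
assert np.allclose(P.sum(axis=1), 1.0)  # rows stay probability distributions
assert P[0, 2] > 0  # attribute similarity restores the missing 0-2 transition
```

In the full DNRL pipeline, a matrix fused along these lines would then be fed to a variational autoencoder to obtain the low-dimensional node representations.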

Key words: Network representation learning, Attributed network, Incomplete information, Variational autoencoder

CLC Number: TP391