Computer Science ›› 2021, Vol. 48 ›› Issue (12): 212-218. doi: 10.11896/jsjkx.201000015

• Database & Big Data & Data Science •

Deep Network Representation Learning Method on Incomplete Information Networks

FU Kun (富坤)1, ZHAO Xiao-meng (赵晓梦)1, FU Zi-tong (付紫桐)2, GAO Jin-hui (高金辉)1, MA Hao-ran (马浩然)3

  1. 1 School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
    2 School of Computer Sciences and Technology, Changchun University of Science and Technology, Changchun 130022, China
    3 Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China
  • Received: 2020-09-30 Revised: 2021-03-02 Online: 2021-12-15 Published: 2021-11-26
  • Corresponding author: FU Kun (fukun@hebut.edu.cn)
  • About author: FU Kun, born in 1979, Ph.D., associate professor. Her main research interests include social network analysis and network representation learning.
  • Supported by:
    Young Scientists Fund of the National Natural Science Foundation of China (61806072).

Abstract: Network representation learning (NRL) aims to embed the nodes of a network into a low-dimensional vector space, providing effective feature representations for downstream tasks. Because information is difficult to collect in real-world scenarios, large-scale networks often have missing links, yet most existing NRL models are designed under the assumption that the network is complete, so their performance degrades easily when links are missing. To address this problem, this paper proposes DNRL (Deep Network Representation Learning), a deep network representation learning method for incomplete information networks. First, a transition probability matrix is used to dynamically fuse structural and attribute information, compensating for the excessive loss caused by incomplete structural information. Then a variational autoencoder, a deep generative model with strong feature extraction capability, is used to learn low-dimensional node representations and capture the latent, highly nonlinear features of the network data. Experiments on three real-world attributed networks show that, compared with commonly used NRL models, the proposed model clearly improves node classification performance under different degrees of link missing, and in the visualization task it reflects the cluster structure of nodes more clearly.
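
The abstract does not give the exact form of the transition probability matrix, so the following is only a minimal sketch of the general idea of fusing structural and attribute information, not the DNRL formulation itself. The function names `row_normalize` and `fused_transition`, the mixing weight `alpha`, the use of cosine similarity on attributes, and the fallback for nodes without links are all assumptions made for illustration.

```python
import numpy as np

def row_normalize(m):
    """Turn a non-negative matrix into a row-stochastic transition matrix."""
    m = np.asarray(m, dtype=float)
    sums = m.sum(axis=1, keepdims=True)
    # Rows with no mass are left as all zeros instead of dividing by zero.
    return np.divide(m, sums, out=np.zeros_like(m), where=sums > 0)

def fused_transition(adj, attrs, alpha=0.5):
    """Hypothetical fusion of structure- and attribute-based transitions.

    adj   : (n, n) non-negative adjacency matrix, possibly with missing links
    attrs : (n, d) node attribute matrix
    alpha : assumed weight on the structural transitions for linked nodes
    """
    adj = np.asarray(adj, dtype=float)
    attrs = np.asarray(attrs, dtype=float)
    p_struct = row_normalize(adj)

    # Cosine similarity between attribute vectors, with self-similarity removed.
    norms = np.linalg.norm(attrs, axis=1, keepdims=True)
    unit = np.divide(attrs, norms, out=np.zeros_like(attrs), where=norms > 0)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    p_attr = row_normalize(np.clip(sim, 0.0, None))

    # Assumption: a node that has lost all of its links relies entirely on
    # attribute-based transitions; other nodes mix the two with weight alpha.
    has_links = adj.sum(axis=1, keepdims=True) > 0
    weight = np.where(has_links, alpha, 0.0)
    return weight * p_struct + (1.0 - weight) * p_attr
```

In this sketch, a node whose links are all missing still receives a usable transition row from its attributes, which is one simple way to read "compensating for incomplete structural information". A matrix of this kind could then serve as input to an encoder such as a variational autoencoder.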

Key words: Attribute network, Incomplete information, Network representation learning, Variational autoencoder

CLC Number: 

  • TP391