Computer Science ›› 2021, Vol. 48 ›› Issue (9): 68-76. doi: 10.11896/jsjkx.210500203

Special Topic: Intelligent Data Governance Technologies and Systems


Heterogeneous Information Network Embedding with Incomplete Multi-view Fusion

ZHENG Su-su, GUAN Dong-hai, YUAN Wei-wei

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 211106, China
  • Received: 2021-05-28 Revised: 2021-06-24 Online: 2021-09-15 Published: 2021-09-10
  • Corresponding author: YUAN Wei-wei (yuanweiwei@nuaa.edu.cn)
  • About author: ZHENG Su-su (zhengsusu@nuaa.edu.cn), born in 1993, postgraduate. Her main research interests include network representation, data mining and complex network analysis.
    YUAN Wei-wei, born in 1981, Ph.D, associate professor. Her main research interests include machine learning, pattern recognition, social computing and recommender systems.
  • Supported by:
    Key Research and Development Program of Jiangsu Province (BE2019012) and Joint Fund of National Natural Science Foundation of China and Civil Aviation Administration of China (U2033202).

Abstract: Heterogeneous information network (HIN) embedding maps complex heterogeneous information into a low-dimensional dense vector space, which facilitates the computation and storage of network data. Most existing multi-view HIN embedding methods consider multiple semantic relationships between nodes but ignore the incompleteness of the views. Most views have missing data, and directly fusing multiple incomplete views degrades the quality of the embedding. To address this problem, we propose an HIN embedding method based on incomplete multi-view fusion, named IMHE (Incomplete Multi-view Fusion Based HIN Embedding). The key idea of IMHE is to aggregate neighbors from other views to reconstruct the incomplete views. Since the different single views describe the same network, neighbors in other views can restore, to some extent, the structural information of an incomplete view. IMHE first generates node sequences in each view and uses multi-head self-attention to learn single-view embeddings. For each incomplete view, IMHE finds the k-order neighbors of the missing nodes in the other views, then aggregates the single-view embeddings of those neighbors in the incomplete view to generate new embeddings for the missing nodes. Finally, IMHE uses multi-view canonical correlation analysis to obtain a unified embedding of the nodes while extracting the hidden semantic relationships across views. Experimental results on three real-world datasets show that the proposed method significantly improves embedding performance over existing methods.

Key words: Heterogeneous information network, Incomplete view, Multi-view fusion, Network embedding
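
The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how neighbor embeddings are aggregated, so a plain mean is assumed here, and the function names `k_order_neighbors` and `reconstruct_missing`, the adjacency-dict view format, and the toy data are all hypothetical. Learning the single-view embeddings (multi-head self-attention) and the final multi-view canonical correlation analysis are out of scope for this sketch.

```python
from collections import deque


def k_order_neighbors(adj, node, k):
    """Return all nodes within k hops of `node` in one view (excluding `node`).

    `adj` is an adjacency dict: node -> iterable of neighbor nodes.
    """
    seen = {node}
    result = set()
    frontier = deque([(node, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nxt in adj.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                result.add(nxt)
                frontier.append((nxt, depth + 1))
    return result


def reconstruct_missing(incomplete_emb, other_views, missing_nodes, k):
    """Generate embeddings for nodes that are missing from one view.

    incomplete_emb: node -> embedding (list of floats) learned in the incomplete view.
    other_views:    list of adjacency dicts for the remaining views.
    Assumption: neighbor embeddings are combined with an unweighted mean.
    """
    completed = dict(incomplete_emb)
    for node in missing_nodes:
        # Collect k-order neighbors of the missing node across the other views.
        neighbors = set()
        for adj in other_views:
            neighbors |= k_order_neighbors(adj, node, k)
        # Only neighbors that have an embedding in the incomplete view can contribute.
        vectors = [incomplete_emb[n] for n in neighbors if n in incomplete_emb]
        if not vectors:
            continue  # no usable neighbor: the node stays missing
        dim = len(vectors[0])
        completed[node] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return completed


# Toy data: node "p2" has no embedding in the incomplete view,
# but it is connected to "a1" (and, via "a1", to "p1") in another view.
emb = {"p1": [1.0, 0.0], "a1": [0.0, 2.0]}
view_b = {"p1": ["a1"], "a1": ["p1", "p2"], "p2": ["a1"]}
filled = reconstruct_missing(emb, [view_b], ["p2"], k=2)
print(filled["p2"])
```

With `k=2` the missing node draws on both `a1` and `p1`; with `k=1` only the direct neighbor `a1` contributes. A larger `k` recovers more nodes but mixes in embeddings of more distant neighbors, which is the trade-off such a parameter would control under these assumptions.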

CLC Number:

  • TP391