Computer Science, 2025, Vol. 52, Issue (3): 180-187. doi: 10.11896/jsjkx.231200138

• Database & Big Data & Data Science •

Heterogeneous Graph Attention Network Based on Data Augmentation

YANG Yingxiu1, CHEN Hongmei1,2, ZHOU Lihua1,2, XIAO Qing1

  1 School of Information Science and Engineering, Yunnan University, Kunming 650500, China
    2 Yunnan Key Laboratory of Intelligent Systems and Computing, Yunnan University, Kunming 650500, China
  • Received: 2023-12-19  Revised: 2024-02-02  Online: 2025-03-15  Published: 2025-03-07
  • Corresponding author: CHEN Hongmei (hmchen@ynu.edu.cn)
  • About author: YANG Yingxiu (2902690631@qq.com), born in 1998, postgraduate, is a member of CCF (No.R8725G). His main research interests include heterogeneous graph embedding and graph neural networks.
    CHEN Hongmei, born in 1976, Ph.D., associate professor, is a member of CCF (No.49450M). Her main research interests include spatial data mining and location-based social network analysis.
  • Supported by:
    National Natural Science Foundation of China (62266050, 62276227), Program for Young and Middle-aged Academic and Technical Reserve Leaders of Yunnan Province (202205AC160033), Yunnan Fundamental Research Projects (202201AS070015) and Program of Yunnan Key Laboratory of Intelligent Systems and Computing (202405AV340009).

Abstract: A heterogeneous graph is a graph composed of different types of nodes and edges, which can model various types of real-world objects and their relationships. Heterogeneous graph embedding aims to learn node embedding vectors by capturing the rich attribute, structural, and semantic information in the graph; the learned embeddings serve tasks such as node classification and link prediction, and in turn applications such as user identification and product recommendation. Existing embedding methods exploit meta-paths to capture high-order structural and semantic information between nodes. However, they ignore the differences among different types of nodes within meta-path instances, or among different types of neighbor nodes in the heterogeneous graph, which causes information loss and degrades the quality of node embeddings. To address these issues, this paper proposes a heterogeneous graph attention network based on data augmentation (HANDA) to better learn node embedding vectors. Firstly, an edge augmentation method based on meta-path neighbors is proposed. The method obtains each node's meta-path neighbors via meta-paths and augments the heterogeneous graph with semantic edges between nodes and their meta-path neighbors. These augmented edges not only encode high-order structural and semantic information between nodes but also alleviate the sparsity of the graph. Secondly, a node embedding method incorporating node-type attention is presented. The method adopts multi-head attention to learn, from multiple perspectives, the importance of direct-edge neighbors and augmented-edge neighbors, and incorporates node-type information into the attention. Through message passing over both kinds of neighbors, it simultaneously captures attribute, high-order structural, and semantic information, thereby improving the quality of node embeddings. Experimental results on real-world datasets show that HANDA outperforms baseline models on both node classification and link prediction tasks.
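To make the first step concrete, the following is a minimal sketch, not the authors' released code, of edge augmentation based on meta-path neighbors: the per-relation adjacency matrices along a meta-path are composed by sparse matrix products, and every resulting (node, meta-path neighbor) pair is added to the heterogeneous graph as a semantic edge. The function name semantic_edges, the variable A_pa, and the toy paper-author example are illustrative assumptions.

```python
# Minimal sketch of meta-path-neighbor edge augmentation (illustrative only).
# Composing the per-relation adjacency matrices along a meta-path yields a
# matrix whose non-zero entries link each node to its meta-path neighbors;
# each such pair becomes a new semantic edge in the augmented graph.
import scipy.sparse as sp

def semantic_edges(relation_adjs):
    """Return (src, dst) index arrays of semantic edges for one meta-path.

    relation_adjs: sparse adjacency matrices of the relations along the
    meta-path, e.g. [A_pa, A_ap] for the meta-path P-A-P (paper-author-paper).
    """
    reach = relation_adjs[0]
    for adj in relation_adjs[1:]:
        reach = reach @ adj            # compose relations along the meta-path
    reach = sp.csr_matrix(reach)
    reach.setdiag(0)                   # a node is not its own meta-path neighbor
    reach.eliminate_zeros()
    return reach.nonzero()             # endpoints of meta-path instances

# Hypothetical toy graph: 3 papers, 2 authors, meta-path P-A-P.
A_pa = sp.csr_matrix([[1, 0],
                      [1, 1],
                      [0, 1]])
src, dst = semantic_edges([A_pa, A_pa.T])
# The (src, dst) pairs are added to the heterogeneous graph as a dedicated
# semantic edge type, encoding high-order structure and densifying the graph.
```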

Key words: Heterogeneous graph, Embedding, Meta-path, Data augmentation, Graph neural network
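For the second step described in the abstract, below is a minimal PyTorch sketch, again an assumption rather than the paper's implementation, of multi-head attention that injects node-type embeddings into the attention computation and aggregates messages over the union of direct edges and augmented semantic edges. The class name TypeAwareAttention and all hyperparameters are hypothetical.

```python
# Minimal PyTorch sketch (not the paper's code) of node-type-aware multi-head
# attention over a node's direct-edge and augmented-edge neighbors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TypeAwareAttention(nn.Module):
    """GAT-style layer whose attention scores incorporate node-type embeddings."""

    def __init__(self, in_dim, out_dim, num_types, num_heads=4):
        super().__init__()
        assert out_dim % num_heads == 0
        self.num_heads, self.head_dim = num_heads, out_dim // num_heads
        self.proj = nn.Linear(in_dim, out_dim)
        self.type_emb = nn.Embedding(num_types, out_dim)    # node-type information
        self.attn_src = nn.Parameter(torch.empty(num_heads, self.head_dim))
        self.attn_dst = nn.Parameter(torch.empty(num_heads, self.head_dim))
        nn.init.xavier_uniform_(self.attn_src)
        nn.init.xavier_uniform_(self.attn_dst)

    def forward(self, x, node_type, edge_index):
        """x: [N, in_dim] features; node_type: [N] type ids;
        edge_index: [2, E] union of direct edges and semantic (augmented) edges."""
        N = x.size(0)
        h = self.proj(x) + self.type_emb(node_type)         # fuse type into features
        h = h.view(N, self.num_heads, self.head_dim)
        src, dst = edge_index
        # per-edge, per-head attention logits in the style of GAT
        logits = (h[src] * self.attn_src).sum(-1) + (h[dst] * self.attn_dst).sum(-1)
        logits = F.leaky_relu(logits, 0.2)                  # [E, num_heads]
        # softmax over each destination node's neighborhood
        exp = (logits - logits.max()).exp()
        denom = h.new_zeros(N, self.num_heads).index_add_(0, dst, exp) + 1e-16
        alpha = exp / denom[dst]                            # normalized attention
        # message passing: attention-weighted sum of neighbor features per head
        out = h.new_zeros(N, self.num_heads, self.head_dim)
        out.index_add_(0, dst, alpha.unsqueeze(-1) * h[src])
        return out.reshape(N, -1)                           # [N, out_dim]
```

In a full HANDA-style model, several such layers would be stacked and the resulting node embeddings fed to downstream node-classification or link-prediction heads.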

CLC Number: 

  • TP391