Computer Science (计算机科学), 2022, Vol. 49, Issue (9): 76-82. doi: 10.11896/jsjkx.210900078

• Database & Big Data & Data Science

Author's Academic Behavior Prediction Based on Heterogeneous Network Representation Learning
(基于异构网络表征学习的作者学术行为预测)

HUANG Li (黄丽)1, ZHU Yan (朱焱)1, LI Chun-ping (李春平)2

  1 School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  2 School of Software, Tsinghua University, Beijing 100091, China
  • Received: 2021-09-10 Revised: 2022-01-25 Online: 2022-09-15 Published: 2022-09-09
  • Corresponding author: ZHU Yan (yzhu@swjtu.edu.cn)
  • Author e-mail: 793275643@qq.com
  • About author: HUANG Li, born in 1996, postgraduate. Her main research interests include representation learning, data mining and link prediction.
    ZHU Yan, born in 1965, Ph.D., professor, Ph.D. supervisor, is a member of the China Computer Federation. Her main research interests include data mining, Web anomaly and intelligent analysis.
  • Supported by:
    Sichuan Province Science and Technology Project (2019YFSY0032).

Abstract: Author academic behavior prediction aims to mine the behavioral relationships among authors in heterogeneous academic networks, in order to promote research collaboration and produce high-level, high-quality research results. Most existing node representation learning methods do not consider a node's semantic features, content features, or global structure, which makes it difficult to learn the low-dimensional characteristics of nodes effectively. To fuse the multi-dimensional features and the global structure of nodes, a heterogeneous network representation learning method, HNEMA, is proposed. It integrates BiLSTM, an attention mechanism, and a clustering algorithm, with the goal of improving the prediction of authors' academic behavior in academic networks. HNEMA first fuses the multi-dimensional features of each node using BiLSTM and the attention mechanism, aggregates neighbors of the same type along the same meta-path or across different meta-paths, and then aggregates the multi-dimensional features of all neighbors of the node being represented. On this basis, a clustering algorithm is used to capture the node's global features, so that the low-dimensional characteristics of the node are learned comprehensively and effectively. Building on these learned features, a logistic regression classifier is applied to predict authors' academic behavior. Validation experiments on three public datasets show that HNEMA achieves some improvement over the compared methods in both AUC and F1.

Key words: Heterogeneous network, Network representation learning, Link prediction, Meta-path
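
The attention-based neighbor aggregation mentioned in the abstract can be illustrated with a minimal sketch. It covers only the weighting-and-pooling idea, not the BiLSTM feature fusion or the clustering step, and it is not HNEMA's actual formulation: the dot-product scoring function, the names `attention_aggregate`, `author`, and `papers`, and the 3-dimensional feature vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_aggregate(target, neighbors):
    """Pool neighbor features into one vector using attention weights.

    Assumption: a neighbor's attention score is its dot product with the
    target node's features. A learned scoring function would normally be
    used instead; this is only a stand-in.
    """
    scores = [dot(target, n) for n in neighbors]
    weights = softmax(scores)
    dim = len(target)
    pooled = [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(dim)
    ]
    return weights, pooled


# Hypothetical features: one author node and three paper nodes that are
# reached through an author-paper meta-path.
author = [1.0, 0.0, 0.0]
papers = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]

weights, pooled = attention_aggregate(author, papers)
print("attention weights:", weights)
print("aggregated vector:", pooled)
```

In this toy input, the first paper is the most similar to the author and therefore receives the largest weight, while the other two papers share the remaining weight equally. In a full pipeline, vectors pooled this way for two authors would be combined into a pair feature and passed to a classifier such as logistic regression.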

CLC Number: TP183