Computer Science, 2022, Vol. 49, Issue (9): 64-69. doi: 10.11896/jsjkx.220500196

• Database & Big Data & Data Science

Scientific Paper Heterogeneous Graph Node Representation Learning Method Based on Unsupervised Clustering Level

SONG Jie, LIANG Mei-yu, XUE Zhe, DU Jun-ping, KOU Fei-fei

  1. Beijing Key Laboratory of Intelligent Communication Software and Multimedia, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2022-05-20  Revised: 2022-07-05  Published: 2022-09-15  Online: 2022-09-09
  • Corresponding author: LIANG Mei-yu (meiyu1210@bupt.edu.cn)
  • About author (songs@bupt.edu.cn): SONG Jie, born in 1997, master. His main research interests include data mining, information retrieval and machine learning.
    LIANG Mei-yu, born in 1985, associate professor, Ph.D. Her main research interests include artificial intelligence, data mining, multimedia information processing and computer vision.
  • Supported by:
    National Key R&D Program of China (2018YFB1402600) and National Natural Science Foundation of China (61877006, 61802028, 62002027).

Abstract: Knowledge representation of scientific paper data remains an open problem, and learning representations of the paper nodes in a scientific paper heterogeneous network is the core of solving it. This paper proposes an unsupervised cluster-level scientific paper heterogeneous graph node representation learning method (UCHL) to obtain representations of the nodes (authors, institutions, papers, etc.) in a scientific paper heterogeneous graph. Based on these representations, link prediction is performed on the entire heterogeneous graph to obtain the edge relationships between nodes, that is, the associations between papers. Experimental results show that the proposed method achieves better performance on multiple evaluation metrics on real scientific paper datasets.

Key words: Scientific paper, Heterogeneous graph network, Graph representation learning, Link prediction, Unsupervised learning
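
The abstract does not state which scoring function UCHL uses for link prediction. As a minimal sketch of how link prediction over learned node representations is commonly done, the example below scores a candidate paper-paper edge with the sigmoid of the dot product of the two node embeddings. The function names `link_score` and `rank_candidates`, the node identifiers and the embedding values are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def link_score(u, v):
    """Return a score in (0, 1) for an edge between two embedded nodes.

    Assumption: sigmoid of the dot product, a common generic decoder,
    not necessarily the one used by UCHL.
    """
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


# Hypothetical 3-dimensional paper embeddings (illustrative values only).
embeddings = {
    "paper_a": [0.9, 0.1, 0.3],
    "paper_b": [0.8, 0.2, 0.4],
    "paper_c": [-0.7, 0.5, -0.6],
}


def rank_candidates(source, embeddings):
    """Rank every other node by its link score with `source`, highest first."""
    scores = [
        (other, link_score(embeddings[source], vec))
        for other, vec in embeddings.items()
        if other != source
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates("paper_a", embeddings)
for node, score in ranking:
    print(f"paper_a -> {node}: {score:.3f}")
```

In this toy setting, papers whose embeddings point in similar directions receive scores above 0.5 and are ranked first, which is the behaviour a link-prediction evaluation would then compare against held-out edges.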

CLC Number:

  • TP391