Computer Science, 2024, Vol. 51, Issue (4): 132-150. doi: 10.11896/jsjkx.230200084

• Database & Big Data & Data Science •

Benchmarking and Analysis for Graph Neural Network Node Classification Task

ZHANG Tao1,2, LIAO Bin3, YU Jiong2, LI Ming2,4, SUN Ruina4

  1 College of Information Engineering, Guizhou University of Traditional Chinese Medicine, Guiyang 550025, China
  2 School of Information Science and Engineering, Xinjiang University, Urumqi 830008, China
  3 College of Big Data Statistics, Guizhou University of Finance and Economics, Guiyang 550025, China
  4 College of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi 830012, China
  • Received: 2023-02-12 Revised: 2023-07-02 Online: 2024-04-15 Published: 2024-04-10
  • Corresponding author: LIAO Bin (liaobin665@163.com)
  • About author: (zt59921661@126.com)
  • Supported by:
    National Natural Science Foundation of China (61562078) and Tianshan Youth Talent Program of Xinjiang Uygur Autonomous Region (2018Q073).

Abstract: Compared with graph embedding methods, graph neural network (GNN) models achieve considerably better performance on tasks such as node classification, because their end-to-end architecture lets the learning of hidden node features be coordinated with the classification objective during training. However, the experimental comparisons in existing GNN work commonly suffer from several problems: a narrow range of dataset types, insufficient sample sizes, non-standard splitting of training and test sets, a limited scale and scope of compared models, a single evaluation metric, and no comparison of model training time. To address this, 20 datasets from different domains (citation networks, social networks, collaboration networks, etc.), including cora, citeseer, pubmed and deezer, are selected, and a comprehensive and fair node classification benchmark is conducted on 17 mainstream GNN models, including FastGCN, PPNP, ChebyNet and DAGNN, using accuracy, precision, recall, F-score and model training time as multi-dimensional evaluation metrics, in order to provide a decision reference for model selection in real business scenarios. The benchmark experiments show that, on the one hand, the factors affecting model training speed rank, in order, as node attribute dimension, number of graph nodes and number of graph edges; on the other hand, there is no winner-take-all model, that is, no model performs well on all datasets, and under a fair benchmark configuration, structurally simple models actually outperform complex GNN models.

Key words: Graph neural network, Benchmarking, Node classification, Performance evaluation, Model selection

CLC Number: TP391
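The abstract names accuracy, precision, recall, F-score and training time as the evaluation metrics. The following is a minimal sketch of how such metrics could be computed for predicted node labels. It is not the authors' evaluation code: the macro-averaging over classes, the function names `classification_report` and `timed`, and the toy labels are all assumptions made for illustration.

```python
import time
from collections import Counter


def classification_report(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (assumed averaging)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for c in classes:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy node labels (hypothetical, three classes).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
report = classification_report(y_true, y_pred)
```

In a benchmark of this kind, `timed` would wrap each model's training routine so that training time is recorded alongside the quality metrics for every model and dataset pair.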