Computer Science ›› 2024, Vol. 51 ›› Issue (1): 143-149.doi: 10.11896/jsjkx.230600079

• Database & Big Data & Data Science •

Pre-training of Heterogeneous Graph Neural Networks for Multi-label Document Classification

WU Jiawei1, FANG Quan2, HU Jun2, QIAN Shengsheng2   

1 Henan Institute of Advanced Technology,Zhengzhou University,Zhengzhou 450002,China
    2 National Laboratory of Pattern Recognition,Institute of Automation,Chinese Academy of Sciences,Beijing 100190,China
  • Received:2023-06-08 Revised:2023-10-09 Online:2024-01-15 Published:2024-01-12
• About author:WU Jiawei,born in 1998,postgraduate.His main research interests include multi-label classification,graph neural networks and knowledge graphs.
    FANG Quan,born in 1988,associate professor.His main research interest is multimedia knowledge computing.
  • Supported by:
    National Natural Science Foundation of China(62072456,62036012,62106262).

Abstract: Multi-label document classification aims to associate document instances with their relevant labels, and has received increasing research attention in recent years. Existing multi-label document classification methods attempt to exploit information beyond the text itself, such as document metadata or label structure. However, these methods either use only the surface semantics of the metadata or disregard the long-tail distribution of labels, thereby ignoring the higher-order relationships between documents and their metadata as well as the distribution pattern of labels, which limits classification accuracy. Therefore, this paper proposes a new multi-label document classification method based on pre-training of heterogeneous graph neural networks. The method constructs a heterogeneous graph from documents and their metadata, adopts two contrastive pre-training tasks to capture the relationships between documents and their metadata, and mitigates the long-tail distribution of labels with a balanced loss function, thereby improving classification accuracy. Experimental results on the benchmark dataset show that the proposed method outperforms Transformer, BertXML and MATCH by 8%, 4.75% and 1.3%, respectively.
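The abstract's two technical ingredients, contrastive pre-training over document-metadata pairs (cf. InfoNCE [22]) and a loss that balances the long-tailed label distribution (cf. the class-balanced loss [24]), can be illustrated with a minimal PyTorch sketch. This is a hypothetical illustration of the cited techniques, not the paper's implementation; every function name, tensor shape and hyperparameter below is an assumption.

```python
# Hypothetical sketch, not the paper's code: an InfoNCE contrastive loss over
# document-metadata embedding pairs [22], plus a class-balanced BCE [24] that
# down-weights head labels under a long-tailed label distribution.
import torch
import torch.nn.functional as F

def info_nce(doc_emb, meta_emb, temperature=0.1):
    # Row i of doc_emb and row i of meta_emb form a positive pair
    # (a document and one of its metadata nodes); the other rows in
    # the batch act as in-batch negatives.
    doc = F.normalize(doc_emb, dim=-1)
    meta = F.normalize(meta_emb, dim=-1)
    logits = doc @ meta.t() / temperature              # (B, B) similarities
    targets = torch.arange(doc.size(0), device=doc.device)
    return F.cross_entropy(logits, targets)

def class_balanced_bce(logits, labels, samples_per_label, beta=0.999):
    # Effective number of samples per label: (1 - beta^n) / (1 - beta);
    # its inverse re-weights BCE so tail labels are not swamped.
    effective_num = 1.0 - torch.pow(beta, samples_per_label)
    weights = (1.0 - beta) / effective_num             # shape (L,)
    weights = weights / weights.sum() * weights.numel()
    return F.binary_cross_entropy_with_logits(
        logits, labels.float(), weight=weights)

# Toy usage: 4 documents, 16-dim node embeddings, 6 labels.
doc_emb, meta_emb = torch.randn(4, 16), torch.randn(4, 16)
label_logits = torch.randn(4, 6)
labels = torch.randint(0, 2, (4, 6))
counts = torch.tensor([500., 200., 50., 10., 5., 2.])  # long-tailed counts
loss = info_nce(doc_emb, meta_emb) + class_balanced_bce(label_logits, labels, counts)
```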

Key words: Multi-label document classification, Metadata, Heterogeneous graph neural network, Pre-training, Long-tail distribution

CLC Number: TP391
[1]DONG Y,MA H,SHEN Z,et al.A century of science:Globalization of scientific collaborations,citations,and innovations[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:1437-1446.
[2]WANG K,SHEN Z,HUANG C,et al.Microsoft academic graph:When experts are not enough[J].Quantitative Science Studies,2020,1(1):396-413.
[3]MINAEE S,KALCHBRENNER N,CAMBRIA E,et al.Deep learning-based text classification:a comprehensive review[J].ACM Computing Surveys(CSUR),2021,54(3):1-40.
[4]ZHANG Y,SHEN Z,DONG Y,et al.MATCH:Metadata-aware text classification in a large hierarchy[C]//Proceedings of the Web Conference 2021.2021:3246-3257.
[5]AGGARWAL C C,ZHAI C X.A survey of text classification algorithms[M]//Mining text data.2012:163-222.
[6]HAO C,QIU H P,SUN Y,et al.Research Progress of Multi-label Text Classification[J].Computer Engineering and Applications,2021,57(10):48-56.
[7]LIU J,CHANG W C,WU Y,et al.Deep learning for extreme multi-label text classification[C]//Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.2017:115-124.
[8]YOU R,ZHANG Z,WANG Z,et al.Attentionxml:Label tree-based attention-aware deep model for high-performance extreme multi-label text classification[C]//Advances in Neural Information Processing Systems 32:Annual Conference on Neural Information Processing Systems.2019:5812-5822.
[9]ZHANG W,YAN J,WANG X,et al.Deep extreme multi-label learning[C]//Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval.2018:100-107.
[10]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[J].arXiv:1810.04805,2018.
[11]HUANG Y,GILEDERELI B,KÖKSAL A,et al.Balancing methods for multi-label text classification with long-tailed class distribution[J].arXiv:2109.04712,2021.
[12]CHANG W C,YU H F,ZHONG K,et al.Taming pretrained transformers for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2020:3163-3171.
[13]GONG J,TENG Z,TENG Q,et al.Hierarchical graph transformer-based deep learning model for large-scale multi-label text classification[J].IEEE Access,2020,8:30885-30896.
[14]MA Y L,LIU X F,ZHAO L J,et al.Hybrid embedding-based text representation for hierarchical multi-label text classification[J].Expert Systems with Applications,2022,187:115905.
[15]TANG D,QIN B,LIU T.Learning semantic representations of users and products for document level sentiment classification[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing(volume 1:long papers).2015:1014-1023.
[16]KIM J,AMPLAYO R K,LEE K,et al.Categorical metadata representation for customized text classification[J].Transactions of the Association for Computational Linguistics,2019,7:201-215.
[17]ZHANG Y,SHEN Z,WU C H,et al.Metadata-induced contrastive learning for zero-shot multi-label text classification[C]//Proceedings of the ACM Web Conference 2022.2022.
[18]YANG P,SUN X,LI W,et al.SGM:sequence generation model for multi-label classification[J].arXiv:1806.04822,2018.
[19]WANG J,CHEN Z,LI H,et al.Hierarchical multi-label classification using incremental hypernetwork[J].Journal of Chongqing University of Posts & Telecommunications(Natural Science Edition),2019,31(4):12.
[20]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.2017:6000-6010.
[21]JIANG X,JIA T,FANG Y,et al.Pre-training on large-scale heterogeneous graph[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining.2021:756-766.
[22]OORD A,LI Y,VINYALS O.Representation learning with contrastive predictive coding[J].arXiv:1807.03748,2018.
[23]BA J L,KIROS J R,HINTON G E.Layer normalization[J].arXiv:1607.06450,2016.
[24]CUI Y,JIA M,LIN T Y,et al.Class-balanced loss based on effective number of samples[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:9268-9277.
[25]WU T,HUANG Q,LIU Z,et al.Distribution-balanced loss for multi-label classification in long-tailed datasets[C]//Computer Vision-ECCV 2020:16th European Conference.Springer International Publishing,2020:162-178.
[26]LU Z Y.PubMed and beyond:a survey of web tools for searching biomedical literature[J/OL].https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3025693/pdf/baq036.pdf.
[27]XUN G,JHA K,YUAN Y,et al.MeSHProbeNet:a self-attentive probe net for MeSH indexing[J].Bioinformatics,2019,35(19):3794-3802.
[28]GUO Q,QIU X,LIU P,et al.Star-transformer[J].arXiv:1902.09113,2019.
[29]XUN G,JHA K,SUN J,et al.Correlation networks for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2020:1074-1082.
[30]PENNINGTON J,SOCHER R,MANNING C D.Glove:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing(EMNLP).2014:1532-1543.