Computer Science ›› 2024, Vol. 51 ›› Issue (1): 143-149.doi: 10.11896/jsjkx.230600079

• Database & Big Data & Data Science •

Pre-training of Heterogeneous Graph Neural Networks for Multi-label Document Classification

WU Jiawei1, FANG Quan2, HU Jun2, QIAN Shengsheng2   

1 Henan Institute of Advanced Technology,Zhengzhou University,Zhengzhou 450002,China
    2 National Laboratory of Pattern Recognition,Institute of Automation,Chinese Academy of Sciences,Beijing 100190,China
  • Received:2023-06-08 Revised:2023-10-09 Online:2024-01-15 Published:2024-01-12
• About author:WU Jiawei,born in 1998,postgraduate.His main research interests include multi-label classification,graph neural networks and knowledge graphs.
    FANG Quan,born in 1988,associate professor.His main research interest is multimedia knowledge computing.
  • Supported by:
    National Natural Science Foundation of China(62072456,62036012,62106262).

Abstract: Multi-label document classification aims to associate document instances with their relevant labels, and has received increasing research attention in recent years. Existing multi-label document classification methods attempt to exploit information beyond the text itself, such as document metadata or label structure. However, these methods either use only the surface semantics of the metadata or disregard the long-tail distribution of labels, thereby ignoring the higher-order relationships between documents and their metadata as well as the distribution pattern of labels, which limits classification accuracy. Therefore, this paper proposes a new multi-label document classification method based on pre-training of heterogeneous graph neural networks. The method constructs a heterogeneous graph from documents and their metadata, adopts two contrastive pre-training tasks to capture the relationships between documents and their metadata, and mitigates the long-tail distribution of labels with a balanced loss function, thereby improving classification accuracy. Experimental results on the benchmark dataset show that the proposed method outperforms Transformer, BertXML and MATCH by 8%, 4.75% and 1.3%, respectively.
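The abstract's two technical ingredients, contrastive pre-training over document-metadata pairs (cf. InfoNCE [22]) and a loss that balances the long-tailed label distribution (cf. the class-balanced loss [24]), can be illustrated with a minimal PyTorch sketch. This is a hypothetical illustration of the cited techniques, not the paper's implementation; every function name, tensor shape and hyperparameter below is an assumption.

```python
# Hypothetical sketch, not the paper's code: an InfoNCE contrastive loss over
# document-metadata embedding pairs [22], plus a class-balanced BCE [24] that
# down-weights head labels under a long-tailed label distribution.
import torch
import torch.nn.functional as F

def info_nce(doc_emb, meta_emb, temperature=0.1):
    # Row i of doc_emb and row i of meta_emb form a positive pair
    # (a document and one of its metadata nodes); the other rows in
    # the batch act as in-batch negatives.
    doc = F.normalize(doc_emb, dim=-1)
    meta = F.normalize(meta_emb, dim=-1)
    logits = doc @ meta.t() / temperature              # (B, B) similarities
    targets = torch.arange(doc.size(0), device=doc.device)
    return F.cross_entropy(logits, targets)

def class_balanced_bce(logits, labels, samples_per_label, beta=0.999):
    # Effective number of samples per label: (1 - beta^n) / (1 - beta);
    # its inverse re-weights BCE so tail labels are not swamped.
    effective_num = 1.0 - torch.pow(beta, samples_per_label)
    weights = (1.0 - beta) / effective_num             # shape (L,)
    weights = weights / weights.sum() * weights.numel()
    return F.binary_cross_entropy_with_logits(
        logits, labels.float(), weight=weights)

# Toy usage: 4 documents, 16-dim node embeddings, 6 labels.
doc_emb, meta_emb = torch.randn(4, 16), torch.randn(4, 16)
label_logits = torch.randn(4, 6)
labels = torch.randint(0, 2, (4, 6))
counts = torch.tensor([500., 200., 50., 10., 5., 2.])  # long-tailed counts
loss = info_nce(doc_emb, meta_emb) + class_balanced_bce(label_logits, labels, counts)
```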

Key words: Multi-label document classification, Metadata, Heterogeneous graph neural network, Pre-training, Long-tail distribution

CLC Number: TP391
[1]DONG Y,MA H,SHEN Z,et al.A century of science:Globalization of scientific collaborations,citations,and innovations[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:1437-1446.
[2]WANG K,SHEN Z,HUANG C,et al.Microsoft academic graph:When experts are not enough[J].Quantitative Science Studies,2020,1(1):396-413.
[3]MINAEE S,KALCHBRENNER N,CAMBRIA E,et al.Deep learning-based text classification:a comprehensive review[J].ACM Computing Surveys(CSUR),2021,54(3):1-40.
[4]ZHANG Y,SHEN Z,DONG Y,et al.MATCH:Metadata-aware text classification in a large hierarchy[C]//Proceedings of the Web Conference 2021.2021:3246-3257.
[5]AGGARWAL C C,ZHAI C X.A survey of text classification algorithms[M]//Mining text data.2012:163-222.
[6]HAO C,QIU H P,SUN Y,et al.Research Progress of Multi-label Text Classification[J].Computer Engineering and Applications,2021,57(10):48-56.
[7]LIU J,CHANG W C,WU Y,et al.Deep learning for extreme multi-label text classification[C]//Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.2017:115-124.
[8]YOU R,ZHANG Z,WANG Z,et al.Attentionxml:Label tree-based attention-aware deep model for high-performance extreme multi-label text classification[C]//Advances in Neural Information Processing Systems 32:Annual Conference on Neural Information Processing Systems.2019:5812-5822.
[9]ZHANG W,YAN J,WANG X,et al.Deep extreme multi-label learning[C]//Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval.2018:100-107.
[10]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[J].arXiv:1810.04805,2018.
[11]HUANG Y,GILEDERELI B,KÖKSAL A,et al.Balancing methods for multi-label text classification with long-tailed class distribution[J].arXiv:2109.04712,2021.
[12]CHANG W C,YU H F,ZHONG K,et al.Taming pretrained transformers for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2020:3163-3171.
[13]GONG J,TENG Z,TENG Q,et al.Hierarchical graph transformer-based deep learning model for large-scale multi-label text classification[J].IEEE Access,2020,8:30885-30896.
[14]MA Y L,LIU X F,ZHAO L J,et al.Hybrid embedding-based text representation for hierarchical multi-label text classification[J].Expert Systems with Applications,2022,187:115905.
[15]TANG D,QIN B,LIU T.Learning semantic representations of users and products for document level sentiment classification[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing(volume 1:long papers).2015:1014-1023.
[16]KIM J,AMPLAYO R K,LEE K,et al.Categorical metadata representation for customized text classification[J].Transactions of the Association for Computational Linguistics,2019,7:201-215.
[17]ZHANG Y,SHEN Z,WU C H,et al.Metadata-induced contrastive learning for zero-shot multi-label text classification[C]//Proceedings of the ACM Web Conference 2022.2022.
[18]YANG P,SUN X,LI W,et al.SGM:sequence generation model for multi-label classification[J].arXiv:1806.04822,2018.
[19]WANG J,CHEN Z,LI H,et al.Hierarchical multi-label classification using incremental hypernetwork[J].Journal of Chongqing University of Posts & Telecommunications(Natural Science Edition),2019,31(4):12.
[20]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.2017:6000-6010.
[21]JIANG X,JIA T,FANG Y,et al.Pre-training on large-scale heterogeneous graph[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining.2021:756-766.
[22]OORD A,LI Y,VINYALS O.Representation learning with contrastive predictive coding[J].arXiv:1807.03748,2018.
[23]BA J L,KIROS J R,HINTON G E.Layer normalization[J].arXiv:1607.06450,2016.
[24]CUI Y,JIA M,LIN T Y,et al.Class-balanced loss based on effective number of samples[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:9268-9277.
[25]WU T,HUANG Q,LIU Z,et al.Distribution-balanced loss for multi-label classification in long-tailed datasets[C]//Computer Vision-ECCV 2020:16th European Conference.Springer International Publishing,2020:162-178.
[26]LU Z Y.PubMed and beyond:a survey of web tools for searching biomedical literature[J/OL].https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3025693/pdf/baq036.pdf.
[27]XUN G,JHA K,YUAN Y,et al.MeSHProbeNet:a self-attentive probe net for MeSH indexing[J].Bioinformatics,2019,35(19):3794-3802.
[28]GUO Q,QIU X,LIU P,et al.Star-transformer[J].arXiv:1902.09113,2019.
[29]XUN G,JHA K,SUN J,et al.Correlation networks for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2020:1074-1082.
[30]PENNINGTON J,SOCHER R,MANNING C D.Glove:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing(EMNLP).2014:1532-1543.