Computer Science, 2024, Vol. 51, Issue (1): 143-149. doi: 10.11896/jsjkx.230600079
WU Jiawei1, FANG Quan2, HU Jun2, QIAN Shengsheng2
Abstract: Multi-label document classification, which associates a document instance with its relevant labels, has attracted increasing attention from researchers in recent years. Existing methods attempt to fuse information beyond the text itself, such as document metadata or label structure. However, these methods either exploit only the shallow semantics of the metadata or ignore the long-tailed distribution of the labels; they therefore miss information such as the higher-order relationships between documents and their metadata and the distributional regularities of the labels, which degrades classification accuracy. This paper proposes a new multi-label document classification method based on heterogeneous graph neural network pre-training. The method constructs a heterogeneous graph over documents and their metadata, uses two contrastive-learning pre-training tasks to capture the relationships between documents and their metadata, and improves classification accuracy with a loss function that balances the long-tailed label distribution. Experimental results on benchmark datasets show that the accuracy of the proposed method is 8% higher than Transformer, 4.75% higher than BertXML, and 1.3% higher than MATCH.
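To make the two ingredients named in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: an InfoNCE-style contrastive objective (in the spirit of ref. [22]) between document and metadata embeddings, and a class-balanced binary cross-entropy that re-weights labels by the "effective number of samples" (following ref. [24]). All names and hyperparameters here (info_nce_loss, class_balanced_bce, temperature, beta) are hypothetical and chosen for illustration.

```python
# Illustrative sketch only; the paper's exact objectives may differ.
import torch
import torch.nn.functional as F


def info_nce_loss(doc_emb: torch.Tensor, meta_emb: torch.Tensor,
                  temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE contrastive loss: the i-th document forms a positive pair
    with the i-th metadata node; other in-batch pairs are negatives."""
    doc = F.normalize(doc_emb, dim=-1)
    meta = F.normalize(meta_emb, dim=-1)
    logits = doc @ meta.t() / temperature              # (B, B) similarities
    targets = torch.arange(doc.size(0), device=doc.device)
    return F.cross_entropy(logits, targets)


def class_balanced_bce(logits: torch.Tensor, labels: torch.Tensor,
                       label_counts: torch.Tensor,
                       beta: float = 0.999) -> torch.Tensor:
    """Binary cross-entropy re-weighted per label by the effective number
    of samples (Cui et al., ref. [24]) so rare tail labels count more."""
    effective_num = 1.0 - torch.pow(beta, label_counts.float())
    weights = (1.0 - beta) / effective_num             # (num_labels,)
    weights = weights / weights.sum() * weights.numel()  # mean weight = 1
    return F.binary_cross_entropy_with_logits(
        logits, labels.float(), weight=weights)        # broadcasts over batch
```

In a full pipeline, doc_emb and meta_emb would come from the heterogeneous-graph encoder, and the contrastive pre-training loss and the balanced classification loss would be combined with a weighting coefficient tuned on a validation set.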
References:
[1] DONG Y, MA H, SHEN Z, et al. A century of science: Globalization of scientific collaborations, citations, and innovations[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017: 1437-1446.
[2] WANG K, SHEN Z, HUANG C, et al. Microsoft academic graph: When experts are not enough[J]. Quantitative Science Studies, 2020, 1(1): 396-413.
[3] MINAEE S, KALCHBRENNER N, CAMBRIA E, et al. Deep learning-based text classification: a comprehensive review[J]. ACM Computing Surveys (CSUR), 2021, 54(3): 1-40.
[4] ZHANG Y, SHEN Z, DONG Y, et al. MATCH: Metadata-aware text classification in a large hierarchy[C]//Proceedings of the Web Conference 2021. 2021: 3246-3257.
[5] AGGARWAL C C, ZHAI C X. A survey of text classification algorithms[M]//Mining Text Data. 2012: 163-222.
[6] HAO C, QIU H P, SUN Y, et al. Research progress of multi-label text classification[J]. Computer Engineering and Applications, 2021, 57(10): 48-56.
[7] LIU J, CHANG W C, WU Y, et al. Deep learning for extreme multi-label text classification[C]//Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2017: 115-124.
[8] YOU R, ZHANG Z, WANG Z, et al. AttentionXML: Label tree-based attention-aware deep model for high-performance extreme multi-label text classification[C]//Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems. 2019: 5812-5822.
[9] ZHANG W, YAN J, WANG X, et al. Deep extreme multi-label learning[C]//Proceedings of the 2018 ACM International Conference on Multimedia Retrieval. 2018: 100-107.
[10] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv:1810.04805, 2018.
[11] HUANG Y, GILEDERELI B, KÖKSAL A, et al. Balancing methods for multi-label text classification with long-tailed class distribution[J]. arXiv:2109.04712, 2021.
[12] CHANG W C, YU H F, ZHONG K, et al. Taming pretrained transformers for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020: 3163-3171.
[13] GONG J, TENG Z, TENG Q, et al. Hierarchical graph transformer-based deep learning model for large-scale multi-label text classification[J]. IEEE Access, 2020, 8: 30885-30896.
[14] MA Y L, LIU X F, ZHAO L J, et al. Hybrid embedding-based text representation for hierarchical multi-label text classification[J]. Expert Systems with Applications, 2022, 187: 115905.
[15] TANG D, QIN B, LIU T. Learning semantic representations of users and products for document level sentiment classification[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015: 1014-1023.
[16] KIM J, AMPLAYO R K, LEE K, et al. Categorical metadata representation for customized text classification[J]. Transactions of the Association for Computational Linguistics, 2019, 7: 201-215.
[17] ZHANG Y, SHEN Z, WU C H, et al. Metadata-induced contrastive learning for zero-shot multi-label text classification[C]//Proceedings of the ACM Web Conference 2022. 2022.
[18] YANG P, SUN X, LI W, et al. SGM: Sequence generation model for multi-label classification[J]. arXiv:1806.04822, 2018.
[19] WANG J, CHEN Z, LI H, et al. Hierarchical multi-label classification using incremental hypernetwork[J]. Journal of Chongqing University of Posts & Telecommunications (Natural Science Edition), 2019, 31(4): 12.
[20] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.
[21] JIANG X, JIA T, FANG Y, et al. Pre-training on large-scale heterogeneous graph[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021: 756-766.
[22] OORD A, LI Y, VINYALS O. Representation learning with contrastive predictive coding[J]. arXiv:1807.03748, 2018.
[23] BA J L, KIROS J R, HINTON G E. Layer normalization[J]. arXiv:1607.06450, 2016.
[24] CUI Y, JIA M, LIN T Y, et al. Class-balanced loss based on effective number of samples[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 9268-9277.
[25] WU T, HUANG Q, LIU Z, et al. Distribution-balanced loss for multi-label classification in long-tailed datasets[C]//Computer Vision - ECCV 2020: 16th European Conference. Springer International Publishing, 2020: 162-178.
[26] LU Z Y. PubMed and beyond: a survey of web tools for searching biomedical literature[J/OL]. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3025693/pdf/baq036.pdf.
[27] XUN G, JHA K, YUAN Y, et al. MeSHProbeNet: a self-attentive probe net for MeSH indexing[J]. Bioinformatics, 2019, 35(19): 3794-3802.
[28] GUO Q, QIU X, LIU P, et al. Star-Transformer[J]. arXiv:1902.09113, 2019.
[29] XUN G, JHA K, SUN J, et al. Correlation networks for extreme multi-label text classification[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020: 1074-1082.
[30] PENNINGTON J, SOCHER R, MANNING C D. GloVe: Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014: 1532-1543.