Computer Science ›› 2023, Vol. 50 ›› Issue (12): 270-278. doi: 10.11896/jsjkx.230300239

• Artificial Intelligence •

SemFA: Extreme Multi-label Text Classification Model Based on Semantic Features and Association Attention

WANG Zhendong, DONG Kaikun, HUANG Junheng, WANG Bailing   

  1. School of Computer Science and Technology, Harbin Institute of Technology (Weihai), Weihai, Shandong 264209, China
    Research Institute of Cyberspace Security, Harbin Institute of Technology (Weihai), Weihai, Shandong 264209, China
  • Received: 2023-03-31 Revised: 2023-08-25 Online: 2023-12-15 Published: 2023-12-07
  • About author: WANG Zhendong, born in 2000, postgraduate, is a member of China Computer Federation. His main research interests include artificial intelligence, natural language processing and financial security.
    WANG Bailing, born in 1978, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include industrial Internet security, information security and financial security.
  • Supported by:
    National Natural Science Foundation of China (62272129), Fundamental Research Funds for the Central Universities of Ministry of Education of China (HIT.NSRIF.2020098) and National Key R&D Program of China (2020YFB2009502).

Abstract: Extreme multi-label text classification (XMTC) is a challenging task that involves finding the most relevant labels for a given text sample from a large and complex label set. Deep learning methods based on the Transformer model have recently achieved great success in XMTC. However, existing methods do not fully exploit the advantages of the Transformer: they ignore the subtle local semantic information of texts at different granularities, and they fail to robustly establish and use the potential associations between labels and texts. To address these issues, this paper proposes SemFA, an extreme multi-label text classification model based on semantic features and association attention. In SemFA, the top-level outputs of multiple encoders are first concatenated as global features. Then, a convolutional neural network is used to extract local features from the shallow vectors of multiple encoders. By combining the rich global information with the subtle local information at different granularities, more accurate and comprehensive semantic features are obtained. Finally, an association-attention mechanism establishes the potential association between label features and text features, and an association loss is introduced to continuously optimize the model. Experimental results on the Eurlex-4K and Wiki10-31K public datasets show that SemFA outperforms most existing XMTC models, effectively integrating semantic features and association attention to improve overall classification performance.
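
The pipeline described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the abstract gives no dimensions, kernel sizes, or attention formula, so everything below is an assumption made for illustration. In particular, the helper names (`concat_global`, `conv1d_maxpool`, `local_features`, `association_attention`) are hypothetical, the random vectors stand in for real encoder outputs, a plain scaled dot-product attention stands in for the association-attention mechanism, and the association loss is omitted.

```python
import math
import random

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random stand-in for learned weights or encoder outputs (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def concat_global(summary_vectors):
    """Concatenate one top-level summary vector per encoder into a global feature."""
    return [x for vec in summary_vectors for x in vec]

def conv1d_maxpool(tokens, kernel):
    """Slide a (width x dim) kernel over token vectors, apply ReLU, max-pool over positions."""
    width = len(kernel)
    scores = []
    for start in range(len(tokens) - width + 1):
        s = 0.0
        for k in range(width):
            s += sum(w * t for w, t in zip(kernel[k], tokens[start + k]))
        scores.append(max(0.0, s))
    return max(scores)

def local_features(shallow_layers, kernels):
    """One pooled value per (shallow layer, kernel) pair; kernel widths act as granularities."""
    return [conv1d_maxpool(tokens, kernel)
            for tokens in shallow_layers for kernel in kernels]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def association_attention(label_vecs, text_vec):
    """Generic scaled dot-product attention of labels over the text feature (stand-in)."""
    scale = math.sqrt(len(text_vec))
    scores = [sum(l * t for l, t in zip(label, text_vec)) / scale
              for label in label_vecs]
    return softmax(scores)

# Toy sizes, all assumed: hidden size 4, 3 encoders, 2 shallow layers of 5 tokens, 3 labels.
dim = 4
top_outputs = rand_matrix(3, dim)                       # one summary vector per encoder
shallow_layers = [rand_matrix(5, dim) for _ in range(2)]
kernels = [rand_matrix(2, dim), rand_matrix(3, dim)]    # kernel widths 2 and 3

global_feat = concat_global(top_outputs)                # length 3 * 4 = 12
local_feat = local_features(shallow_layers, kernels)    # length 2 * 2 = 4
semantic = global_feat + local_feat                     # combined semantic feature, length 16

label_vecs = rand_matrix(3, len(semantic))              # hypothetical label features
weights = association_attention(label_vecs, semantic)   # one weight per label, sums to 1
```

Using several kernel widths is one common way to capture local information at different granularities; whether SemFA uses this exact arrangement is not stated in the abstract.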

Key words: Natural Language Processing, Extreme multi-label text classification, Semantic features, Pre-trained models, Attention mechanisms

CLC Number: 

  • TP391