Computer Science ›› 2026, Vol. 53 ›› Issue (2): 312-321. doi: 10.11896/jsjkx.250300038

• Artificial Intelligence •

  • Corresponding author: GAO Shengxiang (gaoshengxiang.yn@foxmail.com)
  • First author: CHEN Lin (chenlin0@stu.kust.edu.cn)

Industrial Text Classification for Chinese and Vietnamese Based on Prompt Learning and Adaptive Loss Weighting

CHEN Lin, MA Longxuan, ZHANG Yongbing, HUANG Yuxin, GAO Shengxiang, YU Zhengtao   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
     Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
  • Received: 2025-03-10  Revised: 2025-05-27  Online: 2026-02-10
  • About author: CHEN Lin, born in 2000, postgraduate. His main research interests include natural language processing and cross-border industrial big data analysis.
    GAO Shengxiang, born in 1977, Ph.D, professor, is a member of CCF (No.38040M). Her main research interests include natural language processing, information retrieval and machine translation.
  • Supported by:
    National Natural Science Foundation of China (U23A20388, U21B2027), Yunnan Provincial Key Research and Development Program (202303AP140008, 202402AG050007, 202302AD080003), Yunnan Provincial Basic Research Project (202301AT070393) and Double First-Class Science and Technology Major Project of Kunming University of Science and Technology (202402AG050007).



Abstract: Cross-border industrial text classification is a fundamental task that supports big data analysis in cross-border industries. With the rapid growth of cross-border industrial data in Southeast Asia, there is an increasing demand for the analysis and processing of industrial data, particularly with respect to industrial text classification. However, cross-border industrial text classification faces several challenges, including linguistic differences across languages, data imbalance among languages, and the scarcity of annotated data. These issues are particularly pronounced in low-resource languages, making cross-border industrial data classification more difficult. To address this issue, this paper proposes a few-shot cross-border industrial text classification method based on prompt learning, combined with an adaptive loss weighting strategy, which significantly enhances the model's classification performance in cross-border scenarios. Specifically, the proposed model mitigates the issue of data scarcity within the prompt-learning framework by leveraging the prior knowledge of pre-trained models to enhance few-shot learning capabilities. Furthermore, cross-lingual text pairs are constructed to facilitate knowledge transfer and semantic alignment in semantic space. Additionally, an innovative dynamic hybrid loss function is designed, integrating cross-entropy loss, focal loss, and label smoothing loss in a multi-objective optimization framework. The loss terms are dynamically weighted based on an uncertainty-based weighting mechanism: cross-entropy loss ensures fundamental classification capability, focal loss enhances the focus on hard-to-classify samples, and label smoothing effectively mitigates the risk of overfitting. Experimental results demonstrate that the proposed method significantly outperforms existing mainstream approaches in cross-border Chinese and Vietnamese industrial text classification tasks, particularly in few-shot learning scenarios with data scarcity and language imbalance. This approach provides an efficient solution and offers new research perspectives for processing low-resource languages.
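The uncertainty-weighted combination of the three loss terms described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the learnable log-variance weighting follows the standard homoscedastic-uncertainty scheme (Kendall et al. style), and the hyperparameters `gamma` and `smoothing` are assumed defaults.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveHybridLoss(nn.Module):
    """Sketch: cross-entropy + focal + label-smoothing losses combined
    with uncertainty-based (learnable log-variance) weighting."""

    def __init__(self, num_losses: int = 3, gamma: float = 2.0,
                 smoothing: float = 0.1):
        super().__init__()
        self.gamma = gamma
        self.smoothing = smoothing
        # One log-variance s_i per loss term, learned jointly with the model.
        self.log_vars = nn.Parameter(torch.zeros(num_losses))

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # 1) Plain cross-entropy: baseline classification signal.
        ce = F.cross_entropy(logits, target)

        # 2) Focal loss: down-weight easy examples by (1 - p_t)^gamma.
        logp = F.log_softmax(logits, dim=-1)
        logp_t = logp.gather(1, target.unsqueeze(1)).squeeze(1)
        focal = (-(1.0 - logp_t.exp()) ** self.gamma * logp_t).mean()

        # 3) Label smoothing: soften targets to reduce overfitting.
        ls = F.cross_entropy(logits, target, label_smoothing=self.smoothing)

        # Total: sum_i exp(-s_i) * L_i + s_i  (uncertainty weighting).
        losses = torch.stack([ce, focal, ls])
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()
```

With `log_vars` initialized to zero the total starts as the plain sum of the three losses; during training the gradients shift weight toward the terms the model is more certain about, which is what "adaptive loss weighting" refers to here.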

Key words: Cross-border industrial text classification, Few-shot learning, Prompt learning, Adaptive loss weighting
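The prompt-learning setup mentioned above can be illustrated with a generic verbalizer-scoring step: the input text is wrapped in a cloze template, a masked language model produces logits at the mask position, and each class is scored via its label words. The template wording and token ids below are hypothetical; the abstract does not specify the paper's actual template or verbalizer.

```python
import torch

def classify_with_prompt(mask_logits: torch.Tensor,
                         verbalizer: dict) -> str:
    """Score each class by the masked-LM logits of its verbalizer
    tokens at the [MASK] position and return the best label.

    mask_logits: vocabulary logits at the mask position, e.g. from a
    multilingual masked LM given a cloze template such as
    "{text} The industry category of this text is [MASK]."
    verbalizer: maps class label -> list of label-word token ids
    (ids here are hypothetical placeholders).
    """
    scores = {
        label: mask_logits[token_ids].mean().item()
        for label, token_ids in verbalizer.items()
    }
    return max(scores, key=scores.get)
```

Averaging over several label words per class is one common way to make the verbalizer robust across languages, which matters for the Chinese/Vietnamese setting the paper targets.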

CLC Number: TP391