Computer Science ›› 2026, Vol. 53 ›› Issue (2): 312-321. doi: 10.11896/jsjkx.250300038
CHEN Lin, MA Longxuan, ZHANG Yongbing, HUANG Yuxin, GAO Shengxiang, YU Zhengtao
Abstract: Cross-border industry text classification is a fundamental task supporting big-data analysis of cross-border industries. With the rapid growth of cross-border industry data in Southeast Asia, the demand for analyzing and processing such data, and for industry text classification in particular, is steadily increasing. However, current cross-border industry texts suffer from linguistic differences across languages, data imbalance between languages, and scarcity of annotated data; these problems are especially pronounced for low-resource languages and make cross-border industry data classification considerably harder. To address this, a few-shot cross-border industry text classification method based on prompt learning and an adaptive loss-weighting strategy is proposed, which significantly improves classification performance in cross-border scenarios. Specifically, the model alleviates data scarcity through a prompt-learning framework, leveraging the prior knowledge of pre-trained models to strengthen few-shot learning. It then constructs cross-lingual text pairs to achieve knowledge transfer and semantic alignment in a shared semantic space. In addition, it introduces a dynamic mixed loss function that jointly optimizes cross-entropy loss, focal loss, and label-smoothing loss, with the weight of each term adjusted dynamically via an uncertainty-based weighting mechanism: cross-entropy loss secures basic classification ability, focal loss sharpens attention on hard-to-classify samples, and label smoothing effectively suppresses overfitting. Experimental results show that the proposed method significantly outperforms mainstream approaches on Chinese and Vietnamese industry text classification tasks, particularly in few-shot settings with scarce data and language imbalance, offering an efficient solution and a new research direction for low-resource language processing.
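The dynamic mixed loss described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it combines cross-entropy, focal, and label-smoothing terms using the homoscedastic-uncertainty weighting of reference [37] (one learnable log-variance per term, so the weight exp(-s_i) is learned jointly with the model); the class name `DynamicMixedLoss`, the focal parameter `gamma`, and the smoothing value are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicMixedLoss(nn.Module):
    """Uncertainty-weighted mix of cross-entropy, focal, and
    label-smoothing losses: L = sum_i exp(-s_i) * L_i + s_i."""

    def __init__(self, gamma: float = 2.0, smoothing: float = 0.1):
        super().__init__()
        self.gamma = gamma
        self.smoothing = smoothing
        # One learnable log-variance per loss term; exp(-s_i) is its weight.
        self.log_vars = nn.Parameter(torch.zeros(3))

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Plain cross-entropy: preserves basic classification ability.
        ce = F.cross_entropy(logits, target)
        # Focal loss: down-weights easy samples via the (1 - p_t)^gamma factor.
        p_t = F.softmax(logits, dim=-1).gather(1, target.unsqueeze(1)).squeeze(1)
        focal = ((1.0 - p_t) ** self.gamma
                 * F.cross_entropy(logits, target, reduction="none")).mean()
        # Label smoothing: softens one-hot targets to curb overfitting.
        ls = F.cross_entropy(logits, target, label_smoothing=self.smoothing)
        losses = torch.stack([ce, focal, ls])
        # exp(-s_i) * L_i + s_i: the additive s_i term keeps weights from
        # collapsing to zero during optimization.
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()
```

The `log_vars` parameters are optimized together with the classifier's parameters, so the relative importance of the three objectives adapts during training rather than being fixed by hand.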