Computer Science ›› 2026, Vol. 53 ›› Issue (6A): 250400092-7. doi: 10.11896/jsjkx.250400092

• Artificial Intelligence •

Exploring the Generalization Ability of Prompt-based Large Language Models for Text Classification

XU Rui, LIU Jin, LIU Xudong, GUAN Jian, DONG Wei   

  1. China Information Security Research Academy Co., Ltd., Beijing 102209, China
    National Computer System Engineering Research Institute of China, Beijing 100083, China
  • Online: 2026-06-16 Published: 2026-06-12
  • About author: XU Rui, born in 1990, Ph.D., senior engineer. His main research interests include information content security, natural language processing, and data governance.
    LIU Jin, born in 1990, master, senior engineer. Her main research interests include cybersecurity, data analysis and processing.

Abstract: Large language models (LLMs) have advanced text classification, yet prompt-based performance varies across models, tasks and languages. This study investigates how model size, task type and category semantics shape prompt generalization. It evaluates three model families (DeepSeek, Qwen and GPT-4o) on AG News, THUCNews, IMDb and ChnSentiCorp under Zero-Shot and 1/3/5-Shot settings. The results show that: 1) larger models deliver more stable Few-Shot gains; 2) sentiment analysis benefits from Few-Shot prompts, whereas news classification favors Zero-Shot prompts unless high-quality examples are provided; and 3) category representativeness and separability largely determine prompt efficacy. Based on these insights, this study distils a four-step decision workflow and a semantics-aware guideline for prompt design, offering practical advice for deploying LLMs in real-world classification.
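
The Zero-Shot and 1/3/5-Shot settings named in the abstract can be illustrated with a minimal sketch of how such prompts are commonly assembled. This is not the prompt template used in the study: the function name `build_prompt`, the instruction wording, and the `LABELS` and `DEMOS` values are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical label set and demonstrations for a sentiment task (illustrative only).
LABELS = ["positive", "negative"]
DEMOS = [
    ("A moving and beautifully acted film.", "positive"),
    ("Two hours I will never get back.", "negative"),
    ("The soundtrack alone is worth the ticket.", "positive"),
    ("Flat characters and a predictable plot.", "negative"),
    ("I left the cinema smiling.", "positive"),
]


def build_prompt(text: str, labels: Sequence[str],
                 examples: Sequence[Tuple[str, str]] = (), k: int = 0) -> str:
    """Build a Zero-Shot (k=0) or k-Shot classification prompt as plain text."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of examples")
    # Instruction line listing the candidate categories.
    blocks = ["Classify the text into one of: " + ", ".join(labels) + "."]
    # The first k labelled demonstrations; none are added when k == 0.
    for ex_text, ex_label in examples[:k]:
        blocks.append(f"Text: {ex_text}\nLabel: {ex_label}")
    # The query, with the label left open for the model to complete.
    blocks.append(f"Text: {text}\nLabel:")
    return "\n\n".join(blocks)


# One prompt per setting: Zero-Shot, 1-Shot, 3-Shot and 5-Shot.
prompts = {k: build_prompt("Great movie.", LABELS, DEMOS, k) for k in (0, 1, 3, 5)}
```

Each resulting string would be sent to the model under test. Which demonstrations are selected matters, since the abstract reports that example quality and category separability largely determine whether Few-Shot prompts help.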

Key words: Large language models, Prompts, Few-shot learning, Text classification, Generalization ability

CLC Number: 

  • TP391