Exploring the Generalization Ability of Prompt-based Large Language Models for Text Classification

doi:10.11896/jsjkx.250400092

Abstract: LLMs have advanced text classification, yet prompt-based performance varies across models, tasks and languages. This study investigates how model size, task type and category semantics shape prompt generalization. It evaluates three model families (DeepSeek, Qwen and GPT-4o) on AG News, THUCNews, IMDb and ChnSentiCorp under Zero-Shot and 1/3/5-Shot settings. The results show that larger models deliver more stable Few-Shot gains; that sentiment analysis benefits from Few-Shot prompts, whereas news classification prefers Zero-Shot unless high-quality examples are provided; and that category representativeness and separability largely determine Prompt efficacy. Based on these insights, this study distils a four-step decision workflow and a semantics-aware guideline for Prompt design, offering practical advice for deploying LLMs in real-world classification.

Key words: Large language models, Prompts, Few-shot learning, Text classification, Generalization ability

CLC Number: TP391

XU Rui, LIU Jin, LIU Xudong, GUAN Jian, DONG Wei. Exploring the Generalization Ability of Prompt-based Large Language Models for Text Classification[J]. Computer Science, 2026, 53(6A): 250400092-7.
