Computer Science ›› 2025, Vol. 52 ›› Issue (4): 240-248.doi: 10.11896/jsjkx.240900008

• Artificial Intelligence •

Automatic Optimization and Evaluation of Prompt Fairness Based on Large Language Model Itself

ZHU Shucheng1, HUO Hongying2, WANG Weikang3, LIU Ying1, LIU Pengyuan2,4   

  1 School of Humanities,Tsinghua University,Beijing 100084,China
    2 College of Information Science,Beijing Language and Culture University,Beijing 100083,China
    3 School of Information Management and Engineering,Shanghai University of Finance and Economics,Shanghai 200433,China
    4 Language Resources Monitoring and Research Center Print Media Language Branch,Beijing Language and Culture University,Beijing 100083,China
  • Received:2024-08-31 Revised:2025-02-05 Online:2025-04-15 Published:2025-04-14
  • About author:ZHU Shucheng,born in 1994,Ph.D candidate,is a member of CCF(No.H9600G).His main research interests include computational linguistics and sociolinguistics.
    LIU Ying,born in 1969,Ph.D,professor,Ph.D supervisor.Her main research interests include computational linguistics and so on.
  • Supported by:
    2018 National Major Program of Philosophy and Social Science Fund(18ZDA238) and CCF-Baidu Open Fund (CCF-BAIDU202323).

Abstract: With the rapid development of large language models,the issue of model fairness has garnered increasing attention,primarily focusing on biases in generated text and downstream tasks.To produce fairer text,careful design and examination of the fairness of prompts are necessary.This study employs four Chinese large language models as optimizers to automatically and iteratively generate fair prompts that describe both advantaged and disadvantaged groups.Additionally,it investigates the impact of variables such as model temperature,initial prompt types,and optimization directions on the optimization process,while assessing the fairness of various prompt styles,including chain-of-thought and persona.The results indicate that large language models can effectively generate prompts that are either less biased or more biased,with prompts for advantaged groups performing better at lower temperature settings.Generating biased prompts is relatively more challenging,with the models employing anti-adversarial strategies to tackle this task.Using questions as initial prompts can yield outputs that are more random yet of higher quality.Different models exhibit distinct optimization strategies,with chain-of-thought and debiasing styles producing fairer text.Prompts play a crucial role in model fairness and warrant further investigation.
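The abstract describes an iterative self-optimization loop in which a large language model acts as its own prompt optimizer and a fairness metric scores the text each candidate prompt produces. The sketch below is only a schematic reconstruction of such a loop under stated assumptions; the names call_llm and fairness_score are hypothetical placeholders for the paper's actual generator, optimizer meta-prompts, and bias scoring, which are not reproduced here.

```python
# Minimal sketch of an LLM-as-optimizer loop for prompt fairness.
# All callables are hypothetical stand-ins, not the paper's implementation.
from typing import Callable, List, Tuple

def optimize_prompt(
    seed_prompt: str,
    call_llm: Callable[[str, float], str],    # (prompt, temperature) -> generated text
    fairness_score: Callable[[str], float],   # higher score = fairer generated text
    temperature: float = 0.7,
    iterations: int = 10,
) -> Tuple[str, float]:
    """Iteratively ask the optimizer model to rewrite the prompt,
    keeping the candidate whose generated text scores as fairest."""
    best_prompt = seed_prompt
    best_score = fairness_score(call_llm(seed_prompt, temperature))
    history: List[Tuple[str, float]] = [(best_prompt, best_score)]

    for _ in range(iterations):
        # Show the optimizer model its own trajectory and ask for a fairer prompt.
        meta_prompt = (
            "Here are prompts and their fairness scores (higher is fairer):\n"
            + "\n".join(f"{p!r}: {s:.3f}" for p, s in history)
            + "\nPropose a new prompt describing the same group that would score higher."
        )
        candidate = call_llm(meta_prompt, temperature).strip()
        score = fairness_score(call_llm(candidate, temperature))
        history.append((candidate, score))
        if score > best_score:
            best_prompt, best_score = candidate, score

    return best_prompt, best_score
```

Reversing the comparison (keeping lower-scoring candidates) would correspond to the "more biased" optimization direction studied in the paper, and the temperature argument mirrors the temperature variable examined there.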

Key words: Large language model, Prompt, Fairness, Automatic evaluation, Self-optimization

CLC Number: TP391

[1]ZHOU X,ZHU H,MATHUR L,et al.SOTOPIA:Interactive Evaluation for Social Intelligence in Language Agents[C]//The Twelfth International Conference on Learning Representations.2024.
[2]ZHOU Y,MURESANU A I,HAN Z,et al.Large Language Models are Human-Level Prompt Engineers[C]//The Eleventh International Conference on Learning Representations.2023.
[3]PRYZANT R,ITER D,LI J,et al.Automatic Prompt Optimization with “Gradient Descent” and Beam Search[C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.2023:7957-7968.
[4]YANG C,WANG X,LU Y,et al.Large Language Models as Optimizers[C]//The Twelfth International Conference on Learning Representations.2024.
[5]KOJIMA T,GU S S,REID M,et al.Large language models are zero-shot reasoners[J].Advances in Neural Information Processing Systems,2022,35:22199-22213.
[6]SHAIKH O,ZHANG H,HELD W,et al.On Second Thought,Let’s Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2023:4454-4470.
[7]CALISKAN A,BRYSON J J,NARAYANAN A.Semantics derived automatically from language corpora contain human-like biases[J].Science,2017,356(6334):183-186.
[8]HADA R,SETH A,DIDDEE H,et al.“Fifty Shades of Bias”:Normative Ratings of Gender Bias in GPT Generated English Text[C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.2023:1862-1876.
[9]VENKIT P N,GAUTAM S,PANCHANADIKAR R,et al.Nationality Bias in Text Generation[C]//Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics.2023:116-122.
[10]CHENG M,DURMUS E,JURAFSKY D.Marked Personas:Using Natural Language Prompts to Measure Stereotypes in Language Models[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2023:1504-1532.
[11]ZHU S,WANG W,LIU Y.Quite Good,but Not Enough:Nationality Bias in Large Language Models-a Case Study of ChatGPT[C]//Proceedings of the 2024 Joint International Conference on Computational Linguistics,Language Resources and Evaluation(LREC-COLING 2024).2024:13489-13502.
[12]FENG S,PARK C Y,LIU Y,et al.From Pretraining Data to Language Models to Downstream Tasks:Tracking the Trails of Political Biases Leading to Unfair NLP Models[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2023:11737-11762.
[13]KIRK H R,JUN Y,VOLPIN F,et al.Bias out-of-the-box:An empirical analysis of intersectional occupational biases in popular generative language models[J].Advances in Neural Information Processing Systems,2021,34:2611-2624.
[14]GUO Q,WANG R,GUO J,et al.Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers[C]//The Twelfth International Conference on Learning Representations.2024.
[15]YE Q,AHMED M,PRYZANT R,et al.Prompt engineering a prompt engineer[J].arXiv:2311.05661,2023.
[16]HSIEH C J,SI S,YU F X,et al.Automatic engineering of long prompts[J].arXiv:2311.10117,2023.
[17]CHENG J,LIU X,ZHENG K,et al.Black-box prompt optimization:Aligning large language models without model training[J].arXiv:2311.04155,2023.
[18]WANG X,LI C,WANG Z,et al.PromptAgent:Strategic Planning with Language Models Enables Expert-level Prompt Optimization[C]//The Twelfth International Conference on Learning Representations.2024.
[19]YAO H,ZHANG R,YU L,et al.SEP:Self-Enhanced Prompt Tuning for Visual-Language Model[J].arXiv:2405.15549,2024.
[20]PENG K,DING L,ZHONG Q,et al.Towards Making the Most of ChatGPT for Machine Translation[C]//Findings of the Association for Computational Linguistics:EMNLP 2023.2023:5622-5633.
[21]SHEN X,CHEN Z,BACKES M,et al.In chatgpt we trust?Measuring and characterizing the reliability of chatgpt[J].arXiv:2304.08979,2023.
[22]BECK T,SCHUFF H,LAUSCHER A,et al.Sensitivity,performance,robustness:Deconstructing the effect of sociodemographic prompting[C]//Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics(Volume 1:Long Papers).2024:2589-2615.
[23]TAMKIN A,ASKELL A,LOVITT L,et al.Evaluating and mitigating discrimination in language model decisions[J].arXiv:2312.03689,2023.
[24]LI C,WANG J,ZHANG Y,et al.The Good,The Bad,and Why:Unveiling Emotions in Generative AI[C]//Forty-first International Conference on Machine Learning.2024.
[25]LI C,WANG J,ZHANG Y,et al.Large language models understand and can be enhanced by emotional stimuli[J].arXiv:2307.11760,2023.
[26]GEHMAN S,GURURANGAN S,SAP M,et al.RealToxicityPrompts:Evaluating Neural Toxic Degeneration in Language Models[C]//Findings of the Association for Computational Linguistics:EMNLP 2020.2020:3356-3369.
[27]LIU Y,YU J,SUN H,et al.Efficient Detection of Toxic Prompts in Large Language Models[J].arXiv:2408.11727,2024.
[28]GUPTA S,SHRIVASTAVA V,DESHPANDE A,et al.Bias Runs Deep:Implicit Reasoning Biases in Persona-Assigned LLMs[C]//The Twelfth International Conference on Learning Representations.2024.
[29]WALLACE E,FENG S,KANDPAL N,et al.Universal Adversarial Triggers for Attacking and Analyzing NLP[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing(EMNLP-IJCNLP).2019:2153-2162.
[30]ZHU S,LIU Y.Offensiveness Analysis of Chinese Group Addressing Terms and Dataset Construction[C]//Workshop on Chinese Lexical Semantics.Singapore:Springer Nature Singapore,2023:342-356.
[31]GILARDI F,ALIZADEH M,KUBLI M.ChatGPT outperforms crowd workers for text-annotation tasks[J].Proceedings of the National Academy of Sciences,2023,120(30):e2305016120.
[32]GONEN H,IYER S,BLEVINS T,et al.Demystifying Prompts in Language Models via Perplexity Estimation[C]//Findings of the Association for Computational Linguistics:EMNLP 2023.2023:10136-10148.
[33]HONG J,LEE N,THORNE J.Reference-free monolithic preference optimization with odds ratio[J].arXiv:2403.07691,2024.
[34]WANG B,CHEN W,PEI H,et al.DecodingTrust:A Comprehensive Assessment of Trustworthiness in GPT Models[C]//Proceedings of the 37th International Conference on Neural Information Processing Systems.2024:31232-31339.