计算机科学 (Computer Science) ›› 2025, Vol. 52 ›› Issue (11A): 241200033-9. doi: 10.11896/jsjkx.241200033

• Computer Software •

Multi-agent Collaborative Code Generation Technology Driven by Large Language Models

XIA Peng (夏鹏), ZHANG Yijun (张燚钧), QI Ji (齐骥)

  1. China Mobile (Suzhou) Software Technology Co., Ltd., Suzhou, Jiangsu 215000, China
  • Online: 2025-11-15  Published: 2025-11-10
  • Corresponding author: QI Ji (qiji@cmss.chinamobile.com)
  • About the author: actpoar@hotmail.com
  • Supported by:
    National Key Research and Development Program of China (2021YFB2801800).

Abstract: In code generation tasks, pretrained large language models and agents have become key technologies for improving the quality and efficiency of code generation. However, when faced with complex programming problems, agent techniques based on large language models still struggle to handle and solve them effectively. To address this, a multi-agent collaborative code generation framework is proposed: a system comprising four stages, namely problem analysis, task planning, code generation, and code debugging, in which complex programming problems are solved through collaboration among the agents. Building on open-source large models, different base-model strategies for the agents are proposed, and their impact on overall system performance is evaluated. On this basis, an iterative programming paradigm incorporating reflection and debugging loops is introduced to optimize code generation according to the feedback from each stage. Experimental results show that, compared with traditional direct code generation methods, the multi-agent collaborative approach achieves significant performance improvements across multiple datasets; in particular, the hybrid model strategy achieves the best performance on all tested datasets, and adopting the reflection and debugging loops improves performance on the test datasets further.

Key words: Multi-agent system, Large language model, Natural language processing, Code generation, Chain of thought
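
The following is a minimal, self-contained sketch of the kind of four-stage, multi-agent pipeline the abstract describes (problem analysis, task planning, code generation, and a reflection/debugging loop). All names here (Agent, generate_code, run_tests, the prompt wording) are hypothetical illustrations introduced for this sketch, not the interfaces or implementation evaluated in the paper; each agent is simply bound to its own model callable, so assigning different open-source models to different roles corresponds to the base-model strategies discussed in the abstract.

```python
"""Minimal sketch (not the authors' implementation) of a four-stage
multi-agent code generation pipeline with a reflection/debugging loop."""
from dataclasses import dataclass
from typing import Callable, Tuple

# An "LLM" is modelled as a plain callable: prompt text in, completion text out.
LLM = Callable[[str], str]


@dataclass
class Agent:
    """One role in the pipeline, bound to its own base model and system prompt."""
    name: str
    llm: LLM
    system_prompt: str

    def run(self, message: str) -> str:
        # A real system would use a chat API with message roles; a single
        # concatenated prompt keeps this sketch self-contained.
        return self.llm(f"{self.system_prompt}\n\n{message}")


def generate_code(problem: str,
                  analyst: Agent, planner: Agent, coder: Agent, debugger: Agent,
                  run_tests: Callable[[str], Tuple[bool, str]],
                  max_debug_rounds: int = 3) -> str:
    """Problem analysis -> task planning -> code generation -> debugging loop."""
    analysis = analyst.run(f"Analyse the following programming problem:\n{problem}")
    plan = planner.run(f"Problem analysis:\n{analysis}\n\nProduce a step-by-step plan.")
    code = coder.run(f"Plan:\n{plan}\n\nWrite code that implements the plan.")

    # Reflection/debugging loop: execution feedback is fed back to the debugger
    # agent until the tests pass or the round budget is exhausted.
    for _ in range(max_debug_rounds):
        passed, feedback = run_tests(code)
        if passed:
            break
        code = debugger.run(
            f"The code below failed its tests.\n\nCode:\n{code}\n\n"
            f"Test feedback:\n{feedback}\n\nReturn a corrected version."
        )
    return code
```

Under this interface, a single-model configuration passes the same callable to every Agent, while one plausible hybrid configuration binds a code-specialised model to the coder and debugger roles and a general-purpose model to the analyst and planner roles; which assignment works best is the kind of question the paper's base-model strategy experiments address.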

CLC number: TP391