Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200033-9.doi: 10.11896/jsjkx.241200033

• Computer Software & Architecture •

Multi-agent Collaborative Code Generation Technology Driven by Large Language Models

XIA Peng, ZHANG Yijun, QI Ji   

  1. China Mobile (Suzhou) Software Technology Co., Ltd., Suzhou, Jiangsu 215000, China
  • Online: 2025-11-15  Published: 2025-11-10
  • About author: XIA Peng, born in 1988, Ph.D, senior engineer, is a member of CCF (No.U6207M). His main research interests include multimodal large models and multi-agent collaboration driven by large models.
    QI Ji, born in 1978, Ph.D, senior engineer. His main research interests include pre-trained large models, big data, and cloud computing.
  • Supported by: National Key Research and Development Program of China (2021YFB2801800).

Abstract: In code generation tasks, pretrained large language models and agents have become key technologies for improving the quality and efficiency of code generation. However, when facing complex programming problems, agents based on large language models still struggle to produce effective solutions. This paper proposes a multi-agent collaborative code generation framework that solves complex programming problems through collaboration among agents across four stages: problem analysis, task planning, code generation, and code debugging. Different base-model strategies for agents built on open-source LLMs are proposed, and their impact on system performance is evaluated. Additionally, an iterative programming paradigm incorporating reflection and debugging loops is introduced to optimize code generation based on feedback from each stage. Experimental results demonstrate that the multi-agent collaborative approach achieves significant performance improvements over traditional direct code generation methods across multiple datasets. In particular, the hybrid model strategy achieves optimal performance on all tested datasets, and performance is further improved with the adoption of reflection and debugging loops.
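To make the abstract's pipeline concrete, the following is a minimal illustrative sketch, not the authors' implementation: four stubbed agent stages (problem analysis, task planning, code generation, code debugging) wired into a bounded reflection/debugging loop, where the debugging agent executes the candidate code against tests and feeds failures back for regeneration. All names (`Context`, `run_pipeline`, the stage functions) are hypothetical.

```python
# Hypothetical sketch of the four-stage multi-agent pipeline with a
# reflection/debugging loop; each stage would be an LLM-backed agent
# in the paper, but is stubbed here with fixed outputs.
from dataclasses import dataclass, field


@dataclass
class Context:
    """Shared state passed between the collaborating agents."""
    problem: str
    analysis: str = ""
    plan: list = field(default_factory=list)
    code: str = ""
    feedback: list = field(default_factory=list)


def analyze(ctx: Context) -> None:        # problem-analysis agent (stub)
    ctx.analysis = f"requirements extracted from: {ctx.problem}"


def make_plan(ctx: Context) -> None:      # task-planning agent (stub)
    ctx.plan = ["parse input", "compute result", "return output"]


def generate(ctx: Context) -> None:       # code-generation agent (stub)
    ctx.code = "def solve(x):\n    return x * 2"


def debug(ctx: Context, tests) -> bool:   # code-debugging agent
    """Execute the candidate code against tests; record failures as feedback."""
    ns: dict = {}
    exec(ctx.code, ns)  # run the generated code in an isolated namespace
    ctx.feedback = [(inp, out) for inp, out in tests if ns["solve"](inp) != out]
    return not ctx.feedback


def run_pipeline(problem: str, tests, max_iters: int = 3) -> Context:
    ctx = Context(problem)
    analyze(ctx)
    make_plan(ctx)
    generate(ctx)
    for _ in range(max_iters):            # reflection/debugging loop
        if debug(ctx, tests):
            return ctx                    # all tests pass
        generate(ctx)                     # regenerate using recorded feedback
    return ctx


ctx = run_pipeline("double the input", tests=[(2, 4), (3, 6)])
```

In this toy run the generated `solve` passes both tests on the first debugging pass, so `ctx.feedback` ends up empty; in the paper's paradigm, failing-test feedback would instead be injected into the code-generation agent's next prompt.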

Key words: Multi-agent system, Large language model, Natural language processing, Code generation, Chain of thought

CLC Number: TP391