Computer Science, 2026, Vol. 53, Issue (1): 1-11. doi: 10.11896/jsjkx.250500002

• Research and Applications of Large Language Model Technology •

大语言模型智能体操作系统研究综述

郭陆祥*, 王越余*, 李芊玥*, 李莎莎, 刘晓东, 纪斌, 余杰   

  1. 国防科学技术大学计算机学院 长沙 410073
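  • Received: 2025-05-06  Revised: 2025-07-22  Published: 2026-01-08
  • Corresponding author: LI Shasha (shashali@nudt.edu.cn)
  • About the authors: (lxg@nudt.edu.cn; wangyueyu@nudt.edu.cn; li_qianyue@nudt.edu.cn) * indicates that the author contributed equally to this work.
  • Funding:
    National Key Research and Development Program of China (2024YFB4506200)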

Comprehensive Survey of LLM-based Agent Operating Systems

GUO Luxiang, WANG Yueyu, LI Qianyue, LI Shasha, LIU Xiaodong, JI Bin, YU Jie   

  1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
  • Received: 2025-05-06  Revised: 2025-07-22  Online: 2026-01-08
  • About author: GUO Luxiang, born in 1995, Ph.D. candidate, is a member of CCF (No. Y4591G). His main research interests include artificial intelligence, large language models, agents, and agent operating systems.
    WANG Yueyu, born in 2000, graduate student. His main research interests include artificial intelligence, large language models, agents, and agent operating systems.
    LI Qianyue, born in 2002, Ph.D. candidate. Her main research interests include artificial intelligence, large language models, agents, and agent operating systems.
    LI Shasha, born in 1982, Ph.D., associate professor, Ph.D. supervisor. Her main research interests include artificial intelligence, large language models, agents, and agent operating systems.
  • Supported by:
    National Key Research and Development Program of China (2024YFB4506200).

摘要: 大语言模型智能体操作系统,也叫智能体操作系统,是整合大模型、工具资源以及多智能体协同的核心平台,目前正逐渐成为推动通用人工智能发展的一个关键研究方向。对智能体操作系统领域的研究进展进行了系统梳理,首先从基础理论着手,回顾了多种大模型的演进情况以及智能体和传统操作系统领域的进展;接着,围绕典型体系结构,如AIOS等,阐述了其分层架构与模块化设计是怎样达成资源管理与智能调度的。进一步地,明确了当前智能体操作系统在上下文整合、扩展性以及安全性等方面面临的技术挑战,同时也提出了未来借助轻量化设计、自监督学习机制以及动态调度算法来提升多智能体协作效率。该研究的主要贡献为,将那些分散的研究给予整合,促使技术框架变得更为明晰,并指出了智能体操作系统对新兴体系以及行业定制化实践覆盖不全面的情况。未来的研究需要侧重推动跨域智能体操作系统自我进化的能力,并且加快其在各个领域的落地等。

关键词: 大语言模型, 智能体操作系统, 通用人工智能, 智能体, 传统操作系统

Abstract: Large language model-based agent operating systems (Agent OS), as core platforms for integrating large models, tool resources, and multi-agent collaboration, are gradually becoming a key research direction for advancing general artificial intelligence. This paper systematically reviews the research progress in the field of Agent OS. It begins with the foundational theories, reviewing the evolution of various large language models as well as progress in agent technology and traditional operating systems. Focusing on typical architectures such as AIOS, it then explains how their hierarchical architectures and modular designs achieve resource management and intelligent scheduling. It further identifies the technical bottlenecks that current systems face in context integration, scalability, and security, and proposes future directions, including lightweight designs, self-supervised learning mechanisms, and dynamic scheduling algorithms, to improve the efficiency of multi-agent cooperation. The main contributions of this paper are integrating fragmented research into a clearer technical framework and highlighting the current limitations of Agent OS in covering emerging systems and industry-specific customizations. Future work should focus on enhancing the self-evolution capability of cross-domain Agent OS and accelerating their deployment across diverse fields.

Key words: Large language model, Agent OS, General artificial intelligence, Agent, Traditional operating system

CLC Number:

  • TP316