Computer Science (计算机科学), 2025, Vol. 52, Issue (1): 80-86. doi: 10.11896/jsjkx.240900075
YAN Yusong1, ZHOU Yuan2, WANG Cong2, KONG Shengqi1, WANG Quan2, LI Minne2, WANG Zhiyuan2
Abstract: Focusing on the need to empower command decision-making with generative artificial intelligence, this paper analyzes the key challenges of course-of-action generation in command decision-making and the application prospects of emerging pre-trained large language model technology, and proposes COA-Gen, a course-of-action generation method based on pre-trained large models. First, a multi-round generation framework is designed so that the generated courses of action conform to the mission objectives. Second, a multi-element Chinese prompt template is constructed to integrate massive multi-source information. Finally, to address the scarcity of data in this narrow domain, knowledge-enhancement techniques are introduced to improve the planning performance of the large model. To evaluate the generated courses of action, a validation environment is built on the StarCraft II game engine and the "Tiger Claw" scenario. Experimental results show that the method is robust and follows the commander's intent well, demonstrating the feasibility of using large models for course-of-action generation. In addition, different pre-trained large models exhibit different performance on the same task, indicating that in practical applications the choice of pre-trained model may yield courses of action with different styles and thus affect the final operational outcome.
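To make the pipeline described in the abstract concrete, the following Python sketch illustrates one plausible reading of a multi-round generation loop that fills a multi-element prompt template, adds retrieved domain knowledge, and refines the draft against the commander's intent. It is only an illustration of the general technique, not the paper's implementation: all names (ScenarioContext, build_prompt, retrieve via retriever.search, llm.chat, the round limit) are hypothetical assumptions.

# Minimal sketch of a multi-round course-of-action (COA) generation loop with a
# multi-element prompt template and a retrieval-based knowledge-enhancement step.
# Hypothetical interfaces: llm.chat(text) returns a string; retriever.search(query)
# returns a list of knowledge snippets. Neither comes from the paper.

from dataclasses import dataclass, field
from typing import List


@dataclass
class ScenarioContext:
    """Multi-element inputs that the prompt template integrates."""
    commander_intent: str          # commander's intent
    battlefield_state: str         # current situation summary
    available_units: List[str]     # friendly force composition
    constraints: List[str] = field(default_factory=list)


def build_prompt(ctx: ScenarioContext, knowledge: List[str], feedback: str) -> str:
    """Fill a multi-element template (the paper's template is in Chinese)."""
    return (
        f"Commander's intent: {ctx.commander_intent}\n"
        f"Battlefield state: {ctx.battlefield_state}\n"
        f"Available units: {', '.join(ctx.available_units)}\n"
        f"Constraints: {'; '.join(ctx.constraints) or 'none'}\n"
        f"Domain knowledge: {' | '.join(knowledge) or 'none'}\n"
        f"Feedback from previous round: {feedback or 'none'}\n"
        "Produce a structured course of action (objectives, task allocation, timing)."
    )


def generate_coa(llm, retriever, ctx: ScenarioContext, max_rounds: int = 3) -> str:
    """Iteratively draft and refine a COA until a self-check reports no violations."""
    feedback, coa = "", ""
    for _ in range(max_rounds):
        knowledge = retriever.search(ctx.commander_intent)   # knowledge enhancement
        coa = llm.chat(build_prompt(ctx, knowledge, feedback))
        feedback = llm.chat(
            "Check whether the following COA satisfies the commander's intent "
            "and constraints; reply 'no violations' or list the violations:\n" + coa
        )
        if "no violations" in feedback.lower():
            break
    return coa

Under these assumptions, each round feeds the previous round's critique back into the template, which is one simple way the generated plan can be steered toward the stated intent; the validation in the paper is instead performed in the StarCraft II environment.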