Computer Science, 2023, Vol. 50, Issue (6A): 220700031-7. doi: 10.11896/jsjkx.220700031

• Artificial Intelligence •

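Research on Aerial Interception Combat Decision-Making for UAV Swarms Based on MADDPG

LIN Xiangyang1, XING Qinghua2, XING Huaixi2

  1 Academy of Military Sciences of the Chinese People's Liberation Army, Beijing 100091;
  2 Air Defense and Anti-Missile College, Air Force Engineering University, Xi'an 710051
  • Publication date: 2023-06-10 Release date: 2023-06-12
  • Corresponding author: LIN Xiangyang (95014052@qq.com)
  • Funding:
    National Natural Science Foundation of China (71771216, 72071209, 72001214)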

Study on Intelligent Decision Making of Aerial Interception Combat of UAV Group Based on MADDPG

LIN Xiangyang1, XING Qinghua2, XING Huaixi2   

  1 Academy of Military Sciences, Beijing 100091, China;
  2 Air Defense and Anti-Missile College, Air Force Engineering University, Xi’an 710051, China
  • Online: 2023-06-10 Published: 2023-06-12
  • About author: LIN Xiangyang, born in 1994, Ph.D. candidate. His main research interests include military systems operations research optimization and reinforcement learning.
  • Supported by:
    National Natural Science Foundation of China (71771216, 72071209, 72001214).

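Abstract (translated from the Chinese): Based on the requirements of future modern warfare, a combat scenario is constructed, and under this scenario reinforcement learning is used to solve the multi-target intelligent decision-making problem of an aerial interception mission between red and blue UAV formations. In line with the combat mode and application requirements, the multi-agent deterministic policy gradient algorithm (MADDPG) is selected and its principle is briefly introduced. A complete simulated combat training platform is programmed according to the scenario. The agent network model, network parameters and training method are designed. After training, the expected results are preliminarily achieved. The experiments show that the selected algorithm can effectively solve this class of problem, which not only provides technical support for its practical application but also offers a theoretical basis and experimental reference for research on intelligent decision-making in more complex combat scenarios and combat missions.

Keywords (translated from the Chinese): MADDPG, UAV swarm, intelligent decision-making, aerial interception combat, multi-agent reinforcement learning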

Abstract: Based on the requirements of future modern operations, a combat scenario is built. Under this scenario, reinforcement learning is used to solve the multi-target intelligent decision-making problem of an aerial interception mission between UAV formations. A multi-agent reinforcement learning algorithm is selected according to the operational mode and application requirements, and its principle and process are briefly introduced. A simulated combat system is developed, and the network model, network parameters and training method are designed. After training, the expected results are achieved. The experiments show that the selected algorithm can effectively solve this kind of problem, which not only provides technical support for its practical application but also provides a theoretical basis and experimental reference for the study of intelligent decision making in more complex combat scenarios and under more complex combat mission conditions.

Key words: MADDPG, UAV group, Intelligent decision, Air interception combat, Multi-agent reinforcement learning
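
The abstract names MADDPG but this page gives no detail of the authors' networks or training setup. As a rough illustration of the general MADDPG structure only (each agent acts from its own observation, while a centralized critic scores the joint observations and actions during training), a minimal sketch might look like the following. Every class name, dimension and the random linear layers here are hypothetical placeholders rather than the paper's design, and the gradient updates are omitted.

```python
import copy
import math
import random


def make_linear(n_in, n_out, rng):
    """Small random weight matrix: n_out rows, n_in columns (no bias)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Matrix-vector product for the toy layer above."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class Actor:
    """Decentralized deterministic policy: own observation -> bounded action."""

    def __init__(self, obs_dim, act_dim, rng):
        self.w = make_linear(obs_dim, act_dim, rng)

    def act(self, obs):
        return [math.tanh(z) for z in linear(self.w, obs)]


class CentralCritic:
    """Centralized Q-function over ALL agents' observations and actions."""

    def __init__(self, n_agents, obs_dim, act_dim, rng):
        self.w = make_linear(n_agents * (obs_dim + act_dim), 1, rng)

    def q(self, all_obs, all_acts):
        joint = [v for o in all_obs for v in o] + [v for a in all_acts for v in a]
        return linear(self.w, joint)[0]


def td_target(reward, next_obs, target_actors, target_critic, gamma=0.95, done=False):
    """Critic regression target: y = r + gamma * Q'(x', a'_1, ..., a'_N)."""
    next_acts = [actor.act(o) for actor, o in zip(target_actors, next_obs)]
    bootstrap = 0.0 if done else target_critic.q(next_obs, next_acts)
    return reward + gamma * bootstrap


# Toy usage: 3 agents, 4-dim observations, 2-dim actions (all assumed values).
rng = random.Random(0)
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2
actors = [Actor(OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
critic = CentralCritic(N_AGENTS, OBS_DIM, ACT_DIM, rng)
# Target networks start as copies of the online networks.
target_actors, target_critic = copy.deepcopy(actors), copy.deepcopy(critic)

obs = [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]
acts = [actor.act(o) for actor, o in zip(actors, obs)]  # each agent uses only its own obs
q_value = critic.q(obs, acts)  # the critic sees the joint observations and actions
y = td_target(1.0, obs, target_actors, target_critic)  # obs reused as a stand-in next state
```

In a full implementation each agent would typically hold its own critic, the critic would be regressed toward `y` on minibatches from a replay buffer, and the target networks would track the online ones through soft updates. The sketch shows only the information flow that separates centralized training from decentralized execution.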

CLC Number: TP181