Computer Science ›› 2025, Vol. 52 ›› Issue (7): 271-278. doi: 10.11896/jsjkx.240800133

• Artificial Intelligence •

  • Corresponding author: HUO Dan (kannyh2022@163.com)

Research on Multi-machine Conflict Resolution Based on Deep Reinforcement Learning

HUO Dan, YU Fuping, SHEN Di, HAN Xueyan   

  1. College of Air Traffic Control and Navigation, Air Force Engineering University, Xi'an 710000, China
  • Received:2024-08-23 Revised:2024-12-02 Published:2025-07-17
  • About author: HUO Dan, born in 1990, master, lecturer. Her main research interests include air traffic control and collision prevention safety.
  • Supported by:
    National Social Science Foundation of China(22BGL319).


Abstract: With the increase in military, civilian, and general aviation flight activities, competition for airspace has become prominent, and it is now common for multiple aircraft to fly simultaneously in the same airspace. How to provide decision support for collision avoidance through technical means has therefore become an urgent problem. To tackle the challenge of resolving conflicts between multiple aircraft in flight, this paper introduces a graph convolutional deep reinforcement learning (GDQN) algorithm, which combines multi-agent deep reinforcement learning with a graph convolutional neural network as an extension framework. First, a message-passing function is constructed to build a multi-agent flight conflict model, which can guide multiple aircraft through three-dimensional, unstructured airspace while avoiding conflicts and collisions. Next, a deep self-learning method based on graph convolutional networks provides intelligent conflict-avoidance support for airport scheduling, and a multi-agent system (MAS) is established for multi-aircraft conflict scenarios. Finally, the effectiveness of the algorithm is validated through simulations in a controlled environment, using extensive training sets to train the policy function. The results indicate that the optimized algorithm is feasible: it achieves a conflict-resolution success rate of over 90%, computes resolution decisions in less than 3 seconds, significantly reduces the number of air traffic control (ATC) commands issued, and clearly improves operational efficiency.
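The GDQN pipeline described above (graph-convolutional message passing over a conflict graph, feeding per-agent Q-values over discrete avoidance manoeuvres) can be sketched roughly as follows. This is an illustrative minimal sketch only, not the authors' implementation: the mean-neighbour aggregation rule, all names, and all dimensions are assumptions.

```python
import numpy as np

def gcn_message_pass(H, A, W):
    """One graph-convolution step: each agent aggregates neighbour
    features via the normalised adjacency (message passing), then
    applies a shared linear map with a ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # row-normalise by degree
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)

def q_values(H, A, W_gcn, W_q):
    """Per-agent Q-values over discrete avoidance manoeuvres."""
    Z = gcn_message_pass(H, A, W_gcn)
    return Z @ W_q                            # shape: (n_agents, n_actions)

rng = np.random.default_rng(0)
n_agents, d_state, d_hidden, n_actions = 4, 6, 8, 5
H = rng.normal(size=(n_agents, d_state))      # aircraft state features
A = np.array([[0, 1, 1, 0],                   # conflict graph: edge = pair
              [1, 0, 1, 0],                   # of aircraft in potential
              [1, 1, 0, 1],                   # conflict (illustrative)
              [0, 0, 1, 0]], dtype=float)
W_gcn = rng.normal(size=(d_state, d_hidden))  # shared GCN weights
W_q = rng.normal(size=(d_hidden, n_actions))  # Q-value head weights
Q = q_values(H, A, W_gcn, W_q)
actions = Q.argmax(axis=1)                    # greedy manoeuvre per aircraft
```

In a full GDQN agent these weights would be trained with a DQN-style temporal-difference loss over simulated conflict episodes; the sketch shows only the forward pass that couples agents through the conflict graph.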

Key words: Deep reinforcement learning, Graph convolutional neural network, Message passing, Multi-agent model, Multi-aircraft flight, Conflict resolution

CLC number: TP389.1