Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 250200095-7. doi: 10.11896/jsjkx.250200095

• Artificial Intelligence •

Research on Multi-agent Joint Navigation Strategy Based on Improved Deep Reinforcement Learning

XIA Weihao1, WANG Jinlong2   

  1 State-owned Changhong Machinery Factory, Guilin, Guangxi 541003, China
  2 School of Information Science and Engineering, Harbin Institute of Technology, Weihai, Shandong 264209, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Supported by:
    Major Science and Technology Innovation Project of Shandong Province (2021ZLGX05) and Key Support Project of the National Natural Science Foundation of China (Joint Fund) (U23A20336).

Abstract: Driven by rapid progress in artificial intelligence, multi-agent systems have shown potential for cooperative navigation in many practical applications, such as environmental monitoring, disaster relief, and autonomous driving. These tasks can generally be summarized as the multi-agent cooperative navigation problem. However, as the number of agents involved in a task increases, extending reinforcement learning to multi-agent systems runs into problems such as inefficiency and learning inertia, which seriously restrict task performance. This paper proposes a multi-agent reinforcement learning framework. The framework speeds up learning by building a two-tier strategy network that enables agents to consider their peers' strategies in a partially observable environment. In addition, a dynamic reward mechanism is introduced to address poor cooperative navigation performance. Experimental results show that this deep reinforcement learning model based on a two-tier strategy network can significantly improve cooperation efficiency in multi-agent cooperative navigation tasks, and that its advantage is more pronounced when the number of agents is large.

Key words: Multi-agent, Joint navigation, Deep reinforcement learning

CLC Number: 

  • V324.1