Research on Multi-agent Joint Navigation Strategy Based on Improved Deep Reinforcement Learning

doi:10.11896/jsjkx.250200095

Abstract

Driven by rapid progress in artificial intelligence, multi-agent systems have shown potential for cooperative navigation in many practical applications, such as environmental monitoring, disaster relief, and autonomous driving. These tasks can generally be framed as the multi-agent cooperative navigation problem. However, as the number of agents involved in a task increases, scaling reinforcement learning to multi-agent systems runs into problems such as inefficiency and learning inertia, which seriously restrict task performance. This paper proposes a multi-agent reinforcement learning framework. The framework speeds up learning by building a two-layer policy network that enables agents to consider their peers' policies in a partially observable environment. In addition, a dynamic reward mechanism is introduced to address poor cooperation during navigation. Experimental results show that this deep reinforcement learning model based on a two-layer policy network can significantly improve cooperation efficiency in multi-agent cooperative navigation tasks, and that its advantage is more pronounced when the number of agents is large.

Key words: Multi-agent, Joint navigation, Deep reinforcement learning

CLC Number: V324.1

XIA Weihao, WANG Jinlong. Research on Multi-agent Joint Navigation Strategy Based on Improved Deep Reinforcement Learning[J]. Computer Science, 2025, 52(11A): 250200095-7.
