基于拥塞感知和缓存通信的多智能体路径规划

doi:10.11896/jsjkx.240900012

Abstract

Abstract: Multi-agent Path Finding(MAPF) is an essential component of large-scale robotic systems.Traditional planners based on conflict search are limited in scalability due to computation time,whereas multi-agent reinforcement learning strategies based on communication mechanisms significantly alleviate this issue.As task complexity and scale increase,how to effectively communicate and avoid congestion becomes significant obstacles for learning-based methods.To overcome these challenges,this paper proposes a decentralized planning method called Congestion-Aware and Cached Communication for Multi-agent Pathfinding(C3MAP),which features cache-based communication and congestion-awareness capabilities.Specifically,agents broadcast communications to neighbors only when the current environmental information significantly differs from the previous communication or when receiving request signals from other agents.Additionally,congestion information is incorporated as locally observable information to guide agents in avoiding congested areas.Experimental results on benchmarks indicate that C3MAP achieves a solution success rate of over 90% in structured scenarios,significantly outperforming existing learning-based methods.Additionally,experiments in large-scale environments confirm the greater stability of the caching communication mechanism and the effectiveness of the congestion awareness strategy.

Key words: Multi-agent system, Path planning, Deep reinforcement learning, Congestion aware, Cached communication

CLC Number:

TP391

ZHANG Yongliang, LI Ziwen, XU Jiahao, JIANG Yuchen, CUI Ying. Congestion-aware and Cached Communication for Multi-agent Pathfinding[J].Computer Science, 2025, 52(8): 317-325.

References

[1]WANG Z H,TONG X R.Research Progress of Multi-agentPath Finding Based on Conflict-based Search Algorithms[J].Computer Science,2023,50(6):358-368.
[2]XIN Y X,HUA D Y,ZHANG L.Multi-agent Reinforcement Learning Algorithm Based on AI Planning[J].Computer Science,2024,51(5):179-192.
[3]STERN R,STURTEVANT N,FELNER A,et al.Multi-agent pathfinding:Definitions,variants,and benchmarks[C]//Proceedings of the International Symposium on Combinatorial Search.AAAI,2019:151-158.
[4]BANFI J,BASILICO N,AMIGONI F.Intractability of time-optimal multirobot path planning on 2D grid graphs with holes[J].IEEE Robotics and Automation Letters,2017,2(4):1941-1947.
[5]SHARON G,STERN R,FELNER A,et al.Conflict-basedsearch for optimal multi-agent pathfinding[J].Artificial Intelligence,2015,219:40-66.
[6]BARERM,SHARON G,STERN R,et al.Suboptimal variants of the conflict-based search algorithm for the multi-agent pathfinding problem[C]//Proceedings of the International Sympo-sium on Combinatorial Search.AAAI,2014:19-27.
[7]SHI D X,PENG Y X,YANG H H,et al.DQN-based Multi-agent Motion Planning Method with Deep Reinforcement Lear-ning[J].Computer Science,2024,51(2):268-277.
[8]SARTORETTI G,KERR J,SHI Y,et al.Primal:Pathfindingvia reinforcement and imitation multi-agent learning[J].IEEE Robotics and Automation Letters,2019,4(3):2378-2385.
[9]CHUNG J,FAYY AD J,YOUNES Y A,et al.Learning team-based navigation:a review of de-ep reinforcement learning techniques for multi-agent pathfinding[J].Artificial Intelligence Review,2024,57(2):41.
[10]MA Z,LUO Y,PAN J.Learning selective communication formulti-agent path finding[J].IEEE Robotics and Automation Letters,2021,7(2):1455-1462.
[11]ZHANG S Q,ZHANG Q,LIN J.Succinct and robust multi-agent communication with temporal message control[J].Advances in Neural Information Processing Systems,2020,33:17271-17282.
[12]LI W,CHEN H,JIN B,et al.Multi-agent path finding with prioritized communication learning[C]//2022 International Conference on Robotics and Automation(ICRA).IEEE,2022:10695-10701.
[13]CHEN L,WANG Y,MIAO Z,et al.Transformer-based imitative reinforcement learning for multirobot path planning[J].IEEE Transactions on Industrial Informatics,2023,19(10):10233-10243.
[14]MA Z,LUO Y,MA H.Distributed heuristic multi-agent path finding with communication [C]//2021 IEEE International Conference on Robotics and Automation(ICRA).IEEE,2021:8699-8705.
[15]WANG Y,XIANG B,HUANG S,et al.SCRIMP:Scalable communication for reinforcement-and imitation-learning-based multi-agent pathfinding[C]//2023 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS).IEEE,2023:9301-9308.
[16]SONG Z,ZHANG R,CHENG X.HELSA:Hierarchical Rein-forcement Learning with Spatiotemporal Abstraction for Large-Scale Multi-Agent Path Finding[C]//2023 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS).IEEE,2023:7318-7325.
[17]PHAN T,DRISCOLL J,ROMBERG J,et al.Confidence-Based Curriculum Learning for Multi-Agent Path Finding[J].arXiv:2401.05860,2024.
[18]SKRYNNIK A,ANDREYCHUK A,NESTEROVA M,et al.Learn to Follow:Decentralized Lifelong Multi-Agent Pathfin-ding via Planning and Learning [C]//2024 AAAI Conference on Artificial Intelligence.AAAI,2024:17541-17549.
[19]SHI D X,HU H M,SONG L N,et al.Multi-agent Reinforcement Learning Method Based on Observation Reconstruction[J].Computer Science,2024,51(4):280-290.
[20]KIM D,MOON S,HOSTALLERO D,et al.Learning to schedule communication in multi-agent reinforcement learning[J].arXiv:1902.01554.2019.
[21]ZHANG S Q,ZHANG Q,LIN J.Efficient communication inmulti-agent reinforcement learning via variance based control[C]//Advances in Neural Information Processing Systems(NeurIPS).MIT Press,2019:3230-3239.
[22]DING Z,HUANG T,LU Z.Learning individually inferred communication for multi-agent cooperation[C]//Advances in Neural Information Processing Systems(NeurIPS).MIT Press,2020:22069-22079.
[23]STERN R,STURTEVANT N,FELNER A,et al.Multi-agent pathfinding:Definitions,variants,and benchmarks[C]//Proceedings of the International Symposium on Combinatorial Search(SoCS).AAAI,2019:151-158.
[24]VASWANI A,SHAZEER N,PARMAR N,et al.Attention isall you need[C]//Advances in Neural Information Processing Systems(NeurIPS).MIT Press,2017:5998-6008.
[25]TAN M.Multi-agent reinforcement learning:Independent vs.cooperative agents[C]//Proceedings of the Tenth International Conference on Machine Learning.ACM,1993:330-337.
[26]VAN HASSELT H,GUEZ A,SILVER D.Deep reinforcement learning with double q-learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence.AAAI,2016:2094-2100.
[27]WANG Z,SCHAUL T,HESSEL M,et al.Dueling network architectures for deep reinforcement learning[C]//International Conference on Machine Learning.ACM,2016:1995-2003.
[28]HORGAN D,QUAN J,BUDDEN D,et al.Distributed priori-tized experience replay[J].arXiv:1803.00933,2018.
[29]BENGIO Y,LOURADOUR J,COLLOBERT R,et al.Curriculum learning[C]//Proceedings of the 26th Annual International Conference on Machine Learning.ACM,2009:41-48.
[30]LIN Q,MA H.SACHA:Soft actor-critic with heuristic-based attention for partially observable multi-agent path finding[J].IEEE Robotics and Automation Letters,2023,8:5100-5107.

Related Articles 15

[1]	FU Wenhao, GE Liyong, WANG Wen, ZHANG Chun. Multi-UAV Path Planning Algorithm Based on Improved Dueling-DQN [J]. Computer Science, 2025, 52(8): 326-334.
[2]	HUO Dan, YU Fuping, SHEN Di, HAN Xueyan. Research on Multi-machine Conflict Resolution Based on Deep Reinforcement Learning [J]. Computer Science, 2025, 52(7): 271-278.
[3]	LIU Qingyun, YOU Xiong, ZHANG Xin, ZUO Jiwei, LI Jia. Review of Path Planning Algorithms for Mobile Robots [J]. Computer Science, 2025, 52(6A): 240900074-10.
[4]	YE Mingjun, WANG Shujian. UAV Path Planning Based on Improved Dung Beetle Optimization Algorithm [J]. Computer Science, 2025, 52(6A): 240900136-6.
[5]	WU Zongming, CAO Jijun, TANG Qiang. Online Parallel SDN Routing Optimization Algorithm Based on Deep Reinforcement Learning [J]. Computer Science, 2025, 52(6A): 240900018-9.
[6]	ZHAO Xuejian, YE Hao, LI Hao, SUN Zhixin. Multi-AGV Path Planning Algorithm Based on Improved DDPG [J]. Computer Science, 2025, 52(6): 306-315.
[7]	WANG Chenyuan, ZHANG Yanmei, YUAN Guan. Class Integration Test Order Generation Approach Fused with Deep Reinforcement Learning andGraph Convolutional Neural Network [J]. Computer Science, 2025, 52(6): 58-65.
[8]	LI Yuanbo, HU Hongchao, YANG Xiaohan, GUO Wei, LIU Wenyan. Intrusion Tolerance Scheduling Algorithm for Microservice Workflow Based on Deep Reinforcement Learning [J]. Computer Science, 2025, 52(5): 375-383.
[9]	ZHENG Longhai, XIAO Bohuai, YAO Zewei, CHEN Xing, MO Yuchang. Graph Reinforcement Learning Based Multi-edge Cooperative Load Balancing Method [J]. Computer Science, 2025, 52(3): 338-348.
[10]	DU Likuan, LIU Chen, WANG Junlu, SONG Baoyan. Self-learning Star Chain Space Adaptive Allocation Method [J]. Computer Science, 2025, 52(3): 359-365.
[11]	HUO Xingpeng, SHA Letian, LIU Jianwen, WU Shang, SU Ziyue. Windows Domain Penetration Testing Attack Path Generation Based on Deep Reinforcement Learning [J]. Computer Science, 2025, 52(3): 400-406.
[12]	CAI Yuliang, LYU Chunhui, HE Qiang, YU Bo, CHEN Dongyue, WANG Youtong, WANG Qiang, LIU Yuxuan, ZHAO Jingjing. Fully Distributed Event Driven Bipartite Consensus Algorithm Based on Reinforcement Learning [J]. Computer Science, 2025, 52(2): 279-290.
[13]	XU Donghong, LI Bin, QI Yong. Task Scheduling Strategy Based on Improved A2C Algorithm for Cloud Data Center [J]. Computer Science, 2025, 52(2): 310-322.
[14]	WANG Tianjiu, LIU Quan, WU Lan. Offline Reinforcement Learning Algorithm for Conservative Q-learning Based on Uncertainty Weight [J]. Computer Science, 2024, 51(9): 265-272.
[15]	LIU Yi, QI Jie. IRRT^*-APF Path Planning Algorithm Considering Kinematic Constraints of Unmanned Surface Vehicle [J]. Computer Science, 2024, 51(9): 290-298.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Congestion-aware and Cached Communication for Multi-agent Pathfinding

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 15

Metrics

Comments

Recommended 0