Intention-based Multi-agent Motion Planning Method with Deep Reinforcement Learning

doi:10.11896/jsjkx.220900031

Abstract

Multi-agent motion planning is challenging because effective cooperation approaches are lacking, existing methods depend heavily on communication, and there is no mechanism for screening information. To address these problems, an intention-based multi-agent deep reinforcement learning motion planning method is proposed, which helps agents reach their goals while avoiding collisions without explicit communication. Firstly, the concept of intention is introduced into the multi-agent motion planning problem: visual images are combined with history maps to predict the intentions of agents, so that each agent can anticipate the actions of other agents and collaborate effectively. Secondly, a convolutional neural network architecture based on an attention mechanism is designed. The network is used both to predict the intentions of agents and to select their actions; it filters the useful information from the visual input while reducing the reliance of multi-agent cooperation on communication. Thirdly, a value-based deep reinforcement learning algorithm is proposed to learn the motion planning strategy. Improving the objective function and the calculation of the Q values makes the strategy more stable. In tests in six different PyBullet simulation scenes, the proposed method improves the cooperation efficiency of multi-agent teams by an average of 10.74%, a significant performance advantage over other advanced multi-agent motion planning methods.
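The abstract does not say how the calculation of the Q values is changed. Purely as an illustration, the sketch below shows one widely used way of stabilising a value-based learner, the Double Q-learning target, next to the standard DQN target. It is an assumption that this resembles the proposed method; the function names and the toy Q values are hypothetical and do not come from the paper.

```python
from typing import Sequence


def dqn_target(reward: float, done: bool, gamma: float,
               q_target_next: Sequence[float]) -> float:
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_q_target(reward: float, done: bool, gamma: float,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float]) -> float:
    """Double Q-learning target (illustrative, not the paper's formula)."""
    if done:
        return reward
    # The online network selects the greedy next action ...
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    # ... and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q values for three actions in the next state.
q_online = [0.2, 0.9, 0.1]
q_target = [4.0, 2.0, 6.0]
print(dqn_target(1.0, False, 0.5, q_target))
print(double_q_target(1.0, False, 0.5, q_online, q_target))
```

Separating action selection from action evaluation keeps a single over-estimated target value from dominating the update, which is one reason such targets tend to train more stably.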

Key words: Intention, Attention mechanism, Multi-agent system, Motion planning, Deep reinforcement learning

CLC Number: TP391

PENG Yingxuan, SHI Dianxi, YANG Huanhuan, HU Haomeng, YANG Shaowu. Intention-based Multi-agent Motion Planning Method with Deep Reinforcement Learning [J]. Computer Science, 2023, 50(10): 156-164.
