Computer Science, 2023, Vol. 50, Issue (10): 156-164. doi: 10.11896/jsjkx.220900031

• Artificial Intelligence •

Intention-based Multi-agent Motion Planning Method with Deep Reinforcement Learning

PENG Yingxuan1, SHI Dianxi1,2,3, YANG Huanhuan1, HU Haomeng1, YANG Shaowu1   

  1 School of Computer Science, National University of Defense Technology, Changsha 410073, China
  2 National Innovation Institute of Defense Technology, Academy of Military Sciences, Beijing 100166, China
  3 Tianjin Artificial Intelligence Innovation Center, Tianjin 300457, China
  • Received: 2022-09-05  Revised: 2022-12-12  Online: 2023-10-10  Published: 2023-10-10
  • About author: PENG Yingxuan, born in 1998, postgraduate. Her main research interests include artificial intelligence, multi-agent collaboration, reinforcement learning and machine learning. SHI Dianxi, born in 1966, Ph.D., professor, Ph.D. supervisor. His main research interests include distributed object middleware technology, adaptive software technology, artificial intelligence, and robot operating systems.
  • Supported by: National Natural Science Foundation of China (91948303).

Abstract: Multi-agent motion planning faces three challenges: a lack of effective cooperation approaches, heavy dependence on communication, and the absence of mechanisms for screening information. To address these, an intention-based multi-agent deep reinforcement learning motion planning method is proposed, which helps agents reach their goals while avoiding collisions without explicit communication. First, the concept of intention is introduced into the multi-agent motion planning problem: visual images are combined with history maps to predict the intentions of agents, so that each agent can anticipate the actions of the others and collaborate effectively. Second, a convolutional neural network architecture based on an attention mechanism is designed. This architecture is used both to predict agents' intentions and to select their actions; it filters the useful information out of the visual input while reducing the reliance of multi-agent cooperation on communication. Third, a value-based deep reinforcement learning algorithm is proposed to learn the motion planning strategy. Improving the objective function and the calculation of the Q values makes the learned strategy more stable. In tests across six different PyBullet simulation scenes, the proposed method improves the cooperation efficiency of multi-agent teams by an average of 10.74%, a significant performance advantage over other advanced multi-agent motion planning methods.
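The abstract does not state how the calculation of the Q values is changed. Purely as an illustration of how a value-based target can be made more stable, the following minimal sketch contrasts the standard max-based Q-learning target with the well-known Double Q-learning target, in which one network selects the next action and a second network evaluates it. This technique is an assumption chosen for illustration, not necessarily the one used in the paper; the function names and the toy Q values are hypothetical.

```python
from typing import Sequence


def max_q_target(reward: float, done: bool, gamma: float,
                 q_target_next: Sequence[float]) -> float:
    """Standard target: the same estimates both select and evaluate the action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_q_target(reward: float, done: bool, gamma: float,
                    q_online_next: Sequence[float],
                    q_target_next: Sequence[float]) -> float:
    """Double Q-learning target (illustrative assumption, not the paper's rule).

    The online estimates choose the next action; the target estimates
    evaluate it, which reduces the overestimation caused by taking a max
    over noisy values.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q values for three actions in the next state.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [1.5, 1.0, 3.0]

standard = max_q_target(1.0, False, 0.5, q_target_next)
double = double_q_target(1.0, False, 0.5, q_online_next, q_target_next)
print(standard, double)
```

With these toy numbers the max-based target follows the largest target-network estimate, whereas the Double Q-learning target follows the action preferred by the online estimates and is therefore smaller.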

Key words: Intention, Attention mechanism, Multi-agent system, Motion planning, Deep reinforcement learning

CLC Number: TP391