Computer Science ›› 2024, Vol. 51 ›› Issue (7): 319-326. doi: 10.11896/jsjkx.230600129

• Artificial Intelligence •

Multi-agent Cooperative Algorithm for Obstacle Clearance Based on Deep Deterministic Policy Gradient and Attention Critic

WANG Xianwei1, FENG Xiang1,2, YU Huiqun1,2   

  1. Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
    2. Shanghai Engineering Research Center of Smart Energy, Shanghai 200237, China
  • Received: 2023-06-16  Revised: 2023-11-16  Online: 2024-07-15  Published: 2024-07-10
  • About author: WANG Xianwei, born in 1999, postgraduate, is a member of CCF (No. P2627G). His main research interests include reinforcement learning and robot navigation.
    FENG Xiang, born in 1977, Ph.D, professor, is a member of CCF (No. 16665M). Her main research interests include distributed swarm intelligence and evolutionary computing, reinforcement learning, and big data intelligence.
  • Supported by:
    National Natural Science Foundation of China (62276097), Key Program of National Natural Science Foundation of China (62136003), National Key Research and Development Program of China (2020YFB1711700), Special Fund for Information Development of Shanghai Economic and Information Commission (XX-XXFZ-02-20-2463) and Scientific Research Program of Shanghai Science and Technology Commission (21002411000).

Abstract: Dynamic obstacles have long been a key factor hindering the development of autonomous navigation for agents. Obstacle avoidance and obstacle clearance are two effective ways to address this issue. In recent years, multi-agent obstacle avoidance (collision avoidance) has been an active research area, and numerous effective multi-agent obstacle avoidance algorithms exist. However, the problem of multi-agent obstacle clearance remains largely unexplored, and corresponding algorithms are scarce. To address this problem, a multi-agent cooperative algorithm for obstacle clearance based on deep deterministic policy gradient and attention critic (MACOC) is proposed. Firstly, the first multi-agent cooperative environment model for obstacle clearance is created, and the kinematic models of the agents and dynamic obstacles are defined; four simulation environments with different numbers of agents and dynamic obstacles are constructed. Secondly, the process of cooperative obstacle clearance by multiple agents is formulated as a Markov decision process (MDP), and the state space, action space, and reward function of the agents are constructed. Finally, the MACOC algorithm, which combines the deep deterministic policy gradient with an attention critic, is proposed and compared with classical multi-agent algorithms in the simulated obstacle-clearance environments. Experimental results show that the proposed MACOC algorithm achieves a higher obstacle-clearance success rate, faster clearance, and better adaptability to complex environments than the baseline algorithms.
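As described above, the core of MACOC is a centralized critic that uses an attention mechanism to weight the other agents' information when evaluating each agent's action, combined with deterministic policy-gradient actors. As a rough illustration only, the following PyTorch sketch shows one common way such an attention critic can be wired; the class name AttentionCritic, the layer sizes, and the self-masked scaled dot-product attention are assumptions made for illustration and are not taken from the paper.

# Minimal sketch (assumed, not the paper's implementation) of an attention-based
# centralized critic in a deep-deterministic-policy-gradient (MADDPG-like) setting:
# each agent's Q-value is conditioned on an attention-weighted summary of the
# other agents' observation-action embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCritic(nn.Module):  # hypothetical name and architecture
    def __init__(self, obs_dim, act_dim, n_agents, embed_dim=64):
        super().__init__()
        self.n_agents = n_agents
        # per-agent encoder of (observation, action) pairs
        self.encoder = nn.Linear(obs_dim + act_dim, embed_dim)
        # projections for scaled dot-product attention
        self.query = nn.Linear(embed_dim, embed_dim, bias=False)
        self.key = nn.Linear(embed_dim, embed_dim, bias=False)
        self.value = nn.Linear(embed_dim, embed_dim, bias=False)
        # Q-head: own embedding concatenated with the attended context of the others
        self.q_head = nn.Sequential(
            nn.Linear(2 * embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, 1),
        )

    def forward(self, obs, acts):
        # obs: (batch, n_agents, obs_dim); acts: (batch, n_agents, act_dim)
        e = F.relu(self.encoder(torch.cat([obs, acts], dim=-1)))   # (B, N, E)
        q, k, v = self.query(e), self.key(e), self.value(e)
        scores = q @ k.transpose(-2, -1) / e.shape[-1] ** 0.5      # (B, N, N)
        # mask each agent's attention to itself so the context summarizes the others
        mask = torch.eye(self.n_agents, dtype=torch.bool, device=e.device)
        scores = scores.masked_fill(mask, float("-inf"))
        context = torch.softmax(scores, dim=-1) @ v                # (B, N, E)
        return self.q_head(torch.cat([e, context], dim=-1))        # (B, N, 1) Q-values

In a MADDPG-style training loop, each agent's deterministic actor would be updated by ascending this critic's Q-value for its own action, while the critic itself is trained with a temporal-difference target computed from target networks.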

Key words: Reinforcement learning algorithm, Markov decision process, Multi-agent cooperative control, Dynamic obstacle clearance, Attention mechanism

CLC Number: TP183