Multi-ship Coordinated Collision Avoidance Decision Based on Improved Twin Delayed Deep Deterministic Policy Gradient

doi:10.11896/jsjkx.221000131

Computer Science, 2023, Vol. 50, Issue (11): 269-281. doi: 10.11896/jsjkx.221000131

• Artificial Intelligence •

Multi-ship Coordinated Collision Avoidance Decision Based on Improved Twin Delayed Deep Deterministic Policy Gradient

HUANG Renxian^1,2,3, LUO Liang^1,2, YANG Meng^4, LIU Weiqin^1

1 School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan 430064, China
2 Key Laboratory of High Performance Ship Technology (Wuhan University of Technology), Ministry of Education, Wuhan 430064, China
3 Sanya Science and Education Innovation Park of Wuhan University of Technology, Sanya, Hainan 572019, China
4 China Ship Development and Design Center, Wuhan 430060, China

Received: 2022-10-17 Revised: 2023-03-14 Online: 2023-11-15 Published: 2023-11-06
About author: HUANG Renxian, born in 1998, postgraduate. His main research interests include artificial intelligence and data processing. LUO Liang, born in 1980, Ph.D., associate professor, Ph.D. supervisor. His main research interests include system simulation integration, ship-related digital technology, and high-performance computing.
Supported by:
National Defense Basic Scientific Research Program of China (JCKY2020206B037).

Abstract

At present, most collision avoidance models treat each ship as a single agent when making collision avoidance decisions and do not consider coordinated avoidance between ships. In multi-ship encounters, relying on individual ships alone leads to poor avoidance performance. Therefore, this paper proposes a multi-ship cooperative collision avoidance model based on softmax deep double deterministic policy gradients (SD3), an improvement of the twin delayed deep deterministic policy gradient (TD3) algorithm. Based on the time and space factors of ship navigation safety, a time collision model and a space collision model are constructed to quantitatively analyze ship collision risk. On this basis, a ship domain model based on the encounter situation and the dynamic change of the ship's velocity vector is used to qualitatively analyze ship collision risk. The reward function is designed using constraints on goal guidance, course angle change, course keeping, collision risk, and the International Regulations for Preventing Collisions at Sea (COLREGs). Combined with the typical encounter situations in COLREGs, collision avoidance simulations are carried out for encounter scenes in which head-on, overtaking, and crossing situations coexist. An ablation experiment shows that the softmax operator improves the performance of the SD3 algorithm, giving it better decision-making in coordinated ship collision avoidance. The SD3 algorithm is also compared with other reinforcement learning algorithms in terms of learning efficiency and learning effect. Experimental results show that, in complex multi-situation encounter scenarios, the SD3 algorithm makes accurate collision avoidance decisions and outperforms the other reinforcement learning algorithms.

Key words: Vessel encounter, Coordinated collision avoidance, Intelligent decision-making, Twin delayed deep deterministic policy gradient (TD3), Softmax deep double deterministic policy gradients (SD3), Reinforcement learning
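
The abstract credits the softmax operator for the performance gain of SD3 but does not state its form. As a rough illustration only, the sketch below assumes the Boltzmann softmax commonly associated with SD3: a softmax-weighted average of Q-value estimates over a set of candidate actions, used in place of a hard maximum when forming the value target. The function name `softmax_value`, the parameter `beta`, and the input list are hypothetical, and the importance-sampling correction of the full algorithm is omitted.

```python
import math


def softmax_value(q_values, beta=1.0):
    """Boltzmann softmax-weighted average of Q-value estimates.

    q_values: Q estimates for candidate actions in one state
              (hypothetical input, e.g. from actions sampled
              around the target policy's action).
    beta:     inverse temperature (assumed hyperparameter);
              beta = 0 yields the plain mean, and a large beta
              approaches the maximum.
    """
    if not q_values:
        raise ValueError("q_values must be non-empty")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    # Weighted average: larger Q-values receive exponentially more weight.
    return sum(w * q for w, q in zip(weights, q_values)) / total
```

With `beta = 0` the function returns the mean of the estimates, and as `beta` grows it approaches their maximum, so the parameter sets how far the value target leans toward the largest estimate.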

CLC Number: TP391.9

HUANG Renxian, LUO Liang, YANG Meng, LIU Weiqin. Multi-ship Coordinated Collision Avoidance Decision Based on Improved Twin Delayed Deep Deterministic Policy Gradient[J]. Computer Science, 2023, 50(11): 269-281.

