Computer Science (计算机科学), 2025, Vol. 52, Issue 4: 40-48. doi: 10.11896/jsjkx.241000084
KONG Chao (孔超)¹, WANG Wei (王维)¹, HUANG Subin (皇苏斌)¹, ZHANG Yi (张义)¹, MENG Dan (孟丹)²
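Abstract: Autonomous obstacle avoidance has become a key challenge in extending the application scenarios of unmanned surface vehicles (USVs). Traditional USV obstacle avoidance methods rely mainly on fine-grained modeling of the environment. In complex marine environments, however, a USV can rarely obtain a complete perception state, so the accuracy of such models is insufficient. To address this problem, a USV autonomous obstacle avoidance method based on improved proximal policy optimization (PPO) is proposed. First, a decision framework for USV autonomous obstacle avoidance is constructed on the basis of a Markov decision process. Then, a perception representation enhancement module built on a recurrent neural network is integrated into the PPO algorithm to improve the USV's memory of time-series environment perception. Finally, an autonomous obstacle avoidance reward function is designed using a reward shaping mechanism to speed up optimization of the avoidance policy. To verify the effectiveness of the algorithm, validation scenarios for typical USV autonomous obstacle avoidance were built on a 3D simulation platform. Experimental results show that the improved PPO method achieves collision-free autonomous navigation and outperforms the traditional PPO algorithm in model convergence speed, collision rate, and timeout rate.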
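The abstract names a reward shaping mechanism but does not give the reward function itself. As a rough illustration only, the sketch below shows generic potential-based reward shaping, where a term of the form gamma * phi(s') - phi(s) is added to the environment reward. The potential (negative distance to a goal point), the discount value, and all function names are assumptions made for this example, not the authors' design.

```python
import math

GAMMA = 0.99  # discount factor; an assumed value, not taken from the paper


def potential(pos, goal):
    """Hypothetical potential function: negative Euclidean distance to the goal."""
    return -math.dist(pos, goal)


def shaped_reward(env_reward, pos, next_pos, goal, gamma=GAMMA):
    """Return the environment reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_pos, goal) - potential(pos, goal)
    return env_reward + shaping


if __name__ == "__main__":
    goal = (3.0, 0.0)
    # Moving toward the goal yields a positive shaping bonus.
    print(shaped_reward(0.0, (0.0, 0.0), (1.0, 0.0), goal))
    # Moving away from the goal yields a negative one.
    print(shaped_reward(0.0, (0.0, 0.0), (-1.0, 0.0), goal))
```

With a potential of this kind, each step that closes the distance to the goal receives a dense bonus, which is one common way to speed up policy optimization when the environment reward is sparse.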