Computer Science ›› 2023, Vol. 50 ›› Issue (3): 323-332. doi: 10.11896/jsjkx.220100007

• Artificial Intelligence •

Real-time Trajectory Planning Algorithm Based on Collision Criticality and Deep Reinforcement Learning

XU Linling1, ZHOU Yuan2, HUANG Hongyun3, LIU Yang1,2   

  1 School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
    2 School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
    3 Center of Library Big Data Processing and Analysis, Zhejiang Sci-Tech University, Hangzhou 310018, China
  • Received: 2022-01-04 Revised: 2022-08-14 Online: 2023-03-15 Published: 2023-03-15
  • About author: XU Linling, born in 1997, postgraduate. Her main research interests include robot intelligent control and deep reinforcement learning.
    HUANG Hongyun, born in 1977, master, lecturer. Her main research interests include intelligent system modeling and analysis, and information management.
  • Supported by:
    National Natural Science Foundation of China (62132014), Science and Technology Plan Project of Zhejiang Province, China (2022C01045), Opening Project of Shanghai Trusted Industrial Control Platform (21170022-N) and Project Funded by Shanghai Industrial Control Security Innovation Technology Co., Ltd. (21170424-J).

Abstract: Real-time collision avoidance in dynamic environments is a key challenge in trajectory planning for mobile robots. Focusing on environments with a variable number of obstacles, this paper proposes a real-time trajectory planning algorithm, Crit-LSTM-DRL, based on long short-term memory (LSTM) and deep reinforcement learning (DRL). First, it predicts the time to collision between the robot and each obstacle from their states, and computes each obstacle's collision criticality with respect to the robot. Second, it orders the obstacles by collision criticality and uses an LSTM to abstract the resulting sequence into a fixed-dimension vector representing the environment. Finally, the robot state and the extracted vector are concatenated as the input of the DRL value network to compute the value of the system state. At any instant, for each candidate action, the algorithm predicts the value of the next state with the LSTM and DRL models, and from it the value of the current state; the action yielding the maximal current-state value is selected to control the robot. To evaluate the performance of Crit-LSTM-DRL, it is first trained in three different environments, yielding three models: one trained with 5 obstacles, one with 10 obstacles, and one with a variable number of obstacles (1 to 10). These models are then tested in environments containing different numbers of obstacles. To further investigate the effect of the interaction between an obstacle and the robot, this paper also takes the joint state of an obstacle and the robot as the obstacle's state and trains another three models in the same training environments. Experimental results demonstrate the effectiveness and efficiency of Crit-LSTM-DRL.
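For illustration, the following is a minimal PyTorch sketch of the pipeline the abstract describes, under assumptions the abstract does not fix: holonomic [px, py, vx, vy] states, constant-velocity obstacle prediction, and inverse time-to-collision as the criticality measure. All names (time_to_collision, sort_by_criticality, CritLSTMValueNet, select_action) and network sizes are illustrative, not the authors' implementation.

import torch
import torch.nn as nn

def time_to_collision(robot, obstacle, radius_sum=0.6, horizon=1e6):
    """Earliest t >= 0 at which the robot-obstacle distance equals radius_sum,
    assuming both keep their current velocities; returns horizon if no such t."""
    p = obstacle[:2] - robot[:2]                  # relative position
    v = obstacle[2:] - robot[2:]                  # relative velocity
    a = torch.dot(v, v)
    b = 2.0 * torch.dot(p, v)
    c = torch.dot(p, p) - radius_sum ** 2
    disc = b * b - 4.0 * a * c
    if a < 1e-9 or disc < 0:                      # no relative motion or no real root
        return horizon
    t = ((-b - disc.sqrt()) / (2.0 * a)).item()   # earliest crossing time
    return t if t >= 0 else horizon

def sort_by_criticality(robot, obstacles):
    """Order obstacles by ascending criticality (inverse time to collision) so
    the most critical obstacle enters the LSTM last; returns a (1, n, 4) batch."""
    crit = [1.0 / (time_to_collision(robot, o) + 1e-6) for o in obstacles]
    order = sorted(range(len(obstacles)), key=lambda i: crit[i])
    return torch.stack([obstacles[i] for i in order]).unsqueeze(0)

class CritLSTMValueNet(nn.Module):
    """LSTM abstracts a variable-length obstacle sequence into a fixed-dimension
    vector; an MLP maps the concatenated [robot state, vector] to a state value."""
    def __init__(self, obs_dim=4, robot_dim=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(robot_dim + hidden, 128),
                                  nn.ReLU(), nn.Linear(128, 1))

    def forward(self, robot_state, obstacle_seq):
        _, (h, _) = self.lstm(obstacle_seq)       # h: (layers, batch, hidden)
        return self.head(torch.cat([robot_state, h[-1]], dim=-1))

def select_action(net, robot, obstacles, actions, dt=0.25, gamma=0.9):
    """One-step lookahead: propagate each candidate velocity for dt, predict
    obstacle motion at constant velocity, and pick the action whose successor
    state the value network scores highest (the immediate-reward term is
    omitted here for brevity)."""
    def step(state, vel):                         # [px, py, vx, vy] propagation
        return torch.cat([state[:2] + vel * dt, vel])
    best, best_v = None, -float('inf')
    for a in actions:                             # a: torch.tensor([vx, vy])
        nr = step(robot, a)
        nobs = [step(o, o[2:]) for o in obstacles]
        v = (gamma ** dt) * net(nr.unsqueeze(0), sort_by_criticality(nr, nobs))
        if v.item() > best_v:
            best, best_v = a, v.item()
    return best

# Usage: greedy action for one snapshot with two obstacles (untrained weights).
net = CritLSTMValueNet()
robot = torch.tensor([0.0, 0.0, 0.5, 0.0])
obstacles = [torch.tensor([2.0, 0.1, -0.5, 0.0]),
             torch.tensor([5.0, 4.0, 0.0, -0.3])]
actions = [torch.tensor([vx, vy]) for vx in (-0.5, 0.0, 0.5)
           for vy in (-0.5, 0.0, 0.5)]
print(select_action(net, robot, obstacles, actions))

Feeding the least critical obstacle first is one way to let the most critical one dominate the LSTM's final hidden state; the paper's exact criticality definition, reward shaping, and the joint-state variant of the obstacle representation are not reproduced here.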

Key words: Trajectory planning, Collision avoidance, Obstacle criticality, Deep reinforcement learning

CLC Number: TP242