Computer Science ›› 2023, Vol. 50 ›› Issue (1): 194-204. doi: 10.11896/jsjkx.220500241
HUANG Yuzhou, WANG Lisong, QIN Xiaolin
Abstract: With the wide deployment of intelligent unmanned vehicles, intelligent navigation, path planning, and obstacle avoidance have become important research topics. This paper proposes model-free deep reinforcement learning methods based on DDPG and SAC, which use environmental information to navigate to a goal point, avoid both static and dynamic obstacles, and generalize across different environments. By combining global planning with local obstacle avoidance, the method solves the path-planning problem with better global optimality and robustness, solves the obstacle-avoidance problem with better dynamic adaptability and generalization, and shortens iteration time. Incorporating traditional algorithms such as PID and A* into the network-training stage improves the convergence speed and stability of the proposed method. Finally, multiple navigation and obstacle-avoidance scenarios are designed in the Robot Operating System (ROS) and the Gazebo simulator; simulation results verify the reliability of the proposed method, which accounts for both the global and the dynamic aspects of the problem, and show improvements in the generated paths and in time efficiency.
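The abstract mentions combining a traditional global planner such as A* with the learned local policy. As a minimal illustration of the global-planning half (not the paper's actual implementation), the sketch below runs A* with a Manhattan-distance heuristic on a small 4-connected occupancy grid; the grid layout, cost model, and function names are assumptions for this example.

```python
import heapq
import itertools

def astar(grid, start, goal):
    """A* search on a 4-connected occupancy grid.

    grid: list of lists, 0 = free cell, 1 = obstacle.
    start, goal: (row, col) tuples.
    Returns the path as a list of cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible on a 4-connected unit-cost grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    tie = itertools.count()  # tiebreaker so the heap never compares cells/parents
    open_heap = [(h(start), next(tie), 0, start, None)]
    came_from = {}           # cell -> parent; doubles as the closed set
    g_cost = {start: 0}

    while open_heap:
        _, _, g, cell, parent = heapq.heappop(open_heap)
        if cell in came_from:
            continue         # already expanded with an equal-or-better cost
        came_from[cell] = parent
        if cell == goal:
            path = []        # walk parents back to the start
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1   # unit step cost between adjacent cells
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), next(tie), ng, nxt, cell))
    return None              # goal unreachable
```

In a hybrid scheme of the kind the abstract describes, waypoints from such a global plan would be handed to the learned local policy, which handles dynamic obstacles between them.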