Computer Science ›› 2022, Vol. 49 ›› Issue (3): 46-51. doi: 10.11896/jsjkx.210700010
欧阳卓1, 周思源1,2, 吕勇1, 谭国平1,2, 张悦1, 项亮亮1
OUYANG Zhuo1, ZHOU Si-yuan1,2, LYU Yong1, TAN Guo-ping1,2, ZHANG Yue1, XIANG Liang-liang1
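Abstract: Using deep reinforcement learning to control vehicles at unsignalized intersections is an active research topic in intelligent transportation. Existing approaches cannot adapt to a dynamically changing number of autonomous vehicles, converge slowly during training, and reach only locally optimal results. This paper studies how autonomous vehicles at an unsignalized intersection can use a distributed deep reinforcement learning method to improve the traffic efficiency of the intersection. First, an efficient reward function is proposed and a distributed reinforcement learning algorithm is applied to the unsignalized intersection scenario, so that vehicles relying only on local information, without access to the state of the whole intersection, can still effectively improve traffic efficiency. Second, to address the low training efficiency of reinforcement learning in the open intersection scenario, transfer learning is used: a policy trained in the closed figure-eight scenario serves as a warm start for continued training in the unsignalized intersection scenario, which improves training efficiency. Finally, a policy is proposed that adapts to any proportion of autonomous vehicles and improves the traffic efficiency of the intersection at every such proportion. The TD3 reinforcement learning algorithm is evaluated on the simulation platform Flow. Experimental results show that the improved algorithm converges quickly in training, adapts to dynamic changes in the proportion of autonomous vehicles, and effectively improves the traffic efficiency of the intersection.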