Computer Science, 2023, Vol. 50, Issue (4): 159-171. doi: 10.11896/jsjkx.220500261

• Artificial Intelligence •

Review of Intelligent Traffic Signal Control Strategies Driven by Deep Reinforcement Learning

YU Ze1, NING Nianwen1,4, ZHENG Yanliu2, LYU Yining1, LIU Fuqiang3, ZHOU Yi1,4   

  1 School of Artificial Intelligence, Henan University, Zhengzhou 450046, China
  2 College of Computer Science and Electronic Engineering, Hunan University, Changsha 410006, China
  3 College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
  4 Shenzhen Research Institute of Henan University, Shenzhen, Guangdong 518000, China
  • Received: 2022-05-28  Revised: 2022-09-11  Online: 2023-04-15  Published: 2023-04-06
  • About author: YU Ze, born in 1998, postgraduate. His main research interests include intelligent transportation and reinforcement learning.
    NING Nianwen, born in 1991, Ph.D., lecturer. His main research interests include intelligent transportation and graph neural networks.
  • Supported by:
    National Natural Science Foundation of China (62176088), Key Science and Technology Program of Henan Province, China (222102210067, 222102520028) and Shenzhen Special Foundation of Central Government to Guide Local Science & Technology Development (2021Szvup029).

Abstract: With the rapid growth of urban populations, the number of private cars has grown exponentially, making traffic congestion increasingly acute. Traditional traffic signal control techniques struggle to adapt to complex and changing traffic conditions, and data-driven methods open new research directions for control-based systems. Combining deep reinforcement learning with traffic control systems plays an important role in adaptive traffic signal control. First, this paper reviews the latest progress in the application of intelligent traffic signal control systems, classifies and discusses intelligent traffic signal control methods, and summarizes existing work in this field. Deep reinforcement learning methods can effectively address the problems of inaccurate state information acquisition, poor algorithm robustness, and weak regional coordination control in intelligent traffic signal control. Then, building on this review, the paper gives an overview of the simulation platforms and experimental setups for intelligent traffic signal control, and analyzes and verifies them through examples. Finally, the challenges and unsolved problems in this field are discussed and future research directions are summarized.

Key words: Intelligent transportation system, Deep reinforcement learning, Traffic signal control, Multi-agent

CLC Number: 

  • TP181