Signal Control of Single Intersection Based on Improved Deep Reinforcement Learning Method (基于改进深度强化学习方法的单交叉口信号控制)

doi:10.11896/jsjkx.200300021

Computer Science, 2020, Vol. 47, Issue (12): 226-232. doi: 10.11896/jsjkx.200300021


Signal Control of Single Intersection Based on Improved Deep Reinforcement Learning Method

LIU Zhi, CAO Shi-peng, SHEN Yang, YANG Xi

College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China

Received: 2020-03-03; Revised: 2020-05-10; Published: 2020-12-17
About author: LIU Zhi, born in 1969, Ph.D, professor, is a member of China Computer Federation. Her main research interests include intelligent transportation and image processing.
YANG Xi, born in 1982, Ph.D, associate professor. His main research interests include control and optimization theory and intelligent transportation systems.
Supported by:
Public Welfare Technology Research Project of Zhejiang Province, China (LGG20F030008) and Natural Science Foundation of Zhejiang Province, China (LY20F030018).

Abstract

Abstract: Using deep reinforcement learning technology to achieve signal control is a research hot spot in the field of intelligent transportation. Existing research mainly focuses on comprehensively describing traffic conditions within the reinforcement learning formulation and on designing effective reinforcement learning algorithms to solve the signal timing problem. However, the influence of the signal state on action selection and the efficiency of data sampling in the experience pool have received little consideration, which may result in an unstable training process and slow convergence of the algorithm. This paper incorporates the signal state into the state design of the agent model, and introduces action reward and punishment coefficients to adjust the agent's action selection so that the constraints of the minimum and maximum green light time are met. Meanwhile, considering the temporal correlation of short-term traffic flow, the PSER (Priority Sequence Experience Replay) method is used to update the priorities of sequence samples in the experience pool, which helps the agent obtain preceding correlated samples that better match the traffic conditions. Then the double deep Q network and dueling deep Q network are used to improve the performance of the DQN (Deep Q Network) algorithm. Finally, taking the single intersection of Shixinzhong Road and Shanyin Road, Xiaoshan District, Hangzhou, as an example, the algorithm is verified on the simulation platform SUMO (Simulation of Urban Mobility). Experimental results show that the proposed agent model outperforms the unconstrained single-state agent models for traffic signal control problems, and the algorithm proposed in the paper can effectively reduce the average waiting time of vehicles and the total queue length at the intersection. The general control performance is better than the actual signal timing strategy and the traditional DQN algorithm.
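The three algorithmic components named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the decay factor `rho`, the `window` size, and the flat-list buffer are all assumptions made for illustration. The dueling aggregation and the double DQN target follow their standard textbook forms. The priority decay is a simplified reading of sequence-based prioritized replay, in which a newly updated priority is propagated backward to the transitions that preceded it.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float,
                      q_online_next: List[float],
                      q_target_next: List[float],
                      gamma: float = 0.99,
                      done: bool = False) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


def decay_priorities(priorities: List[float],
                     index: int,
                     new_priority: float,
                     rho: float = 0.5,
                     window: int = 3) -> List[float]:
    """Set the priority at `index`, then propagate a geometrically decayed
    copy to up to `window` preceding transitions (hypothetical parameters).

    A preceding priority is only ever raised, never lowered. This sketch
    assumes all transitions belong to one episode stored in time order.
    """
    updated = list(priorities)
    updated[index] = new_priority
    for k in range(1, window + 1):
        j = index - k
        if j < 0:
            break
        updated[j] = max(updated[j], new_priority * rho ** k)
    return updated
```

In a full agent, the Q-value lists would come from the online and target networks, and the replay buffer would also need to handle episode boundaries and circular overwriting, which this sketch leaves out.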

Key words: Action reward and punishment coefficient, Deep Q Network, Priority sequence experience replay, Signal control, Weighted multi-index coefficient

CLC Number: TP181

LIU Zhi, CAO Shi-peng, SHEN Yang, YANG Xi. Signal Control of Single Intersection Based on Improved Deep Reinforcement Learning Method[J]. Computer Science, 2020, 47(12): 226-232.
