Improved Area Traffic Signal Control Method Based on Q-Learning and Dynamic Weights

doi:10.11896/j.issn.1002-137X.2016.08.035

Abstract

Q-learning is widely used in traffic signal control. In the traditional multi-agent traffic signal control policy, agents obtain intersection information over the network and make the best control decision, which works well in most cases. A weakness of the traditional policy, however, is that the global reward is computed as a simple average, which can cause local congestion in some cases. This paper introduces an improved area traffic signal control method based on Q-learning. The new reward calculation uses an "intersection weight" that varies dynamically with real traffic conditions. Experiments were run with both the traditional and the improved methods, and the results show the advantage of the improved one.

Key words: Q-learning, area traffic control, intersection weight

ZHANG Chen, YU Jian and HE Liang-hua. Promoted Traffic Control Strategy Based on Q Learning and Dynamic Weight[J]. Computer Science, 2016, 43(8): 171-176.
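
The contrast the abstract draws between a simple-average global reward and a dynamically weighted one can be illustrated with a minimal sketch. The abstract does not give the paper's weight formula, so everything here is an assumption made for illustration: the function names are hypothetical, and each intersection's weight is taken to be proportional to its current queue length.

```python
from typing import List, Sequence


def average_global_reward(local_rewards: Sequence[float]) -> float:
    """Traditional scheme: every intersection contributes equally."""
    return sum(local_rewards) / len(local_rewards)


def dynamic_weights(queue_lengths: Sequence[float]) -> List[float]:
    """Hypothetical intersection weights, proportional to queue length.

    ASSUMPTION: the paper's actual weight definition is not stated in
    the abstract; queue-length proportionality is only a stand-in.
    """
    total = sum(queue_lengths)
    if total == 0:
        # No queued vehicles anywhere: fall back to equal weights.
        return [1.0 / len(queue_lengths)] * len(queue_lengths)
    return [q / total for q in queue_lengths]


def weighted_global_reward(
    local_rewards: Sequence[float], queue_lengths: Sequence[float]
) -> float:
    """Weighted scheme: congested intersections dominate the global reward."""
    weights = dynamic_weights(queue_lengths)
    return sum(w * r for w, r in zip(weights, local_rewards))


if __name__ == "__main__":
    # Three intersections; the third is heavily congested.
    rewards = [-2.0, -2.0, -20.0]
    queues = [2, 2, 20]
    print(average_global_reward(rewards))
    print(weighted_global_reward(rewards, queues))
```

Under these assumed numbers the weighted reward is considerably more negative than the plain average, so the congested intersection is less likely to be masked by its lightly loaded neighbours, which is the kind of local blockage the abstract attributes to simple averaging.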

