Computer Science ›› 2023, Vol. 50 ›› Issue (4): 159-171.doi: 10.11896/jsjkx.220500261
• Artificial Intelligence • Previous Articles Next Articles
YU Ze1, NING Nianwen1,4, ZHENG Yanliu2, LYU Yining1, LIU Fuqiang3, ZHOU Yi1,4
CLC Number:
[1]公安部交通管理局.《今年上半年新注册登记机动车1871万辆》[EB/OL].(2022-01-11).https://www.mps.gov.cn/n2254314/n6409334/c8322353/content.html. [2]DIAO M,KONG H,ZHAO J,et al.Impacts of transportation networkcompanies on urban mobility[J].Nature Sustainability,2021,4(6):494-500. [3]SUN H,CHEN C L,LIU Q,et al.Traffic Signal Control Me-thod Based on Deep Reinforcement Learning[J].Computer Science,2020,47(2):169-174. [4]VARAIYA P.The max-pressure controller for arbitrary net-works of signalized intersections[M].Springer:Advances in Dynamic Network Modeling in Complex Transportation Systems,2013:27-66. [5]ALI M E M,DURDU A,ÇELTEK S A,et al.An adaptivemethod for traffic signal control based on fuzzy logic with webster and modified webster formula using SUMO traffic simulator[J].IEEE Access,2021,9:102985-102997. [6]SHI Y,LI J,HAN Q,et al.A Coordination Algorithm for Signalized Multi-Intersection to Maximize Green Wave Band in V2X Network[J].IEEE Access,2020,8(3):213706-213717. [7]WEI H,CHEN C,ZHENG G,et al.Presslight:Learning maxpressure control to coordinate traffic signals in arterial network[C]//Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2019:1290-1298. [8]WEI H,ZHENG G,GAYAH V,et al.A survey on traffic signal control methods[J].arXiv:1904.08117,2019. [9]SRINIVASAN D,CHOY M C,CHEU R L.Neuralnetworks for real-time traffic signal control[J].IEEE Transactions on Intelligent Transportation Systems,2006,7(3):261-272. [10]MANANDHAR B,JOSHI B.Adaptive traffic light control with statistical multiplexing technique and particle swarm optimization insmart cities[C]//2018 IEEE 3rd International Conference on Computing,Communication and Security(ICCCS).IEEE,2018:210-217. [11]SÁNCHEZ-MEDINA J J,GALÁN-MORENO M J,RUBIO-ROYO E.Traffic signal optimization in “La Almozara” district in saragossa under congestion conditions,using genetic algorithms,traffic microsimulation,and cluster computing[J].IEEE Transactions on Intelligent Transportation Systems,2009,11(1):132-141. [12]WEI H,ZHENG G,YAO H,et al.Intellilight:A reinforcement learning approach for intelligent traffic light control[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.2018:2496-2505. [13]ZHENG G,XIONG Y,ZANG X,et al.Learning phase competition for traffic signal control[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Ma-nagement.2019:1963-1972. [14]ZHENG G,ZANG X,XU N,et al.Diagnosing reinforcementlearning for traffic signal control[J].arXiv:1905.04716,2019. [15]GRONAUER S,DIEPOLDK.Multi-agent deep reinforcementlearning:a survey[J].Artificial Intelligence Review,2022,55(2):895-943. [16]TAMPUU A,MATIISEN T,KODELJA D,et al.Multiagentcooperation and competition with deep reinforcement learning[J].PloS One,2017,12(4):e0172395.https://doi.org/10.1371/journal.pone.0172395. [17]ZHANG K,YANG Z,BAŞAR T.Multi-agent reinforcementlearning:A selective overview of theories and algorithms[J].Handbook of Reinforcement Learning and Control,2021,325(2):321-384. [18]XIONG Y,ZHENG G,XU K,et al.Learning traffic signal control from demonstrations[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management.2019:2289-2292. [19]FOERSTER J,ASSAEL I A,DE FREITAS N,et al.Learning to communicate with deep multi-agent reinforcement learning[C]//Advances in Neural Information Processing Systems.2016:2145-2153. [20]TAN M.Multi-agent reinforcement learning:Independent vs.cooperative agents[C]//Proceedings of the Tenth International Conference on Lachine learning.1993:330-337. [21]RASHID T,SAMVELYAN M,SCHROEDER C,et al.Qmix:Monotonic value function factorisation for deep multi-agent reinforcement learning[C]//International Conference on Machine Learning.PMLR,2018:4295-4304. [22]IQBAL S,DE WITT CAS,PENG B,et al.Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning[C]//International Conference on Machine Learning.PMLR,2021:4596-4606. [23]PANDEY D,PANDEY P.Approximate Q-learning:An intro-duction[C]//2010 Second International Conference on Machine Learning and Computing.IEEE,2010:317-320. [24]ARULKUMARAN K,DEISENROTH M P,BRUNDAGE M,et al.Deep reinforcement learning:A brief survey[J].IEEE Signal Processing Magazine,2017,34(6):26-38. [25]LEI L,TAN Y,ZHENG K,et al.Deep reinforcement learning for autonomous internet of things:Model,applications and challenges[J].IEEE Communications Surveys & Tutorials,2020,22(3):1722-1760. [26]MNIH V,KAVUKCUOGLU K,SILVER D,et al.Human-level control through deep reinforcement learning[J].Nature,2015,518(7540):529-533. [27]LIU M,DENG J,XU M,et al.Cooperative deep reinforcement learning for traffic signal control[C]//23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining(KDD).Halifax.2017. [28]SCHUTERA M,GOBY N,SMOLAREK S,et al.Distributedtraffic light control at uncoupled intersections with real-world topology by deep reinforcement learning[C]//32nd Conference on Neural Information Processing Systems,within Workshop on Machine Learning for Intelligent Transportation Systems.Canada,2018:1-9. [29]LIU X Y,DING Z,BORST S,et al.Deep reinforcement lear-ning for intelligent transportation systems[C]//32nd Confe-rence on Neural Information Processing Systems.Canada,2018. [30]PUTERMAN M L.Markov decision processes:discrete stochastic dynamic programming[M].John Wiley & Sons,2014. [31]TAN T,BAO F,DENG Y,et al.Cooperative deep reinforcement learning for large-scale traffic grid signal control[J].IEEE Transactions on Cybernetics,2019,50(6):2687-2700. [32]WU T,ZHOU P,LIU K,et al.Multi-agent deep reinforcement learning for urban traffic light control in vehicular networks[J].IEEE Transactions on Vehicular Technology,2020,69(8):8243-8256. [33]ZHAO T,WANG P,LI S.Traffic Signal Control with Deep Reinforcement Learning[C]//2019 International Conference on Intelligent Computing,Automation and Systems(ICICAS).IEEE,2019:763-767. [34]ZHANG R,ISHIKAWA A,WANG W,et al.Using reinforce-ment learning with partial vehicle detection for intelligent traffic signal control[J].IEEE Transactions on Intelligent Transportation Systems,2020,22(1):404-415. [35]CHU T,WANG J,CODECÀ L,et al.Multi-agent deep rein-forcement learning for large-scale traffic signal control[J].IEEE Transactions on Intelligent Transportation Systems,2019,21(3):1086-1095. [36]MOUSAVI S S,SCHUKAT M,HOWLEY E.Traffic light control using deep policy-gradient and value-function-based reinforcement learning[J].IET Intelligent Transport Systems,2017,11(7):417-423. [37]VAN DER POL E,OLIEHOEK F A.Coordinated deep reinforcement learners for traffic light control[C]//Proceedings of Learning,Inference and Control of Multi-agent Systems(at NIPS 2016).2016:1-8. [38]GONG Y,ABDEL-ATY M,CAI Q,et al.Decentralized network level adaptive signal control by multi-agent deep reinforcement learning[J].Transportation Research Interdisciplinary Perspectives,2019,1:100020. [39]WAN C H,HWANG M C.Value-based deep reinforcementlearning for adaptive isolated intersection signal control[J].IET Intelligent Transport Systems,2018,12(9):1005-1010. [40]ZENG J,HU J,ZHANG Y.Adaptive traffic signal control with deep recurrent Q-learning[C]//2018 IEEE Intelligent Vehicles Symposium(IV).IEEE,2018:1215-1220. [41]WEI H,CHEN C,WU K,et al.Deep reinforcement learning for traffic signal control along arterials[C]//Proceedings of the 2019.DRL4KDD,2019. [42]TAN K L,PODDAR S,SARKAR S,et al.Deep reinforcement lear-ning for adaptive traffic signal control[C]//Dynamic Systems and Control Conference.American Society of Mechanical Engineers,2019. [43]WATKINS C J C H,DAYAN P.Q-learning[J].Machine lear-ning,1992,8(3):279-292. [44]MNIH V,KAVUKCUOGLU K,SILVER D,et al.Human-level control through deep reinforcement learning[J].Nature,2015,518(7540):529-533. [45]PETERS J,SCHAAL S.Reinforcement learning of motor skills with policy gradients[J].Neural Networks,2008,21(4):682-697. [46]KONDA V,TSITSIKLIS J.Actor-critic algorithms[C]//Advances in Neural Information Processing Systems.1999:1008-1014. [47]LILLICRAP T P,HUNT J J,PRITZEL A,et al.Continuouscontrol with deep reinforcement learning[C]//International Conference on Learning Representations.American,2016. [48]MONAHAN G E.State of the art-a survey of partially obser-vable Markov decision processes:theory,models,and algorithms[J].Management Science,1982,28(1):1-16. [49]EREZ T,SMART W D.A scalable method for solving high-dimensional continuous POMDPs using local approximation[C]//Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence.California,2010. [50]RITCHER S.Traffic light scheduling using policy-gradient reinforcement learning[C]//The International Conference on Automated Planning and Scheduling.ICAPS,2007. [51]CHU T,QU S,WANG J.Large-scale traffic grid signal control with regional reinforcement learning[C]//2016 American Control Conference(ACC).IEEE,2016:815-820. [52]AZIZ H M A,ZHU F,UKKUSURI S V.Learning-based traffic signal controlalgorithms with neighborhood information sharing:An application for sustainable mobility[J].Journal of Intelligent Transportation Systems,2018,22(1):40-52. [53]HOCHREITER S,SCHMIDHUBER J.Long short-term memory[J].Neural Computation,1997,9(8):1735-1780. [54]TAN M.Multi-agent reinforcement learning:Independent vs.cooperative agents[C]//Proceedings of the Tenth International Conference on Machine Learning.1993:330-337. [55]FOERSTER J,NARDELLI N,FARQUHAR G,et al.Stabili-sing experience replay for deep multi-agent reinforcement lear-ning[C]//International Conference on Machine Learning.PMLR,2017:1146-1155. [56]GUESTRIN C,KOLLER D,PARR R.Multiagent planning with factored MDPs[C]//Advances in Neural Information Processing Systems.2001,14:1523-1530. [57]KOK J R,VLASSIS N.Collaborative multiagent reinforcement learning by payoff propagation[J].Journal of Machine Learning Research,2006,7(1):1789-1828. [58]CASAS N.Deep deterministic policy gradient for urban traffic light control[J].arXiv:1703.09035,2017. [59]WANG X,KE L,QIAO Z,et al.Large-scale traffic signal control using a novel multiagent reinforcement learning[J].IEEE Transactions on Cybernetics,2020,51(1):174-187. [60]LOPEZ P A,BEHRISCH M,BIEKER-WALZ L,et al.Microscopic traffic simulation using sumo[C]//2018 21st InternationalConference on Intelligent Transportation Systems(ITSC).IEEE,2018:2575-2582. [61]ZHANG H,FENG S,LIU C,et al.Cityflow:A multi-agent reinforcement learning environment for large scale city traffic scenario[C]//The World Wide Web Conference.2019:3620-3624. [62]FELLENDORF M,VORTISCH P.Microscopic traffic flow si-mulator VISSIM[M]//Fundamentals of Traffic Simulation.Springer,New York,NY,2010:63-93. [63]CAMERON G D B,DUNCAN G I D.PARAMICS-Parallel microscopic simulation of road traffic[J].The Journal of Supercomputing,1996,10(1):25-53. [64]GRAHAM B.Spatially-sparse convolutional neural networks[J].arXiv:1409.6070,2014. [65]HUANG D,OU J,XIAO H X,et al.Collaborative optimization of traffic signal lights and vehicle fleet trajectory at intersection[J].Journal of Chongqing University of Technology(Natural Science),2022,36(4):84-93. [66]ZHOU Y,LIU L,WANG L,et al.Service-aware 6G:An intelligent and open network based on the convergence of communication,computing and caching[J].Digital Communications and Networks,2020,6(3):253-256. |
[1] | XU Linling, ZHOU Yuan, HUANG Hongyun, LIU Yang. Real-time Trajectory Planning Algorithm Based on Collision Criticality and Deep Reinforcement Learning [J]. Computer Science, 2023, 50(3): 323-332. |
[2] | Cui ZHANG, En WANG, Funing YANG, Yong jian YANG , Nan JIANG. UAV Frequency-based Crowdsensing Using Grouping Multi-agentDeep Reinforcement Learning [J]. Computer Science, 2023, 50(2): 57-68. |
[3] | HUANG Yuzhou, WANG Lisong, QIN Xiaolin. Bi-level Path Planning Method for Unmanned Vehicle Based on Deep Reinforcement Learning [J]. Computer Science, 2023, 50(1): 194-204. |
[4] | RONG Huan, QIAN Minfeng, MA Tinghuai, SUN Shengjie. Novel Class Reasoning Model Towards Covered Area in Given Image Based on InformedKnowledge Graph Reasoning and Multi-agent Collaboration [J]. Computer Science, 2023, 50(1): 243-252. |
[5] | ZHANG Qiyang, CHEN Xiliang, ZHANG Qiao. Sparse Reward Exploration Method Based on Trajectory Perception [J]. Computer Science, 2023, 50(1): 262-269. |
[6] | WEI Nan, WEI Xianglin, FAN Jianhua, XUE Yu, HU Yongyang. Backdoor Attack Against Deep Reinforcement Learning-based Spectrum Access Model [J]. Computer Science, 2023, 50(1): 351-361. |
[7] | SHI Dian-xi, ZHAO Chen-ran, ZHANG Yao-wen, YANG Shao-wu, ZHANG Yong-jun. Adaptive Reward Method for End-to-End Cooperation Based on Multi-agent Reinforcement Learning [J]. Computer Science, 2022, 49(8): 247-256. |
[8] | YU Bin, LI Xue-hua, PAN Chun-yu, LI Na. Edge-Cloud Collaborative Resource Allocation Algorithm Based on Deep Reinforcement Learning [J]. Computer Science, 2022, 49(7): 248-253. |
[9] | LI Meng-fei, MAO Ying-chi, TU Zi-jian, WANG Xuan, XU Shu-fang. Server-reliability Task Offloading Strategy Based on Deep Deterministic Policy Gradient [J]. Computer Science, 2022, 49(7): 271-279. |
[10] | XIE Wan-cheng, LI Bin, DAI Yue-yue. PPO Based Task Offloading Scheme in Aerial Reconfigurable Intelligent Surface-assisted Edge Computing [J]. Computer Science, 2022, 49(6): 3-11. |
[11] | HONG Zhi-li, LAI Jun, CAO Lei, CHEN Xi-liang, XU Zhi-xiong. Study on Intelligent Recommendation Method of Dueling Network Reinforcement Learning Based on Regret Exploration [J]. Computer Science, 2022, 49(6): 149-157. |
[12] | WANG Qi, WANG Gang-qiao, CHEN Yong-qiang, LIU Yi. Integrated Modeling Method and Application System for Social Computing [J]. Computer Science, 2022, 49(4): 25-29. |
[13] | LI Peng, YI Xiu-wen, QI De-kang, DUAN Zhe-wen, LI Tian-rui. Heating Strategy Optimization Method Based on Deep Learning [J]. Computer Science, 2022, 49(4): 263-268. |
[14] | OUYANG Zhuo, ZHOU Si-yuan, LYU Yong, TAN Guo-ping, ZHANG Yue, XIANG Liang-liang. DRL-based Vehicle Control Strategy for Signal-free Intersections [J]. Computer Science, 2022, 49(3): 46-51. |
[15] | ZHU Di-di, WU Chao. Cooperation and Confrontation in Crowd Intelligence [J]. Computer Science, 2022, 49(11A): 210900249-7. |
|