Computer Science, 2021, Vol. 48, Issue (4): 223-228. doi: 10.11896/jsjkx.200600177
• Artificial Intelligence •
HUANG Zhi-yong, WU Hao-lin, WANG Zhuang, LI Hui
[1] SUTTON R, BARTO A. Reinforcement learning: An introduction [M]. Massachusetts: MIT Press, 2018.
[2] GAO Y, ZHOU R I, WANG H, et al. Research on average reward reinforcement learning algorithm [J]. Chinese Journal of Computers, 2007, 30(8): 1372-1378.
[3] YANG W C, ZHANG L. Multi-agent reinforcement learning based traffic signal control for integrated urban network: survey of state of art [J]. Application Research of Computers, 2018, 35(6): 1613-1618.
[4] TAN M. Multi-agent reinforcement learning: independent vs. cooperative agents [C]//Proceedings of the 10th International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann Publishing, 1993: 487-494.
[5] WATKINS C J C H, DAYAN P. Q-learning [J]. Machine Learning, 1992, 8(3): 279-292.
[6] MABU S, HATAKEYAMA H, HIRASAWA K, et al. Genetic Network Programming with Reinforcement Learning Using Sarsa Algorithm [C]//IEEE Congress on Evolutionary Computation. IEEE, 2006.
[7] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning [J]. arXiv:1312.5602v1, 2013.
[8] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning [J]. Nature, 2015, 518(7540): 529-533.
[9] VAN HASSELT H, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning [C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2016: 2094-2100.
[10] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay [C]//Proceedings of the International Conference on Learning Representations. 2016: 53-73.
[11] WANG Z Y, SCHAUL T, HESSEL M, et al. Dueling network architectures for deep reinforcement learning [C]//Proceedings of the 33rd International Conference on Machine Learning. 2016: 1995-2003.
[12] OSBAND I, BLUNDELL C, PRITZEL A, et al. Deep exploration via bootstrapped DQN [J]. arXiv:1602.04621v3, 2016.
[13] PLAPPERT M, HOUTHOOFT R, DHARIWAL P, et al. Parameter space noise for exploration [J/OL]. https://arxiv.org/abs/1706.01905.
[14] LIU Q, YAN Y, ZHU F, et al. A deep recurrent Q-network with exploratory noise [J]. Chinese Journal of Computers, 2019(7): 1588-1604.
[15] YANG M, WANG J. A Bayesian deep reinforcement learning algorithm for solving deep exploration problems [J]. Journal of Frontiers of Computer Science and Technology, 2020, 14(2): 307-316.
[16] BELLEMARE M G, NADDAF Y, VENESS J, et al. The arcade learning environment: an evaluation platform for general agents [J]. Journal of Artificial Intelligence Research, 2013, 47: 253-279.
[17] CASTRO P S, MOITRA S, GELADA C, et al. Dopamine: a research framework for deep reinforcement learning [J/OL]. https://arxiv.org/abs/1812.06110.
[18] LIU Q, ZHAI J W, ZHONG S, et al. A deep recurrent Q-network based on visual attention mechanism [J]. Chinese Journal of Computers, 2017, 40(6): 1353-1366.
[19] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms [J/OL]. https://arxiv.org/abs/1707.06347.