Computer Science ›› 2021, Vol. 48 ›› Issue (9): 235-243.doi: 10.11896/jsjkx.201000084
• Artificial Intelligence •
DAI Shan-shan1, LIU Quan1,2,3,4
[1] ORT T,PAULL L,RUS D.Autonomous vehicle navigation in rural environments without detailed prior maps[C]//2018 IEEE International Conference on Robotics and Automation (ICRA).IEEE,2018:2040-2047.
[2] PENDLETON S D,ANDERSEN H,DU X X,et al.Perception,Planning,Control,and Coordination for Autonomous Vehicles[J].Machines,2017,5(1):6.
[3] CAPORALE D,SETTIMI A,MASSA F,et al.Towards the Design of Robotic Drivers for Full-Scale Self-Driving Racing Cars[C]//2019 International Conference on Robotics and Automation (ICRA).IEEE,2019:5643-5649.
[4] ZHUANG L,ZHANG Z,WANG L.The automatic segmentation of residential solar panels based on satellite images:A cross learning driven U-Net method[J].Applied Soft Computing,2020,92:106283.
[5] VEDDER B,SVENSSON B J,VINTER J,et al.Automated Testing of Ultrawideband Positioning for Autonomous Driving[J].Journal of Robotics,2020,2020:1-15.
[6] BOJARSKI M,DEL TESTA D,DWORAKOWSKI D,et al.End to End Learning for Self-Driving Cars[J].arXiv:1604.07316,2016.
[7] XU H,GAO Y,YU F,et al.End-to-End Learning of Driving Models from Large-Scale Video Datasets[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).IEEE,2017:2174-2182.
[8] CHEN L,WANG Q,LU X,et al.Learning Driving Models From Parallel End-to-End Driving Data Set[J].Proceedings of the IEEE,2020,108(2):262-273.
[9] CODEVILLA F,MULLER M.End-to-end driving via conditional imitation learning[C]//2018 IEEE International Conference on Robotics and Automation (ICRA).IEEE,2018:4693-4700.
[10] SUTTON R S,BARTO A G.Reinforcement Learning:An Introduction[M].Cambridge:MIT Press,1998.
[11] JARITZ M,DE CHARETTE R,TOROMANOFF M,et al.End-to-End Race Driving with Deep Reinforcement Learning[C]//2018 IEEE International Conference on Robotics and Automation (ICRA).IEEE,2018:2070-2075.
[12] KENDALL A,HAWKE J,JANZ D,et al.Learning to Drive in a Day[C]//2019 International Conference on Robotics and Automation (ICRA).IEEE,2019:8248-8254.
[13] TOROMANOFF M,WIRBEL E,MOUTARDE F.End-to-End Model-Free Reinforcement Learning for Urban Driving Using Implicit Affordances[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).2020:7153-7162.
[14] CHEN S,WANG M,SONG W,et al.Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving[J].IEEE Transactions on Vehicular Technology,2020,69(5):4740-4750.
[15] MNIH V,KAVUKCUOGLU K,SILVER D,et al.Human-level control through deep reinforcement learning[J].Nature,2015,518(7540):529-533.
[16] HAARNOJA T,ZHOU A,ABBEEL P,et al.Soft Actor-Critic:Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor[C]//International Conference on Machine Learning (ICML).2018.
[17] SHI W,SONG S,WU C.Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning[C]//International Joint Conference on Artificial Intelligence (IJCAI).2019.
[18] ZHU F,WU W,FU Y C,et al.A safe deep reinforcement learning method based on double deep networks[J].Chinese Journal of Computers,2019,42(8).
[19] GARCÍA J,FERNÁNDEZ F.A comprehensive survey on safe reinforcement learning[J].Journal of Machine Learning Research,2015,16(1):1437-1480.
[20] GARCÍA J,FERNÁNDEZ F.Safe Exploration of State and Action Spaces in Reinforcement Learning[J].Journal of Artificial Intelligence Research,2014,45(1).
[21] BERKENKAMP F,TURCHETTA M,SCHOELLIG A P,et al.Safe model-based reinforcement learning with stability guarantees[J].arXiv:1705.08551,2017.
[22] MAZUMDER S,LIU B,WANG S,et al.Action permissibility in deep reinforcement learning and application to autonomous driving[C]//KDD'18 Deep Learning Day.2018.
[23] LIU Q,ZHAI J W,ZHANG Z,et al.A survey on deep reinforcement learning[J].Chinese Journal of Computers,2018,41(1):1-27.
[24] LEE K,SAIGOL K,THEODOROU E A.Early Failure Detection of Deep End-to-End Control Policy by Reinforcement Learning[C]//2019 International Conference on Robotics and Automation (ICRA).IEEE,2019.
[25] FUJIMOTO S,HOOF H,MEGER D.Addressing function approximation error in actor-critic methods[C]//International Conference on Machine Learning.PMLR,2018:1587-1596.
[26] ZIEBART B D,MAAS A L,BAGNELL J A,et al.Maximum entropy inverse reinforcement learning[C]//AAAI Conference on Artificial Intelligence (AAAI).2008:1433-1438.
[27] LEVINE S,FINN C,DARRELL T,et al.End-to-End Training of Deep Visuomotor Policies[J].Journal of Machine Learning Research,2016,17(1):1334-1373.
[28] O'DONOGHUE B,MUNOS R,KAVUKCUOGLU K,et al.PGQ:Combining policy gradient and Q-learning[J].arXiv:1611.01626,2016.
[29] NACHUM O,NOROUZI M,XU K,et al.Bridging the gap between value and policy based reinforcement learning[C]//Advances in Neural Information Processing Systems (NIPS).2017:2772-2782.
[30] HAARNOJA T,TANG H,ABBEEL P,et al.Reinforcement learning with deep energy-based policies[C]//International Conference on Machine Learning (ICML).2017:1352-1361.
[31] MINK J W.The basal ganglia:focused selection and inhibition of competing motor programs[J].Progress in Neurobiology,1996,50(4):381-425.
[32] LIPTON Z C,AZIZZADENESHELI K,KUMAR A,et al.Combating reinforcement learning's sisyphean curse with intrinsic fear[J].arXiv:1611.01211,2016.
[33] AGARWAL A,ABHINAU K V,DUNOVAN K,et al.Better Safe than Sorry:Evidence Accumulation Allows for Safe Reinforcement Learning[J].arXiv:1809.09147,2018.
[34] REN J,MCISAAC K A,PATEL R V,et al.A potential field model using generalized sigmoid functions[J].IEEE Transactions on Systems,Man,and Cybernetics,Part B (Cybernetics),2007,37(2):477-484.
[35] GOMES G S,LUDERMIR T B.Complementary log-log and probit:activation functions implemented in artificial neural networks[C]//2008 Eighth International Conference on Hybrid Intelligent Systems.IEEE,2008:939-942.
[36] SCHULMAN J,ABBEEL P,CHEN X.Equivalence between policy gradients and soft Q-learning[J].arXiv:1704.06440,2017.
[37] CHEN Z,HUANG X.End-to-end learning for lane keeping of self-driving cars[C]//2017 IEEE Intelligent Vehicles Symposium (IV).IEEE,2017.
[38] BADRINARAYANAN V,KENDALL A,CIPOLLA R.SegNet:A deep convolutional encoder-decoder architecture for image segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(12):2481-2495.
[39] SILVER D,HUANG A,MADDISON C J,et al.Mastering the game of Go with deep neural networks and tree search[J].Nature,2016,529(7587):484-489.
[40] SILVER D,HUBERT T,SCHRITTWIESER J,et al.Mastering chess and shogi by self-play with a general reinforcement learning algorithm[J].arXiv:1712.01815,2017.
[1] YU Bin, LI Xue-hua, PAN Chun-yu, LI Na. Edge-Cloud Collaborative Resource Allocation Algorithm Based on Deep Reinforcement Learning [J]. Computer Science, 2022, 49(7): 248-253.
[2] LI Meng-fei, MAO Ying-chi, TU Zi-jian, WANG Xuan, XU Shu-fang. Server-reliability Task Offloading Strategy Based on Deep Deterministic Policy Gradient [J]. Computer Science, 2022, 49(7): 271-279.
[3] XIE Wan-cheng, LI Bin, DAI Yue-yue. PPO Based Task Offloading Scheme in Aerial Reconfigurable Intelligent Surface-assisted Edge Computing [J]. Computer Science, 2022, 49(6): 3-11.
[4] HONG Zhi-li, LAI Jun, CAO Lei, CHEN Xi-liang, XU Zhi-xiong. Study on Intelligent Recommendation Method of Dueling Network Reinforcement Learning Based on Regret Exploration [J]. Computer Science, 2022, 49(6): 149-157.
[5] ZHANG Jia-neng, LI Hui, WU Hao-lin, WANG Zhuang. Exploration and Exploitation Balanced Experience Replay [J]. Computer Science, 2022, 49(5): 179-185.
[6] LI Peng, YI Xiu-wen, QI De-kang, DUAN Zhe-wen, LI Tian-rui. Heating Strategy Optimization Method Based on Deep Learning [J]. Computer Science, 2022, 49(4): 263-268.
[7] OUYANG Zhuo, ZHOU Si-yuan, LYU Yong, TAN Guo-ping, ZHANG Yue, XIANG Liang-liang. DRL-based Vehicle Control Strategy for Signal-free Intersections [J]. Computer Science, 2022, 49(3): 46-51.
[8] CHENG Zhao-wei, SHEN Hang, WANG Yue, WANG Min, BAI Guang-wei. Deep Reinforcement Learning Based UAV Assisted SVC Video Multicast [J]. Computer Science, 2021, 48(9): 271-277.
[9] ZHOU Shi-cheng, LIU Jing-ju, ZHONG Xiao-feng, LU Can-ju. Intelligent Penetration Testing Path Discovery Based on Deep Reinforcement Learning [J]. Computer Science, 2021, 48(7): 40-46.
[10] LI Bei-bei, SONG Jia-rui, DU Qing-yun, HE Jun-jiang. DRL-IDS: Deep Reinforcement Learning Based Intrusion Detection System for Industrial Internet of Things [J]. Computer Science, 2021, 48(7): 47-54.
[11] LIANG Jun-bin, ZHANG Hai-han, JIANG Chan, WANG Tian-shu. Research Progress of Task Offloading Based on Deep Reinforcement Learning in Mobile Edge Computing [J]. Computer Science, 2021, 48(7): 316-323.
[12] WANG Ying-kai, WANG Qing-shan. Reinforcement Learning Based Energy Allocation Strategy for Multi-access Wireless Communications with Energy Harvesting [J]. Computer Science, 2021, 48(7): 333-339.
[13] FAN Yan-fang, YUAN Shuang, CAI Ying, CHEN Ruo-yu. Deep Reinforcement Learning-based Collaborative Computation Offloading Scheme in Vehicular Edge Computing [J]. Computer Science, 2021, 48(5): 270-276.
[14] FAN Jia-kuan, WANG Hao-yue, ZHAO Sheng-yu, ZHOU Tian-yi, WANG Wei. Data-driven Methods for Quantitative Assessment and Enhancement of Open Source Contributions [J]. Computer Science, 2021, 48(5): 45-50.
[15] HUANG Zhi-yong, WU Hao-lin, WANG Zhuang, LI Hui. DQN Algorithm Based on Averaged Neural Network Parameters [J]. Computer Science, 2021, 48(4): 223-228.