Computer Science, 2021, Vol. 48, Issue (7): 40-46. doi: 10.11896/jsjkx.210400057
Special Issue: Artificial Intelligence Security
ZHOU Shi-cheng, LIU Jing-ju, ZHONG Xiao-feng, LU Can-ju