Computer Science, 2023, Vol. 50, Issue (11A): 230200162-6. doi: 10.11896/jsjkx.230200162

• Information Security •

Implementation and Verification of Reinforcement Learning Strategy in Automated Red Teaming Testing

CHEN Yufei¹, LI Saifei¹, ZHANG Lijie², ZHAO Yue³

  1 College of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
  2 Norla Institute of Technical Physics, Chengdu 610041, China
  3 Science and Technology on Communication Security Laboratory, Chengdu 610041, China
  • Published: 2023-11-09
  • About author: CHEN Yufei, born in 1997, postgraduate. His main research interests include cyberspace security and reinforcement learning.
    LI Saifei, born in 1988, Ph.D., engineer. His main research interests include cyberspace security.
  • Supported by:
    Sichuan Science & Technology Planning Project (2021YJ0372), Sichuan Science & Technology Major Special Project (2019ZDZX0007, 2021YFQ0056), and Science and Technology on Communication Security Laboratory Foundation (61421030201022108).

Abstract: Red teaming testing evaluates the security of a network system by simulating the attack behavior of real hackers. However, manual testing currently suffers from high cost and poor adaptability. Making red teaming testing intelligent and automated is therefore an active research topic, with the aim of reducing testing cost and improving the performance and efficiency of cybersecurity assessments. The automated attack strategy is the core of automated red teaming testing: it is designed to replace security experts in deciding which attack technique to use. In this paper, red teaming attack techniques are mapped to reinforcement learning, the red teaming testing process is modeled as a Markov decision process, and both a fixed strategy and reinforcement learning strategies are implemented through a finite state machine. The reinforcement learning strategies are trained and tested in a real network environment to verify their convergence and feasibility. Experimental results show that the SARSA(λ) algorithm outperforms the other reinforcement learning algorithms and converges fastest. All three reinforcement learning strategies stably achieve the test objective in the test experiment, and they perform much better than the fixed strategy.
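
As a rough illustration of the kind of algorithm the abstract names, the sketch below runs tabular SARSA(λ) with replacing eligibility traces on a hypothetical three-stage attack chain modeled as a Markov decision process. The stage names, action names, rewards, and hyperparameters are all assumptions chosen for illustration; they are not the paper's environment, its technique mapping, or its finite-state-machine implementation.

```python
import random

# Hypothetical toy MDP (illustrative only): states are attack stages.
STATES = ["recon", "foothold", "escalated", "goal"]
ACTIONS = ["scan", "exploit", "escalate"]
# Assumed rule: exactly one action advances from each non-terminal stage.
ADVANCE = {0: 0, 1: 1, 2: 2}
GOAL = len(STATES) - 1


def step(state, action):
    """Return (next_state, reward, done) for the toy chain."""
    if ADVANCE[state] == action:
        nxt = state + 1
        done = nxt == GOAL
        return nxt, (10.0 if done else -1.0), done
    return state, -1.0, False  # wrong technique: stay put, pay a step cost


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def sarsa_lambda(episodes=500, alpha=0.1, gamma=0.9, lam=0.8,
                 epsilon=0.1, seed=0, max_steps=100):
    """Tabular SARSA(lambda) with replacing traces; returns the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STATES]
    for _ in range(episodes):
        trace = [[0.0] * len(ACTIONS) for _ in STATES]
        s = 0
        a = epsilon_greedy(q, s, epsilon, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, a)
            if done:
                a2 = None
                target = r
            else:
                # On-policy: bootstrap from the action actually chosen next.
                a2 = epsilon_greedy(q, s2, epsilon, rng)
                target = r + gamma * q[s2][a2]
            delta = target - q[s][a]
            trace[s][a] = 1.0  # replacing trace for the visited pair
            for i in range(len(STATES)):
                for j in range(len(ACTIONS)):
                    q[i][j] += alpha * delta * trace[i][j]
                    trace[i][j] *= gamma * lam  # decay all traces
            if done:
                break
            s, a = s2, a2
    return q


def greedy_policy(q):
    """Greedy action name for each non-terminal stage."""
    return [ACTIONS[row.index(max(row))] for row in q[:GOAL]]


if __name__ == "__main__":
    print(greedy_policy(sarsa_lambda()))
```

The eligibility trace lets the reward for reaching the final stage update earlier stage-action pairs within the same episode, which is the usual intuition for why SARSA(λ) can converge faster than one-step methods; whether that explains the paper's result is not stated in the abstract.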

Key words: Cybersecurity, Red teaming, Automated attack strategy, Penetration testing, Reinforcement learning

CLC Number: TP393