Computer Science ›› 2021, Vol. 48 ›› Issue (5): 301-307. doi: 10.11896/jsjkx.200800174

• Information Security •

Team Cooperative Attack Planning Based on Multi-agent Joint Decision

ZHOU Tian-yang, ZENG Zi-yi, ZANG Yi-chao, WANG Qing-xian

  1. Information Engineering University, Zhengzhou 450001, China
    National Digital Switching System Engineering & Technological Research Center, Zhengzhou 450001, China
  • Received: 2020-08-27 Revised: 2020-10-30 Online: 2021-05-15 Published: 2021-05-09
  • Corresponding author: ZHOU Tian-yang (aipteamzhouty@aliyun.com)
  • About author: ZHOU Tian-yang, born in 1979, associate professor. His main research interests include software vulnerability analysis, virtualization-based security technology and application, penetration testing, fundamental study of network modeling and simulation, and cyber security assessment.

Abstract: Automated penetration testing can greatly reduce the cost of penetration testing by automating the manual search for possible attack paths. Existing methods mainly use a single agent to perform attack tasks, which leads to long execution times for attack actions and low penetration efficiency. If cooperative attack by multiple agents is considered, the state space of the overall planning problem grows exponentially, because the local state of each agent has multiple dimensions. To address these problems, a team cooperative attack planning method based on multi-agent joint decision is proposed. First, the multi-agent cooperative attack path planning problem is transformed into an attack target assignment problem under joint decision constraints, and a multi-agent centralized decision-making mode is established. Second, taking CDSO-CAP as the base model, the joint decision vector matrix (JDVM) is used to calculate the penetration attack reward, and a greedy strategy is used to search for the optimal attack targets of the multiple agents. The experimental results show that, compared with the single-agent planning method, the proposed method converges similarly but needs fewer execution rounds, so it is better suited to rapid attack planning in multi-target network scenarios.
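The target assignment step summarized in the abstract can be illustrated with a small sketch. The abstract does not give the internals of CDSO-CAP or the JDVM, so the code below is not the authors' algorithm: it assumes a plain reward matrix in which `reward[i][j]` is a hypothetical expected reward for agent `i` attacking target `j`, and the function name `greedy_assign` is invented for illustration. It only shows the general idea of a centralized decision maker greedily giving each agent a distinct target.

```python
from typing import Dict, List


def greedy_assign(reward: List[List[float]]) -> Dict[int, int]:
    """Greedily map agents to distinct targets by descending reward.

    reward[i][j] is an assumed (hypothetical) reward for agent i
    attacking target j; it stands in for whatever the real model computes.
    """
    # Flatten the matrix into (reward, agent, target) triples, best first.
    # Ties are broken by lower agent index, then lower target index.
    pairs = sorted(
        ((r, i, j) for i, row in enumerate(reward) for j, r in enumerate(row)),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    assignment: Dict[int, int] = {}
    taken = set()
    for _, agent, target in pairs:
        # Skip pairs whose agent is already busy or whose target is claimed.
        if agent in assignment or target in taken:
            continue
        assignment[agent] = target
        taken.add(target)
    return assignment


if __name__ == "__main__":
    # Two agents, three candidate targets.
    print(greedy_assign([[5, 3, 1], [4, 6, 2]]))
```

In this toy form the greedy rule is a heuristic: for the matrix `[[9, 8], [9, 1]]` it assigns agent 0 to target 0 and agent 1 to target 1 (total reward 10), although swapping the targets would yield 17. How the paper's joint decision constraints affect this behavior is not stated in the abstract.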

Key words: Agent, Attack planning, Automation, Joint decision, Penetration test, Team collaboration

CLC Number: TP393.08