Multi-stage Game Based Dynamic Deployment Mechanism of Virtualized Honeypots

Computer Science, 2021, Vol. 48, Issue (10): 294-300. doi: 10.11896/jsjkx.210500071

GAO Ya-zhuo, LIU Ya-qun, ZHANG Guo-min, XING Chang-you, WANG Xiu-lei

College of Command & Control Engineering, Army Engineering University of PLA, Nanjing 210007, China

Received: 2021-05-12 Revised: 2021-08-12 Online: 2021-10-15 Published: 2021-10-18
Corresponding author: XING Chang-you (changyouxing@126.com)
Author e-mail: gyz9396@163.com
About author: GAO Ya-zhuo, born in 1998, master's degree. Her main research interests include cyberspace security.
XING Chang-you, born in 1982, Ph.D., associate professor. His main research interests include software defined networks and network measurement.
Supported by:
National Natural Science Foundation of China (61379149, 61772271) and Youth Fund of the Natural Science Foundation of Jiangsu Province (SBK2020043435).

Abstract

As an important deception-based defense technique, honeypots play a significant role in strengthening active network defense. However, most existing honeypots are deployed statically, which makes it difficult to respond efficiently to attackers' strategic probing attacks. To address this, we combine the complete-information static game with the Markov decision process and propose HoneyVDep, a dynamic deployment mechanism for virtualized honeypots based on a multi-stage stochastic game. Taking into account the multi-stage, sustained confrontation between attacker and defender, HoneyVDep aims to maximize the defender's comprehensive payoff under resource constraints. It establishes a honeypot deployment optimization model based on the multi-stage stochastic game and implements a Q-learning based solution algorithm that responds quickly to the attacker's strategic probing behavior. Finally, we implement a scalable prototype system based on software defined networking and virtualized containers to validate HoneyVDep. Experimental results show that HoneyVDep can effectively generate honeypot deployment strategies according to the characteristics of the attacker's behavior, improving the trapping rate and reducing the deployment cost.

Key words: Container, Deep reinforcement learning, Multi-stage game, Software defined network, Virtualized honeypot
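
As a rough illustration of the Q-learning idea mentioned in the abstract, the sketch below trains a tabular Q-learning defender that chooses how many honeypots to deploy at each attack stage. It is not the HoneyVDep model: the stage set, payoff values, cost, and the uniformly random attacker are all hypothetical assumptions made for this example, and the attacker here is a fixed random prober rather than a strategic player in a stochastic game.

```python
import random

# All constants below are hypothetical values chosen for illustration only.
N_STAGES = 3                    # assumed attack stages: 0 scanning, 1 foothold, 2 lateral movement
MAX_HONEYPOTS = 4               # defender may deploy 0..4 honeypot containers
REAL_HOSTS = 4                  # real hosts the attacker might probe
TRAP_GAIN = 10.0                # defender payoff when the attacker hits a honeypot
STAGE_LOSS = [2.0, 6.0, 12.0]   # defender loss when a real host is hit, by stage
UNIT_COST = 1.0                 # deployment cost per honeypot


def step(stage, honeypots, rng):
    """Simulate one attack round and return (reward, next_stage)."""
    # The attacker probes one target uniformly at random among all hosts.
    trapped = rng.random() < honeypots / (honeypots + REAL_HOSTS)
    if trapped:
        # A trapped attacker is assumed to restart from the first stage.
        return TRAP_GAIN - UNIT_COST * honeypots, 0
    # Otherwise the attack advances one stage (capped at the last stage).
    return -STAGE_LOSS[stage] - UNIT_COST * honeypots, min(stage + 1, N_STAGES - 1)


def train(episodes=5000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy defender."""
    rng = random.Random(seed)
    q = [[0.0] * (MAX_HONEYPOTS + 1) for _ in range(N_STAGES)]
    for _ in range(episodes):
        stage = 0
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(MAX_HONEYPOTS + 1)  # explore
            else:
                action = max(range(MAX_HONEYPOTS + 1), key=lambda a: q[stage][a])  # exploit
            reward, nxt = step(stage, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            q[stage][action] += alpha * (reward + gamma * max(q[nxt]) - q[stage][action])
            stage = nxt
    return q


def greedy_policy(q):
    """Return the highest-valued honeypot count for each stage."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learned policy is a list with one honeypot count per stage. Because the per-honeypot cost is subtracted from every reward, the toy defender trades trapping probability against deployment cost, which loosely mirrors the resource-constrained objective described in the abstract.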

CLC Number: TP393.00

GAO Ya-zhuo, LIU Ya-qun, ZHANG Guo-min, XING Chang-you, WANG Xiu-lei. Multi-stage Game Based Dynamic Deployment Mechanism of Virtualized Honeypots[J]. Computer Science, 2021, 48(10): 294-300. https://doi.org/10.11896/jsjkx.210500071
