Computer Science ›› 2021, Vol. 48 ›› Issue (10): 294-300. doi: 10.11896/jsjkx.210500071
高雅卓, 刘亚群, 张国敏, 邢长友, 王秀磊
GAO Ya-zhuo, LIU Ya-qun, ZHANG Guo-min, XING Chang-you, WANG Xiu-lei
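Abstract: As an important deception-based defense technique, honeypots play a significant role in strengthening active network defense. However, most existing honeypots are deployed statically and struggle to respond efficiently to behaviors such as an attacker's strategic probing. To address this, the paper combines complete-information static games with Markov decision processes and proposes HoneyVDep, a dynamic deployment mechanism for virtualized honeypots based on a multi-stage stochastic game. Reflecting the multi-stage, continuous confrontation between attacker and defender, HoneyVDep takes maximizing the defender's overall payoff under resource constraints as its objective, builds a honeypot deployment optimization model based on the multi-stage stochastic game, and implements a Q-learning-based solution algorithm so that it can respond quickly to strategic probing attacks. Finally, a scalable prototype system built on software-defined networking and virtualized containers is used to validate HoneyVDep. Experimental results show that HoneyVDep can effectively generate honeypot deployment strategies according to the characteristics of the attacker's behavior, raising the attacker trapping rate and reducing deployment cost.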
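The abstract names Q-learning as the solution method but does not give the update rule. As a rough illustration only, the standard tabular Q-learning update (Watkins and Dayan) can be sketched as follows. The state names, action names, reward values, and the helper functions `q_update` and `greedy_action` are all hypothetical and chosen for this sketch; they are not HoneyVDep's actual model, payoff function, or code.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Return the action with the highest learned value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy setting: the attacker keeps probing the web server, and the
# defender chooses where to place a single honeypot.
ACTIONS = ["deploy_on_web", "deploy_on_db"]
q_table = defaultdict(float)

for _ in range(50):
    # Assumed payoffs: trapping the attacker yields 1.0 minus a 0.2 deployment
    # cost; a misplaced honeypot only incurs the cost.
    q_update(q_table, "probe_web", "deploy_on_web", 0.8, "probe_web", ACTIONS)
    q_update(q_table, "probe_web", "deploy_on_db", -0.2, "probe_web", ACTIONS)

best = greedy_action(q_table, "probe_web", ACTIONS)
```

In this toy run the learned values favor placing the honeypot where the attacker probes, which mirrors the abstract's goal of adapting deployment to observed attack behavior while accounting for deployment cost.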