Computer Science, 2014, Vol. 41, Issue (11): 203-207. doi: 10.11896/j.issn.1002-137X.2014.11.040

• Information Security •

Research on DDoS Attack-defense Game Model Based on Q-learning

SHI Yun-fang, WU Dong-ying, LIU Sheng-li and GAO Xiang

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002 (all authors)
  • Online: 2018-11-14  Published: 2018-11-14
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61309007) and the Zhengzhou Science and Technology Innovation Team Project (10CXTD150).

Abstract: The DDoS attack-defense game under current conditions differs from earlier ones, so existing methods can neither effectively quantify the payoffs of the attacker and the defender nor dynamically adjust game strategies to maximize payoff. To address this problem, a DDoS attack-defense game model based on Q-learning is designed, and a model algorithm is proposed on that basis. First, the payoffs of the attacker and the defender are calculated with a network-entropy quantitative assessment method. Second, the attack-defense game within a single DDoS attack stage is studied as a matrix game. Finally, Q-learning is introduced into the game process to obtain the model algorithm, which dynamically adjusts attack and defense strategies according to learning outcomes in order to maximize payoff. Experimental results show that a defender adopting the model algorithm obtains a higher payoff, which demonstrates the usability and effectiveness of the algorithm.

Key words: DDoS attack-defense, Matrix game, Q-learning, Network entropy, Nash equilibrium
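
The abstract combines a matrix game for a single attack stage with Q-learning that adjusts strategy from observed payoffs. The following is a minimal Python sketch of those two ingredients, not the authors' algorithm. Everything in it is an assumption made for illustration: the payoff matrix `PAYOFF`, the fixed mixed strategy of the attacker (`attacker_probs`), the single-state setup, and the learning parameters. In the paper, payoffs come from a network-entropy assessment, which is not reproduced here.

```python
import random

# Hypothetical defender payoffs: rows = defender actions, columns = attacker actions.
PAYOFF = [
    [3.0, -1.0],
    [-2.0, 2.0],
]

def maximin_pure(payoff):
    """Return (row, value) of the defender's best pure security strategy."""
    worst = [min(row) for row in payoff]            # worst case for each row
    best = max(range(len(payoff)), key=lambda i: worst[i])
    return best, worst[best]

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]

def train(episodes=5000, epsilon=0.1, attacker_probs=(0.8, 0.2), seed=0):
    """Defender learns against an attacker playing a fixed mixed strategy (assumed)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0]]  # one state (a single attack stage), two defender actions
    for _ in range(episodes):
        # Epsilon-greedy choice of the defender action.
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = max(range(2), key=lambda i: q[0][i])
        # Attacker samples its action from the assumed mixed strategy.
        b = 0 if rng.random() < attacker_probs[0] else 1
        # gamma=0.0 because this single-stage sketch has no successor state value.
        q_update(q, 0, a, PAYOFF[a][b], 0, gamma=0.0)
    return q
```

With these assumed numbers, `maximin_pure(PAYOFF)` returns `(0, -1.0)`, and the expected payoffs against the assumed attacker are 2.2 for action 0 and -1.2 for action 1, so the Q-value learned by `train()` for action 0 should end up the higher of the two. A multi-stage model would need one row of `q` per state and a nonzero `gamma`.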

