Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning

Computer Science, 2024, Vol. 51, Issue (6A): 230800078-7. doi: 10.11896/jsjkx.230800078

• Computer Software & Architecture •

WANG Shuanqi¹, ZHAO Jianxin², LIU Chi², WU Wei¹, LIU Zhao¹

1 Information Center of China North Industries Group Corporation, Beijing 100089, China
2 School of Computer Science, Beijing Institute of Technology, Beijing 100081, China

Published: 2024-06-06
Corresponding author: WANG Shuanqi (93660036@qq.com)
About author: WANG Shuanqi, born in 1984, Ph.D., senior engineer. His main research interests include software test verification and vulnerability mining.
Supported by: Large-scale Industrial Software Research and Development Project (ZQ2020D204007)

Abstract

Abstract: Vulnerability mining is a major research direction in computer software security, and fuzz testing is one of its important dynamic techniques. In binary code vulnerability mining, the large volume of assembly code makes detection difficult and time-consuming, and fuzz testing itself is inefficient. To address these problems, a binary code fuzz testing method based on deep reinforcement learning is proposed. First, the fuzz testing process is modeled as a multi-step Markov decision process suited to reinforcement learning, and a deep reinforcement learning model is built to assist the selection of mutation strategies, so that the mutation strategy is optimized dynamically. Second, a binary code fuzz testing platform based on deep reinforcement learning is designed and built: AFL provides the fuzz testing environment, while the Keras-RL2 library and the OpenAI Gym framework implement the deep reinforcement learning algorithm and the reinforcement learning environment. Finally, the effectiveness and applicability of the proposed method and platform are verified experimentally. The results show that the deep reinforcement learning model helps the fuzz testing process cover more paths quickly and expose more vulnerabilities and defects, significantly improving the efficiency of binary code vulnerability mining and localization.

Key words: Binary code, Vulnerability mining, Fuzz testing, Deep reinforcement learning, Testing platform
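
The abstract's formulation (fuzzing as a multi-step decision process in which an agent picks a mutation strategy at each step) can be illustrated with a small sketch. This is not the authors' platform: the target program, the three mutation operators, the coverage-based reward, and the names `toy_target`, `FuzzEnv`, and `run_episode` are all hypothetical, and the deep model is replaced by a plain epsilon-greedy value table. The environment only mirrors the classic Gym `reset`/`step` convention; it does not import Gym, AFL, or Keras-RL2.

```python
import random

def toy_target(data):
    """Hypothetical stand-in for an instrumented binary: input -> set of 'edges' hit."""
    return {(i % 4, b >> 6) for i, b in enumerate(data)}

# Hypothetical mutation operators (the action space of the decision process).
def bit_flip(data, rng):
    if not data:
        return data
    out = bytearray(data)
    pos = rng.randrange(len(out))
    out[pos] ^= 1 << rng.randrange(8)
    return bytes(out)

def byte_set(data, rng):
    if not data:
        return data
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)

def byte_insert(data, rng):
    pos = rng.randrange(len(data) + 1)
    return data[:pos] + bytes([rng.randrange(256)]) + data[pos:]

MUTATORS = [bit_flip, byte_set, byte_insert]

class FuzzEnv:
    """Gym-style environment: action = mutator index, reward = newly covered edges."""

    def __init__(self, seed_input, max_steps=32, seed=0):
        self.seed_input = seed_input
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def _obs(self):
        # Assumed state: input length and number of edges covered so far.
        return (len(self.data), len(self.covered))

    def reset(self):
        self.data = self.seed_input
        self.covered = set(toy_target(self.data))
        self.steps = 0
        return self._obs()

    def step(self, action):
        self.data = MUTATORS[action](self.data, self.rng)
        new_edges = toy_target(self.data) - self.covered
        self.covered |= new_edges
        self.steps += 1
        done = self.steps >= self.max_steps
        return self._obs(), len(new_edges), done, {"coverage": len(self.covered)}

def run_episode(env, q, counts, rng, epsilon=0.2):
    """Epsilon-greedy mutator selection; q holds running mean rewards per mutator."""
    env.reset()
    total, done = 0, False
    while not done:
        if rng.random() < epsilon:
            action = rng.randrange(len(MUTATORS))
        else:
            action = max(range(len(MUTATORS)), key=lambda i: q[i])
        _, reward, done, _ = env.step(action)
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]
        total += reward
    return total, len(env.covered)

if __name__ == "__main__":
    q, counts = [0.0] * len(MUTATORS), [0] * len(MUTATORS)
    agent_rng = random.Random(1)
    for episode in range(5):
        total, coverage = run_episode(FuzzEnv(b"AAAA", seed=episode), q, counts, agent_rng)
        print(episode, total, coverage, [round(v, 2) for v in q])
```

In a setup like the one the abstract describes, `toy_target` would presumably be replaced by coverage feedback from AFL and the value table by a neural network agent; the loop structure (observe, choose a mutation, receive a coverage-based reward) is the part this sketch is meant to show.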

CLC Number: TP311

WANG Shuanqi, ZHAO Jianxin, LIU Chi, WU Wei, LIU Zhao. Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning[J]. Computer Science, 2024, 51(6A): 230800078-7. https://doi.org/10.11896/jsjkx.230800078
