Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning

doi:10.11896/jsjkx.230800078

Abstract

Vulnerability mining is a major research direction in computer software security, and fuzz testing is an important dynamic method within it. To address the long running time and low efficiency of fuzz testing caused by the large volume of assembly code, a novel binary code vulnerability mining technique based on deep reinforcement learning is proposed. The fuzz testing process is modeled as a multi-step Markov decision process suited to reinforcement learning, and a deep reinforcement learning model is built to dynamically optimize the selection of mutation strategies. A binary code fuzz testing platform based on deep reinforcement learning is then designed and built: AFL implements the fuzz testing environment, while the Keras RL2 library and the OpenAI Gym framework implement the deep reinforcement learning algorithm and the reinforcement learning environment. Finally, the effectiveness and applicability of the proposed method and testing platform are verified through experimental analysis. The experimental results show that the deep reinforcement learning model helps the fuzz testing process cover more paths quickly and expose more vulnerabilities and defects, significantly improving the efficiency of binary code vulnerability mining and localization.

Key words: Binary code, Vulnerability mining, Fuzz testing, Deep reinforcement learning, Testing platform

CLC Number: TP311

WANG Shuanqi, ZHAO Jianxin, LIU Chi, WU Wei, LIU Zhao. Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning[J]. Computer Science, 2024, 51(6A): 230800078-7.
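The abstract's formulation of fuzzing as a multi-step Markov decision process, in which an agent picks a mutation strategy at each step and is rewarded for newly covered paths, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the authors' platform: the three mutation operators, the `toy_target` function standing in for an AFL-instrumented binary, the `FuzzEnv` class (which only mimics the Gym `reset`/`step` convention without importing Gym), the choice of state and reward, and the use of tabular Q-learning in place of the deep reinforcement learning model.

```python
import random

# Hypothetical byte-level mutation operators (illustrative only).
def flip_bit(data, rng):
    out = bytearray(data)
    out[rng.randrange(len(out))] ^= 1 << rng.randrange(8)
    return bytes(out)

def set_random_byte(data, rng):
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)

def increment_byte(data, rng):
    out = bytearray(data)
    i = rng.randrange(len(out))
    out[i] = (out[i] + 1) % 256
    return bytes(out)

MUTATORS = [flip_bit, set_random_byte, increment_byte]

def toy_target(data):
    """Stand-in for an instrumented binary: returns the set of branch ids hit."""
    hit = {0}
    if data[0] == ord("F"):
        hit.add(1)
        if data[1] == ord("U"):
            hit.add(2)
            if data[2] == ord("Z"):
                hit.add(3)
    return hit

class FuzzEnv:
    """Gym-style environment: action = mutator index, reward = new branches."""

    def __init__(self, seed_input=b"AAAA", max_steps=50, seed=0):
        self.seed_input = seed_input
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        self.current = self.seed_input
        self.covered = set(toy_target(self.current))
        self.steps = 0
        return len(self.covered)  # state: number of branches covered so far

    def step(self, action):
        candidate = MUTATORS[action](self.current, self.rng)
        new_branches = toy_target(candidate) - self.covered
        if new_branches:
            # Keep inputs that reach new branches, as a coverage-guided fuzzer would.
            self.covered |= new_branches
            self.current = candidate
        self.steps += 1
        done = self.steps >= self.max_steps
        return len(self.covered), len(new_branches), done, {}

def train(env, episodes=200, alpha=0.1, gamma=0.9, epsilon=0.2, seed=1):
    """Tabular Q-learning with epsilon-greedy exploration (swapped in for a DQN)."""
    rng = random.Random(seed)
    q = {}
    n_actions = len(MUTATORS)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                values = [q.get((state, a), 0.0) for a in range(n_actions)]
                action = values.index(max(values))
            nxt, reward, done, _ = env.step(action)
            best_next = max(q.get((nxt, a), 0.0) for a in range(n_actions))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q

if __name__ == "__main__":
    q_table = train(FuzzEnv())
    for (state, action), value in sorted(q_table.items()):
        print(state, MUTATORS[action].__name__, round(value, 3))
```

In a setup closer to the one the abstract describes, `toy_target` would be replaced by running the target under AFL and reading its coverage information, and the Q-table by a neural network agent; the interface between agent and environment would stay the same.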
