Computer Science ›› 2024, Vol. 51 ›› Issue (1): 316-326. doi: 10.11896/jsjkx.230300209

• Information Security •

  • Corresponding author: WU Zehui (wuzehui2010@foxmail.com)
  • Author contact: (891489656@qq.com)

Survey of Vulnerability Benchmark Construction Techniques

MA Zongshuai, WU Zehui, YAN Chenyu, WEI Qiang   

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China
  • Received: 2023-03-27  Revised: 2023-07-28  Online: 2024-01-15  Published: 2024-01-12
  • About author: MA Zongshuai, born in 1999, postgraduate. His main research interest is software security analysis.
    WU Zehui, born in 1988, Ph.D. His main research interest is software and system vulnerability analysis.
  • Supported by:
    National Key Research and Development Program of China (2019QY0501).


Abstract: The development of software vulnerability analysis has led to the widespread use of various techniques and tools for discovering vulnerabilities. Nevertheless, assessing the capability boundaries of these techniques, methods, and tools remains a fundamental unsolved problem in this field. Constructing a vulnerability benchmark for capability assessment plays a pivotal role in solving this problem. This paper reviews representative results on the construction of vulnerability benchmarks over the past 20 years. It first traces the development of vulnerability benchmarks from the perspective of automation. It then classifies benchmark construction techniques, presents a general process model for benchmark construction, and explains the ideas, workflows, and shortcomings of the different construction methods. Finally, it summarizes the limitations of current research and discusses directions for future work.
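Many benchmark construction techniques in this literature are injection-based: a synthetic bug with a known trigger is planted into an existing program, so the resulting benchmark has ground truth about where the bug lives and which input reaches it. The Python sketch below illustrates the general idea only; the bug template, guard pattern, and function names are hypothetical and not taken from any particular surveyed system.

```python
# Illustrative sketch of guarded bug injection for benchmark construction.
# The injected bug fires only when one input byte equals a chosen "magic"
# value, so the benchmark knows the exact triggering input (ground truth).

TEMPLATE_BUG = (
    "if (buf[{off}] == {magic}) {{\n"
    "    buf[len + 16] = 0; /* injected out-of-bounds write */\n"
    "}}\n"
)

def inject_bug(source_lines, line_no, off=0, magic=0x4C):
    """Insert a guarded synthetic bug before the given line of C source
    and return the patched source plus the known trigger byte."""
    bug = TEMPLATE_BUG.format(off=off, magic=magic)
    patched = list(source_lines)
    patched.insert(line_no, bug)
    trigger = bytes([magic])  # any input whose byte at `off` is `magic`
    return patched, trigger

src = [
    "void parse(char *buf, int len) {",
    "    /* ... original parsing code ... */",
    "}",
]
patched, trigger = inject_bug(src, 2)
```

A production injection system would operate on the program's AST or intermediate representation rather than raw text, and would derive guard values from realistic input corpora instead of a fixed magic byte, so that injected bugs resemble naturally occurring ones.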

Key words: Vulnerability benchmark, Software vulnerability analysis, Evaluation metrics

CLC number: TP309