Computer Science ›› 2023, Vol. 50 ›› Issue (7): 10-17. doi: 10.11896/jsjkx.220700128

• Computer Software •

Test Cases Generation Techniques for Root Cause Location of Fault

DU Hao, WANG Yunchao, YAN Chenyu, LI Xingwei

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China
  • Received: 2022-07-12 Revised: 2022-11-04 Online: 2023-07-15 Published: 2023-07-05
  • Corresponding author: WANG Yunchao (w_yunchao@sina.com)
  • Author e-mail: 504224090@qq.com
  • About author: DU Hao, born in 1997, postgraduate. His main research interests include reverse engineering and vulnerability mining. WANG Yunchao, born in 1992, Ph.D. His main research interests include reverse engineering and vulnerability mining and exploitation.
  • Supported by:
    National Key Research and Development Program of China (2019QY0501).

Abstract: Fault root cause localization is an important stage of software debugging, and spectrum-based fault localization is an active topic in automated debugging research. Its effectiveness, however, depends heavily on the quality of the test cases. Test inputs generalize poorly across different types of software, while randomly generated test inputs suffer from overfitting or too many confounding factors; the resulting analysis errors limit the scenarios in which such techniques can currently be applied. To address the test case generation problem, this paper proposes Dgenerate, a phased exploration method based on crash paths, and implements the prototype tool Dloc. First, binary instrumentation records path information in basic blocks while the program executes its inputs, and the original test inputs are classified as common or guided types according to this information. Next, a dynamic energy scheduling algorithm explores crash-related paths to generate high-quality test cases. Finally, the test cases are executed on the original program, execution information is traced, and statistical analysis locates the root cause of the program fault. Experiments on 15 real CVE vulnerabilities in six different types of software show that the test cases generated by Dloc improve localization efficiency by 75% on average compared with existing techniques, and that Dloc reports code fragments related to the fault root cause within the five highest-scoring positions with 87% accuracy, which verifies the feasibility and practicality of the proposed method.

Key words: Root cause location, Test case, Program spectrum, Statistical analysis, Directed greybox fuzzing
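
The final stage described in the abstract, ranking code locations by statistical analysis of program spectra, can be illustrated with a small sketch. The abstract does not state which statistic Dloc uses, so the Ochiai formula below is an assumption chosen because it is a widely used spectrum-based fault localization metric. The function names `ochiai_scores` and `top_k`, and the basic-block labels in the usage note, are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Set, Tuple


def ochiai_scores(crashing: List[Set[str]],
                  non_crashing: List[Set[str]]) -> Dict[str, float]:
    """Score each covered entity (e.g. a basic block) with the Ochiai metric.

    Each run is the set of entities it covered. Ochiai is an assumption here,
    not necessarily the statistic used by Dloc:
        score(e) = ef / sqrt(total_crashing * (ef + ep))
    where ef / ep count the crashing / non-crashing runs that cover e.
    """
    total_crashing = len(crashing)
    entities: Set[str] = set().union(*crashing, *non_crashing)
    scores: Dict[str, float] = {}
    for entity in entities:
        ef = sum(1 for run in crashing if entity in run)
        ep = sum(1 for run in non_crashing if entity in run)
        denom = sqrt(total_crashing * (ef + ep))
        scores[entity] = ef / denom if denom else 0.0
    return scores


def top_k(scores: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k highest-scoring entities, ties broken by name."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
```

As a usage illustration with made-up coverage data: given crashing runs `[{"bb1", "bb2", "bb4"}, {"bb1", "bb4"}]` and non-crashing runs `[{"bb1", "bb2"}, {"bb1", "bb3"}]`, the block `bb4` is covered by every crashing run and no non-crashing run, so it ranks first. A block covered by both kinds of run, such as `bb1`, scores lower, which is why test cases that differ from the crash only near the fault tend to sharpen the ranking.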

CLC Number:

  • TP311