Computer Science (计算机科学) ›› 2022, Vol. 49 ›› Issue (11): 39-48. doi: 10.11896/jsjkx.220200086
张大林1, 张哲玮2, 王楠1, 刘吉强1
ZHANG Da-lin1, ZHANG Zhe-wei2, WANG Nan1, LIU Ji-qiang1
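Abstract: Automatic test case generation aims to reduce testing cost and is more efficient than writing test cases by hand. Mainstream testing tools treat every file in a software project equally, yet in most cases defective files make up only a small fraction of the project. If testers can concentrate on the files that are more likely to contain defects, they can save substantial testing resources. To address this, the paper presents AutoUnit, a prediction-guided automated testing tool based on active learning. AutoUnit first runs defect prediction over all files in the pool of files under test, then generates test cases for the most "suspicious" file, feeds the actual test execution results back to the defect prediction model to update it, and finally uses the recall to decide whether to start another round of testing. AutoUnit can also stop prediction guidance in time when the total number of defective files is unknown, using a configurable target recall: it estimates the total number of defective files from the files already tested, computes the current recall, and decides whether to stop, which preserves testing efficiency. Experiments show that, to detect the same number of defective files, AutoUnit takes between 70.9% (best case) and 80.7% (worst case) of the time required by current mainstream testing tools. When the total number of defective files is unknown and the target recall is set to 95%, AutoUnit needs to inspect only 29.7% of the source code files to reach the same detection level as the latest version of Evosuite, and its testing time is only 34.6% of Evosuite's, greatly reducing testing cost and improving testing efficiency.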
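The four-step loop described in the abstract (predict, test the most suspicious file, feed the result back, check recall) can be sketched as follows. This is a minimal illustration under stated assumptions, not AutoUnit's implementation: `CentroidModel` is a toy stand-in for the defect predictor, `run_tests` is a hypothetical callback standing in for test generation and execution, and the total number of defective files is passed in directly rather than estimated from the files already tested as the paper does.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]


class CentroidModel:
    """Toy defect predictor (an assumption, NOT the model used in the paper)."""

    def __init__(self, dim: int) -> None:
        self.weights = [0.0] * dim

    def fit(self, labeled: Dict[str, bool], features: Dict[str, Features]) -> None:
        # weights = mean(features of defective files) - mean(features of clean files)
        dim = len(self.weights)
        sums = {True: [0.0] * dim, False: [0.0] * dim}
        counts = {True: 0, False: 0}
        for name, is_defective in labeled.items():
            counts[is_defective] += 1
            for i, value in enumerate(features[name]):
                sums[is_defective][i] += value
        self.weights = [
            (sums[True][i] / counts[True] if counts[True] else 0.0)
            - (sums[False][i] / counts[False] if counts[False] else 0.0)
            for i in range(dim)
        ]

    def score(self, x: Features) -> float:
        # Higher score means "more suspicious".
        return sum(w * v for w, v in zip(self.weights, x))


def prediction_guided_testing(
    features: Dict[str, Features],
    run_tests: Callable[[str], bool],  # hypothetical: True if tests expose a defect
    total_defective: int,              # assumed known here; the paper estimates it
    target_recall: float = 0.95,
) -> List[str]:
    """Return the files tested, in order, until the target recall is reached."""
    model = CentroidModel(len(next(iter(features.values()))))
    labeled: Dict[str, bool] = {}
    pool = list(features)
    found = 0
    # Step 4 (stopping rule): continue while recall = found / total is below target.
    while pool and found < target_recall * total_defective:
        # Step 1: score every untested file and pick the most suspicious one.
        suspect = max(pool, key=lambda name: model.score(features[name]))
        pool.remove(suspect)
        # Step 2: generate and execute tests for that file.
        is_defective = bool(run_tests(suspect))
        labeled[suspect] = is_defective
        found += int(is_defective)
        # Step 3: feed the observed result back and update the predictor.
        model.fit(labeled, features)
    return list(labeled)
```

For example, with four files described by two made-up features each, and `run_tests` replaced by a lookup in a known set of defective files, the loop stops as soon as both defective files have been found and never tests the remaining clean file:

```python
features = {
    "C.java": [0.0, 1.0],
    "A.java": [1.0, 0.0],
    "B.java": [0.9, 0.1],
    "D.java": [0.1, 0.9],
}
defective = {"A.java", "B.java"}
tested = prediction_guided_testing(
    features, lambda name: name in defective, total_defective=2, target_recall=1.0
)
```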