Computer Science (计算机科学), 2024, Vol. 51, Issue 6: 68-77. doi: 10.11896/jsjkx.230400017
田帅华, 李征, 吴永豪, 刘勇
TIAN Shuaihua, LI Zheng, WU Yonghao, LIU Yong
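Abstract: Spectrum-Based Fault Localization (SBFL) has been widely studied; it helps developers quickly locate faults in a program and thereby reduces the cost of software testing. However, a test suite may contain a special kind of test case that executes a faulty statement yet still produces the expected output. Such test cases are called Coincidental Correct (CC) test cases, and they degrade the performance of SBFL. To mitigate this negative impact and improve SBFL, this paper proposes a machine-learning-based method for identifying CC test cases, CC test cases Identification via Machine Learning (CCIML). CCIML combines suspiciousness-formula features with static program features to identify CC test cases, thereby improving the fault localization accuracy of SBFL. To evaluate CCIML, comparative experiments are conducted on the Defects4J dataset. The results show that CCIML identifies CC test cases with an average recall, precision, and F1 score of 63.89%, 70.16%, and 50.64%, respectively, outperforming the compared methods. In addition, after the CC test cases identified by CCIML are handled with the cleaning and relabeling strategies, the resulting fault localization performance also exceeds that of the compared methods. Under the cleaning and relabeling strategies, the number of faults whose faulty statement is ranked first by suspiciousness is 328 and 312, respectively, which is 124.66% and 235.48% more faults localized than with the Fuzzy Weighted K-Nearest Neighbor (FW-KNN) method.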
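To make the effect of CC test cases concrete, the following minimal sketch shows how one CC-affected ranking changes under the two handling strategies named in the abstract. It rests on several assumptions that are not taken from the paper: the Ochiai formula stands in for whichever suspiciousness formulas the authors use, "cleaning" is read as removing the identified CC tests, "relabeling" is read as marking them as failing, the CC tests are assumed to be already identified, and every function name and the toy coverage data are hypothetical.

```python
import math

def ochiai(coverage, failed):
    """Ochiai suspiciousness per statement (an assumed formula, for illustration).

    coverage[i] is the set of statement ids executed by test i;
    failed[i] is True when test i failed.
    """
    total_failed = sum(failed)
    scores = {}
    for s in set().union(*coverage):
        ef = sum(1 for cov, f in zip(coverage, failed) if s in cov and f)
        ep = sum(1 for cov, f in zip(coverage, failed) if s in cov and not f)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[s] = ef / denom if denom else 0.0
    return scores

def clean(coverage, failed, cc_ids):
    """Cleaning strategy (assumed reading): drop the identified CC tests."""
    keep = [i for i in range(len(failed)) if i not in cc_ids]
    return [coverage[i] for i in keep], [failed[i] for i in keep]

def relabel(coverage, failed, cc_ids):
    """Relabeling strategy (assumed reading): treat identified CC tests as failing."""
    return coverage, [f or (i in cc_ids) for i, f in enumerate(failed)]

def top_statement(scores):
    """Statement with the highest suspiciousness score."""
    return max(scores, key=scores.get)

# Toy data: statement 3 is the faulty one. Tests 1 and 2 execute it but pass (CC).
coverage = [{1, 2, 3}, {1, 3}, {1, 3}, {1, 2}]
failed = [True, False, False, False]
cc_ids = {1, 2}  # assumed to come from a CC identifier such as CCIML

base = ochiai(coverage, failed)
cleaned = ochiai(*clean(coverage, failed, cc_ids))
relabeled = ochiai(*relabel(coverage, failed, cc_ids))

print("top statement, original :", top_statement(base))
print("top statement, cleaned  :", top_statement(cleaned))
print("top statement, relabeled:", top_statement(relabeled))
```

In this toy suite the two passing runs that cover statement 3 lower its score below that of statement 2, so the wrong statement ranks first; either strategy restores statement 3 to the top. Real suites are noisier, and the benefit depends on how accurately the CC test cases are identified.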