Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230600088-8. doi: 10.11896/jsjkx.230600088
王昭丹, 邹卫琴, 刘文杰
WANG Zhaodan, ZOU Weiqin, LIU Wenjie
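Abstract: Bug localization is a key step in bug fixing, and it is also a tedious software activity. Existing static bug localization techniques usually treat bug localization as a retrieval task: for each bug report, they generate a recommendation list of suspicious files, ranked in descending order of the relevance between each program entity and the bug. However, developers still have to inspect the listed files one by one to find the truly buggy ones, which increases the time and cost of localization. To address this problem, this paper proposes a solution. First, a mainstream information-retrieval-based static bug localization technique is run to obtain an initial recommendation list of suspicious files. Then, three categories of domain features are proposed according to the characteristics of the problem, and a machine learning model built on these three categories is used to identify the truly buggy source code files in the list. Experiments on 2558 bugs from four open source projects (ZooKeeper, OpenJPA, Tomcat, AspectJ) show that truly buggy files in the initial recommendation list can be predicted with an accuracy of 72.6% to 80.7%. The paper also examines how important the three feature subsets and the individual features are for predicting truly buggy files, and finds that the features describing the relationship between bug reports and source code are more important.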
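The two steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the retrieval technique, the features, or the classifier, so the sketch assumes a plain TF-IDF cosine-similarity ranker (with a smoothed IDF) for the first step and two hypothetical report-to-file relationship features (`similarity`, `reciprocal_rank`) as inputs a classifier could consume in the second step. All file names and texts are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on non-alphanumeric characters (a simplification;
    # real tools also split camelCase identifiers and apply stemming).
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # docs: dict of name -> text. Returns dict of name -> {term: weight}.
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokens)
    df = Counter(term for toks in tokens.values() for term in set(toks))
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    return {
        name: {t: c * idf[t] for t, c in Counter(toks).items()}
        for name, toks in tokens.items()
    }


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(report, files):
    # Step 1: rank source files by textual similarity to the bug report.
    vecs = tfidf_vectors({**files, "__report__": report})
    query = vecs.pop("__report__")
    scored = [(name, cosine(query, vec)) for name, vec in vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def candidate_features(ranked, top_k=3):
    # Step 2 input: one feature row per top-k candidate. These two features
    # are hypothetical stand-ins for the paper's three feature categories.
    return [
        {"file": name, "similarity": score, "reciprocal_rank": 1.0 / (i + 1)}
        for i, (name, score) in enumerate(ranked[:top_k])
    ]


# Made-up example data.
files = {
    "SessionManager.java": "class SessionManager expire session timeout close connection",
    "ConfigParser.java": "class ConfigParser parse property file load config",
    "Logger.java": "class Logger write log message level",
}
report = "Session is not closed after timeout expires"

ranked = rank_files(report, files)
features = candidate_features(ranked)
```

In a full pipeline, rows like those in `features` would be labeled with whether each file was actually changed in the fix and then passed to a binary classifier, so that developers review only the candidates predicted to be truly buggy.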