Computer Science (计算机科学), 2025, Vol. 52, Issue (1): 242-249. doi: 10.11896/jsjkx.240200046
朱晓燕, 王文格, 王嘉寅, 张选平
ZHU Xiaoyan, WANG Wenge, WANG Jiayin, ZHANG Xuanping
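Abstract: Just-in-time (JIT) software defect prediction estimates, at the moment a software change is first committed, how likely that change is to introduce defects. The prediction targets an individual program change rather than operating at a coarse granularity. Because it is immediate and traceable, the technique has been widely applied in areas such as continuous testing. In current research, methods for extracting representations of changed code are coarse-grained: they only mark the changed lines and do not label the change at a finer granularity. Moreover, existing methods that use commit content for defect prediction simply concatenate the features of the commit message with those of the changed code, without any deep alignment in feature space; as a result, when commit message quality is uneven, the predictions are easily disturbed by noise. Existing methods also do not combine the hand-crafted features designed by domain experts with the semantic and syntactic information in the change content. To address these problems, a just-in-time software defect prediction method based on fine-grained code representation and feature fusion is proposed. A new change embedding computation method is introduced to represent changed code at a fine granularity. A feature alignment module is introduced to reduce the impact of commit message noise on the method's performance. In addition, a neural network is used to learn expert knowledge from the hand-crafted features, so that existing features are fully used for prediction. Experimental results show that, compared with existing methods, the proposed method achieves significant improvements on three performance metrics.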
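The difference between marking only changed lines and marking changes at a finer granularity can be illustrated with a small sketch. This is not the change embedding method proposed in the paper, whose details are not given in the abstract; it is a hypothetical illustration using Python's standard difflib module with naive whitespace tokenization, and the function names line_level_marks and token_level_marks are assumptions introduced here.

```python
import difflib
from typing import List, Tuple

Mark = Tuple[str, str]  # (tag, text), where tag is "keep", "del", or "add"


def line_level_marks(old_line: str, new_line: str) -> List[Mark]:
    """Coarse marking: a modified line is flagged as a whole."""
    if old_line == new_line:
        return [("keep", new_line)]
    return [("del", old_line), ("add", new_line)]


def token_level_marks(old_line: str, new_line: str) -> List[Mark]:
    """Finer marking: only the tokens that differ are flagged.

    Assumption: tokens are split on whitespace, which is far cruder
    than a real programming-language tokenizer.
    """
    old_tokens = old_line.split()
    new_tokens = new_line.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    marks: List[Mark] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            marks.extend(("keep", tok) for tok in old_tokens[i1:i2])
        else:
            # "replace", "delete", and "insert" all reduce to removed
            # tokens from the old side plus added tokens from the new side.
            marks.extend(("del", tok) for tok in old_tokens[i1:i2])
            marks.extend(("add", tok) for tok in new_tokens[j1:j2])
    return marks


if __name__ == "__main__":
    old, new = "if x > 0 :", "if x >= 0 :"
    print(line_level_marks(old, new))
    print(token_level_marks(old, new))
```

With line-level marking, the whole condition is reported as deleted and re-added; with token-level marking, only the comparison operator is flagged, which is the kind of extra signal a fine-grained representation could expose to a downstream model.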