Computer Science (计算机科学), 2024, Vol. 51, Issue (7): 1-9. doi: 10.11896/jsjkx.230400069
• Computer Software •
刘文杰, 邹卫琴, 蔡碧瑜, 陈冰婷
LIU Wenjie, ZOU Weiqin, CAI Biyu, CHEN Bingting
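Abstract: To help developers locate software defects faster, researchers have proposed a series of text-retrieval-based bug localization techniques that automatically recommend suspicious code files for user-submitted bug reports. Because users differ in expertise, the bug reports they write vary in quality, and some low-quality reports cannot be localized successfully. A common remedy is to reformulate low-quality bug reports to improve their localization results. Existing mainstream reformulation methods, which rely on query expansion and query reduction, tend to produce low-quality reformulations for two reasons: the topic of the query drifts between the original and the reformulated version, or the pseudo-relevance feedback set they depend on is of poor quality. To address this, the paper proposes a bug report reformulation method based on topic consistency preservation and pseudo-relevance feedback set expansion. It consists of two parts: a query reduction phase that preserves topic consistency and a query expansion phase that enlarges the pseudo-relevance feedback set. The query reduction phase merges the bug report's summary with keywords extracted from its description text, which resolves the topic inconsistency problem. The query expansion phase combines several localization tools (namely Lucene, BugLocator, and Blizzard) to build the pseudo-relevance feedback set and extracts query expansion keywords from it, which addresses the low reformulation quality caused by poor pseudo-relevance feedback sets. Finally, the outputs of the reduction and expansion phases are merged to form the reformulated query. Experiments on 6 Java projects show that, among low-quality bug reports that existing bug localization methods cannot localize within the top 10 recommended suspicious files, the proposed method localizes 21% to 39% (Accuracy@10), with an MRR@10 of 10% to 16%. Compared with existing reformulation techniques, the proposed method improves Accuracy@10 by 7% to 32% and MRR@10 by 2% to 13%.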
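The abstract reports results as Accuracy@10 and MRR@10 but does not spell out how they are computed. The sketch below uses the definitions commonly found in the bug localization literature, which is an assumption here: Accuracy@k is the share of bug reports with at least one truly buggy file among the top k recommendations, and MRR@k averages the reciprocal rank of the first buggy file within the top k (counting 0 when none appears). The function names and the dictionary-based input layout are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def accuracy_at_k(rankings: Dict[str, List[str]],
                  buggy_files: Dict[str, Set[str]],
                  k: int = 10) -> float:
    """Fraction of bug reports with at least one buggy file in the top k.

    Assumed definition; the source abstract does not define the metric.
    """
    if not rankings:
        return 0.0
    hits = 0
    for report_id, ranked in rankings.items():
        truth = buggy_files.get(report_id, set())
        if any(path in truth for path in ranked[:k]):
            hits += 1
    return hits / len(rankings)


def mrr_at_k(rankings: Dict[str, List[str]],
             buggy_files: Dict[str, Set[str]],
             k: int = 10) -> float:
    """Mean reciprocal rank of the first buggy file within the top k.

    A report with no buggy file in the top k contributes 0 (assumed convention).
    """
    if not rankings:
        return 0.0
    total = 0.0
    for report_id, ranked in rankings.items():
        truth = buggy_files.get(report_id, set())
        for rank, path in enumerate(ranked[:k], start=1):
            if path in truth:
                total += 1.0 / rank
                break
    return total / len(rankings)
```

Both functions take a mapping from bug report ID to the ranked list of recommended files and a mapping from bug report ID to the set of files actually changed in the fix. Evaluating a reformulation method then amounts to computing both metrics on the rankings produced by the original queries and by the reformulated ones, and comparing the two.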