Computer Science (计算机科学) ›› 2017, Vol. 44 ›› Issue (4): 90-95. doi: 10.11896/j.issn.1002-137X.2017.04.020

• NASAC 2015 •

Personalized Defect Prediction for Individual Source Files

CHEN Heng (陈恒), LIU Wen-guang (刘文广), GAO Dong-jing (高东静), PENG Xin (彭鑫) and ZHAO Wen-yun (赵文耘)

  1. School of Software, Fudan University, Shanghai 201203; Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203 (all authors)
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61370079) and the National High Technology Research and Development Program of China (863 Program) (2013AA01A605).

Abstract: Most existing defect prediction methods are project-oriented or developer-oriented. They either do not distinguish differences between source files and between developers, or distinguish only the differences between developers. In software development, however, differences between developers and differences between source files exist at the same time, and both may affect defect modeling and prediction results. A model built for a whole project or for a single developer ignores one or both kinds of difference, which may reduce prediction accuracy. To address this problem, this paper proposes a personalized defect prediction method for individual source files: the history of each developer's modifications to each source file is treated as a separate dataset, a defect prediction model is built from it, and that model is used to predict defects in the same developer's changes to the same file. Experiments provide preliminary confirmation that the method can effectively improve prediction accuracy when sufficient per-developer defect data is available for a single file.

Key words: Defect prediction, Personalization, Source file, Commit
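
The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Change` record, the bag-of-words features, the `min_changes` threshold, and the small naive Bayes classifier are all assumptions chosen to keep the example self-contained. What it shows is only the core step of keeping one dataset, and one model, per (developer, file) pair.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
import math


# Hypothetical record type: one change a developer made to one file.
@dataclass(frozen=True)
class Change:
    developer: str
    path: str
    tokens: tuple   # assumed bag-of-words features extracted from the change
    buggy: bool     # assumed label, e.g. obtained by tracing bug fixes back


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (stand-in classifier)."""

    def fit(self, changes):
        self.class_counts = Counter(c.buggy for c in changes)
        self.token_counts = {True: Counter(), False: Counter()}
        for c in changes:
            self.token_counts[c.buggy].update(c.tokens)
        self.vocab = set(self.token_counts[True]) | set(self.token_counts[False])
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            # Smoothed log prior plus smoothed log likelihood of each token.
            score = math.log((self.class_counts[label] + 1) / (total + 2))
            denom = sum(self.token_counts[label].values()) + len(self.vocab) + 1
            for t in tokens:
                score += math.log((self.token_counts[label][t] + 1) / denom)
            scores[label] = score
        return scores[True] > scores[False]


def train_per_developer_file(history, min_changes=4):
    """Build one model per (developer, file) pair that has enough history."""
    groups = defaultdict(list)
    for change in history:
        groups[(change.developer, change.path)].append(change)
    return {key: NaiveBayes().fit(rows)
            for key, rows in groups.items() if len(rows) >= min_changes}


def predict_change(models, developer, path, tokens):
    """Return True/False from the matching model, or None when no model exists."""
    model = models.get((developer, path))
    return None if model is None else model.predict(tokens)
```

Returning `None` for pairs without a model reflects the abstract's caveat that the approach depends on sufficient per-developer data for a file; a caller might fall back to a developer-level or project-level model in that case, though that fallback is an assumption here and not something the abstract specifies.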
