Computer Science (计算机科学), 2022, Vol. 49, Issue (12): 89-98. doi: 10.11896/jsjkx.220200181
王子元, 卜德欣, 李凌菱, 张霞
WANG Zi-yuan, BU De-xin, LI Ling-ling, ZHANG Xia
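Abstract: The R language provides a wide range of statistical computing facilities and is regarded as one of the programming languages best suited to artificial intelligence. Correct implementation of the language's features is a prerequisite for R programs to run correctly, yet R inevitably contains many software bugs. This paper presents an empirical study of historical bugs in R and its core packages. An analysis of 7,020 bug reports for R and its core packages shows that: 1) among the 35 R versions involved, R 3.1.2, R 3.0.2, and R 3.5.0 contain comparatively many bugs, and these bugs are concentrated in a small number of components such as Documentation, Graphics, and Language; 2) the components with the highest overall bug priority are, in order, Startup, Installation, and Analyses, the components with the highest overall bug severity are, in order, I/O, Installation, and Accuracy, and bug priority and severity show a rank correlation of moderate strength; 3) about 78% of the bugs were fixed within one year; 4) semantic errors are the most common root cause of bugs, with missing features and data-processing errors accounting for a high proportion at every stage. These findings reveal some basic regularities of historical bugs in R and its core packages. To some extent they can help R developers improve development quality, help R maintainers detect and fix bugs more efficiently, and help R users avoid potential risks.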
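The rank correlation between priority and severity mentioned in finding 2) can be illustrated with a small sketch. The abstract does not say which rank coefficient was used, so Spearman's rho is an assumption here, and the `priority` and `severity` lists are hypothetical ordinal codes, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ordinal codes for eight bug reports (larger = more urgent/severe).
priority = [1, 2, 2, 3, 4, 5, 3, 1]
severity = [2, 1, 3, 3, 5, 4, 2, 1]
rho = spearman_rho(priority, severity)
print(f"Spearman rho = {rho:.3f}")
```

A value of rho near 1 or -1 indicates a strong monotonic relationship, and values around the middle of that range are conventionally described as moderate. The exact thresholds vary between authors.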