Computer Science, 2022, Vol. 49, Issue (12): 89-98. doi: 10.11896/jsjkx.220200181

• Computer Software •

Empirical Study on Defects in R Programming Language and Core Packages

WANG Zi-yuan, BU De-xin, LI Ling-ling, ZHANG Xia   

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  • Received: 2022-02-27 Revised: 2022-06-03 Published: 2022-12-14
  • About author: WANG Zi-yuan, born in 1982, Ph.D., associate professor and master's supervisor, is a member of the China Computer Federation. His main research interests include software testing.
  • Supported by:
    National Natural Science Foundation of China (61772259).

Abstract: The R programming language, which provides a wide range of statistical computing functions, is considered one of the programming languages best suited to artificial intelligence. The correctness of a language implementation is a prerequisite for the correctness of programs written in that language. However, the implementation of the R programming language inevitably contains defects. This paper presents an empirical study of defects in the R programming language and its core packages. By analyzing 7020 issues, we find that: 1) Among the 35 versions involved in these defects, R 3.1.2, R 3.0.2 and R 3.5.0 have the most defects, and the defects are concentrated in a few components such as Documentation, Graphics and Language. 2) The components with higher overall defect priority include Startup, Installation and Analyses, and the components with higher overall defect severity include I/O, Installation and Accuracy. There is a significant moderate correlation between the priority and severity of defects. 3) About 78% of defects are repaired within one year. 4) Semantic faults are the most frequent root cause of defects, among which the "missing feature" and "processing" categories occur more often than the others. These findings reveal some regularities of defects in the R programming language and its core packages. They can help developers of the R programming language improve development quality, help maintainers detect and repair defects more effectively, and help users avoid potential risks.
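Two of the measurements summarized above, the correlation between defect priority and severity and the share of defects repaired within one year, can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient the study uses; Spearman's rank correlation is assumed here because priority and severity are ordinal. The `ISSUES` records, the numeric coding of levels, and all function names are hypothetical and are not taken from the paper's dataset.

```python
from datetime import date

# Hypothetical issue records: (priority, severity, opened, closed).
# Ordinal levels are coded as integers; these are NOT data from the study.
ISSUES = [
    (1, 2, date(2015, 1, 10), date(2015, 3, 1)),
    (2, 2, date(2015, 2, 1), date(2016, 6, 15)),
    (3, 4, date(2016, 5, 5), date(2016, 5, 20)),
    (4, 3, date(2017, 7, 1), date(2017, 12, 31)),
    (5, 5, date(2018, 1, 1), date(2019, 3, 1)),
]


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    dev_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (dev_x * dev_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def share_fixed_within(issues, days=365):
    """Fraction of issues whose open-to-close interval is at most `days` days."""
    fixed = sum(1 for _, _, opened, closed in issues if (closed - opened).days <= days)
    return fixed / len(issues)


if __name__ == "__main__":
    priorities = [issue[0] for issue in ISSUES]
    severities = [issue[1] for issue in ISSUES]
    print("Spearman rho:", spearman(priorities, severities))
    print("Share fixed within one year:", share_fixed_within(ISSUES))
```

On real issue-tracker data, a significance test would also be needed before calling a correlation significant; this sketch computes only the coefficient.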

Key words: R programming language, Empirical study, Software defect, Distribution of defects, Defect repair, Root cause

CLC Number: TP311