Computer Science ›› 2020, Vol. 47 ›› Issue (12): 35-41. doi: 10.11896/jsjkx.200100022
Special topic: Software Engineering and Requirements Engineering for Complex Systems
周凯, 任怡, 汪哲, 管剑波, 张芳, 赵言亢
ZHOU Kai, REN Yi, WANG Zhe, GUAN Jian-bo, ZHANG Fang, ZHAO Yan-kang
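Abstract: Software defects (bugs) are one of the main causes of system failure. Developing software well and repairing software failures both require a better understanding of defect characteristics such as their distribution. Ubuntu is a widely used open-source system and currently one of the most successful Linux distributions worldwide. Using bug reports to uncover defect characteristics, classifying the defects sensibly, and analyzing the distribution patterns and characteristics of common operating-system defects provides an important reference for code-quality analysis and improvement during the development, testing, and maintenance of domestic hybrid-source operating systems based on Ubuntu. First, 32,805 bug reports for the Ubuntu operating system are collected from Launchpad. Next, a topic model is used to analyze the common defects on Ubuntu, and, following the way an operating system is composed, the defects are divided into five categories: kernel-related anomalies, desktop-environment anomalies, network-related anomalies, hardware-driver-related anomalies, and anomalies related to upper-layer applications and development environments. The classification results are then evaluated with the F1 score, and the evaluation shows that the defect classification achieves good accuracy. Finally, the statistics of the bug reports yield the general distribution patterns and characteristics of recent defects in the Ubuntu operating system, and the classification results yield findings and conclusions that help further understand Ubuntu operating system defects.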
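As an illustration of the evaluation step, the sketch below computes a per-category F1 score and its unweighted (macro) average over the five defect categories named in the abstract. The abstract does not say how the F1 score was aggregated, so the macro average is an assumption, and the short category labels, function names, and toy label lists are hypothetical and are not data from the study.

```python
from collections import Counter

# Short labels for the five categories named in the abstract (labels are hypothetical).
CATEGORIES = ["kernel", "desktop", "network", "driver", "application"]


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1} for parallel lists of true and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed aggregation)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy example: six manually labelled reports versus model-assigned categories.
y_true = ["kernel", "kernel", "desktop", "network", "driver", "application"]
y_pred = ["kernel", "desktop", "desktop", "network", "driver", "application"]

if __name__ == "__main__":
    for label, score in per_class_f1(y_true, y_pred, CATEGORIES).items():
        print(f"{label}: {score:.3f}")
    print(f"macro F1: {macro_f1(y_true, y_pred, CATEGORIES):.3f}")
```

A macro average weights every category equally, which can matter when some categories have far fewer reports than others. A support-weighted average would be an equally plausible reading of the abstract.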