Computer Science ›› 2020, Vol. 47 ›› Issue (12): 35-41. doi: 10.11896/jsjkx.200100022

Special Issue: Software Engineering & Requirements Engineering for Complex Systems

Classification and Analysis of Ubuntu Bug Reports Based on Topic Model

ZHOU Kai, REN Yi, WANG Zhe, GUAN Jian-bo, ZHANG Fang, ZHAO Yan-kang   

  1. College of Computer, National University of Defense Technology, Changsha 410073, China
  • Received:2020-01-05 Revised:2020-05-30 Online:2020-12-15 Published:2020-12-17
  • About author: ZHOU Kai, born in 1996, postgraduate, is a member of China Computer Federation. His main research interests include system software and software engineering.
    REN Yi, born in 1977, Ph.D, researcher. Her main research interests include operating systems, cloud computing and virtualization, distributed computing, large-scale mixed-source software code feature analysis and so on.
  • Supported by:
    National Natural Science Foundation of China (61872444) and National High-End Generic Chips and Basic Software Project of China (2017ZX01038104-002).

Abstract: Software bugs are the main cause of system failures, and a better understanding of bug characteristics is needed to develop software and repair failures. Ubuntu is one of the most successful distributions of the Linux operating system and a popular open-source software platform worldwide. Using bug reports to discover bug characteristics, and to analyze and reasonably classify the common bugs of the operating system, provides important guidance for bug analysis during the development, testing and maintenance of domestic mixed-source operating systems based on Ubuntu. First, 32805 bug reports are downloaded from Launchpad with a crawler. Then, the common bugs of Ubuntu are analyzed with a topic model and, based on the composition of the Ubuntu operating system and on experience, divided into 5 categories: kernel-related, desktop environment, network, hardware driver-related and system management anomalies. Next, the classification results are evaluated using the F1 value. Finally, the general distribution rules and characteristics of recent bugs in the Ubuntu operating system are obtained by analyzing the statistics of the bug reports. In addition, further analysis of the classification results yields findings and conclusions that help to better understand the bugs of the Ubuntu operating system.
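
The F1 evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each bug report carries one manually assigned category and one predicted category; the label strings, the function names and the macro average are illustrative assumptions, since the abstract does not say how the F1 value is aggregated.

```python
from collections import Counter

# The five categories come from the abstract; these label strings are illustrative.
CATEGORIES = ["kernel", "desktop", "network", "driver", "system_management"]


def f1_per_category(true_labels, predicted_labels, categories=CATEGORIES):
    """Return {category: F1} for paired lists of true and predicted labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this category
        else:
            fp[pred] += 1    # predicted category received a wrong report
            fn[truth] += 1   # true category missed one of its reports
    scores = {}
    for c in categories:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        total = precision + recall
        scores[c] = 2 * precision * recall / total if total else 0.0
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-category F1 values (an assumed aggregation)."""
    return sum(scores.values()) / len(scores)
```

Per-category scores show which of the five classes the topic model separates poorly, which a single aggregate number would hide.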

Key words: Bug classification, Bug report analysis, Latent Dirichlet allocation model, Ubuntu operating system

CLC Number: TP311