Computer Science ›› 2016, Vol. 43 ›› Issue (6): 179-183. doi: 10.11896/j.issn.1002-137X.2016.06.036


Extraction Approach for Software Bug Report

LIN Tao, GAO Jian-hua, FU Xue, MA Yan and LIN Yan   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Bug reports in software engineering are increasing rapidly, and developers are overwhelmed by the large number of accumulated reports. It is therefore necessary to study bug report extraction in support of tasks such as bug fixing and software reuse. This paper proposes a novel extraction approach. The approach first merges synonyms into a single representative word and then builds a vector space model. Text mining methods such as TF-IDF and information gain are used to collect word features specific to bug reports. In addition, an algorithm for determining sentence complexity is used to select long sentences. Finally, a Bayes classifier is applied to bug report extraction. The approach increases the true positive rate (TPR) and decreases the false positive rate (FPR). Experiments show that bug report extraction based on text mining and a Bayes classifier is competitive in terms of AUC (0.71), F-score (0.80), and Kappa value (0.75).

Key words: Bug report management, Text mining, Bayes classifier, Bug report feature, Vector space model, Sentence complexity
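
The steps named in the abstract (synonym merging, a vector space model weighted by TF-IDF, a sentence-length cue, and a Bayes classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synonym table, the token-count cutoff LONG_SENTENCE, the helper names, and the toy sentences are all assumptions made for illustration, information gain is omitted, and a plain multinomial naive Bayes with add-one smoothing stands in for the paper's classifier.

```python
import math
from collections import Counter, defaultdict

# Hypothetical synonym table; the paper's own synonym list is not reproduced here.
SYNONYMS = {"crashes": "crash", "fails": "crash", "failure": "crash"}
LONG_SENTENCE = 8  # assumed token-count cutoff standing in for "sentence complexity"


def tokenize(sentence):
    """Lowercase, strip punctuation, and merge synonyms into one word."""
    words = (w.strip(".,;:!?()\"'").lower() for w in sentence.split())
    return [SYNONYMS.get(w, w) for w in words if w]


def tfidf(token_lists):
    """Vector space model: one {term: tf * idf} dict per sentence."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return [
        {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in Counter(tokens).items()}
        for tokens in token_lists
    ]


def top_terms(token_lists, k):
    """Keep the k terms with the largest summed TF-IDF weight."""
    totals = defaultdict(float)
    for vector in tfidf(token_lists):
        for term, weight in vector.items():
            totals[term] += weight
    ranked = sorted(totals, key=lambda t: (-totals[t], t))
    return set(ranked[:k])


def features(sentence, vocab):
    """Selected terms plus a marker token for long sentences."""
    tokens = tokenize(sentence)
    kept = [t for t in tokens if t in vocab]
    if len(tokens) >= LONG_SENTENCE:
        kept.append("__long__")
    return kept


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c, n_c in self.class_counts.items():
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(n_c / total)
            for t in tokens:
                if t in self.vocab:  # terms never seen in training are skipped
                    score += math.log((self.word_counts[c][t] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


# Toy data (invented): label 1 = keep the sentence in the extract, 0 = drop it.
train = [
    ("The browser crashes when opening a large PDF attachment from the mail view.", 1),
    ("The failure happens because the attachment buffer overflows on large files.", 1),
    ("Thanks for the report.", 0),
    ("I will look at it tomorrow.", 0),
]
vocab = top_terms([tokenize(s) for s, _ in train], k=20)
model = NaiveBayes().fit([features(s, vocab) for s, _ in train], [y for _, y in train])
print(model.predict(features("The viewer fails on a large attachment.", vocab)))
```

In this sketch the TF-IDF weights are used only to choose which terms the classifier sees; feeding the weights themselves into the classifier, or ranking terms by information gain, would be equally plausible readings of the abstract.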

