Computer Science ›› 2013, Vol. 40 ›› Issue (5): 164-167.


Statistically Significant Sequential Pattern Mining Applying to Software Defect Prediction

TANG Lei, LI Chun-ping and YANG Liu

  • Online:2018-11-16 Published:2018-11-16

Abstract: People increasingly rely on software systems that must be highly reliable and usable, and software defect prediction has become one of the most active areas of software engineering. This paper introduces software defect prediction based on sequential pattern mining and designs a defect prediction model that mines statistically significant patterns. It describes the architecture and detailed implementation of two algorithms, “InfoMiner” and “STAMP”. The model uses InfoMiner and STAMP to mine patterns, the chi-square test for feature selection, and an SVM for classification, and it can find unknown defects with high probability. Experimental results show that the model achieves high prediction accuracy, which indicates that it is valuable and has future prospects.

Key words: Data mining, Sequential pattern, Software defect, Information gain, Classification and prediction
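
The feature selection step named in the abstract can be illustrated with a minimal sketch. It assumes that candidate patterns have already been mined (InfoMiner and STAMP are not implemented here), that a pattern matches a sequence when it occurs as a possibly gapped subsequence, and that each sequence carries a binary defect label. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, ...]


def contains_subsequence(sequence: Sequence[str], pattern: Pattern) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed).

    Assumption: gapped matching; the paper may define matches differently.
    """
    it = iter(sequence)
    # `event in it` advances the iterator, so order is respected.
    return all(event in it for event in pattern)


def chi_square_score(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Chi-square statistic of the 2x2 table (pattern present?) x (defective?)."""
    n = len(labels)
    a = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 1)
    b = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 0)
    c = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 1)
    d = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or a constant label carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_patterns(
    sequences: List[Sequence[str]],
    labels: Sequence[int],
    patterns: List[Pattern],
    k: int,
) -> List[Pattern]:
    """Keep the k patterns whose presence is most associated with the label."""
    scored = []
    for pattern in patterns:
        feature = [1 if contains_subsequence(s, pattern) else 0 for s in sequences]
        scored.append((chi_square_score(feature, labels), pattern))
    # Stable sort: ties keep the order in which patterns were supplied.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pattern for _, pattern in scored[:k]]


if __name__ == "__main__":
    # Hypothetical event sequences; label 1 marks a defective run.
    sequences = [
        ["open", "read", "close"],
        ["open", "write", "crash"],
        ["open", "read", "write", "crash"],
        ["open", "close"],
    ]
    labels = [0, 1, 1, 0]
    patterns = [("open",), ("write", "crash"), ("read",)]
    print(select_patterns(sequences, labels, patterns, k=1))
```

The presence indicators of the selected patterns would then serve as the feature vector for a classifier such as an SVM, which this sketch leaves out.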

