Computer Science ›› 2018, Vol. 45 ›› Issue (11A): 497-500.

• Software Engineering & Database Technology •

Long Method Detection Based on Cost-sensitive Integrated Classifier

LIU Li-qian, DONG Dong   

  1. College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024, China
  • Online: 2019-02-26 Published: 2019-02-26

Abstract: Long Method is a code smell: a method that has grown too long and therefore needs refactoring. To improve on the detection rate that traditional machine learning approaches achieve for long methods, a cost-sensitive integrated (ensemble) classifier algorithm is proposed that addresses the class imbalance in code smell sample data. Building on the traditional decision tree algorithm, an under-sampling strategy is used to resample the data and generate multiple balanced subsets. Each subset is used to train a base classifier of the same type. Finally, a misclassification cost determined by cognitive complexity is incorporated into the integrated classifier. This cost biases the classifier toward classifying the minority class correctly. Compared with traditional machine learning algorithms, the method improves both the precision and the recall of long method detection.
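The pipeline outlined in the abstract (under-sample into balanced subsets, train one base classifier per subset, then apply a misclassification cost when combining votes) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a one-feature threshold rule stands in for a full decision tree, the costs are fixed hypothetical constants rather than values derived from cognitive complexity, and all function names, features, and toy data are invented for the example.

```python
import random

def train_stump(samples):
    """Fit a one-feature threshold rule (a simplified stand-in for a decision tree)."""
    best = None
    for f in range(len(samples[0][0])):
        for x, _ in samples:
            t = x[f]
            # Count mistakes made by the rule "predict 1 when feature f >= t".
            errors = sum((xi[f] >= t) != bool(yi) for xi, yi in samples)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda x: 1 if x[f] >= t else 0

def balanced_subsets(samples, n_subsets, rng):
    """Under-sample the majority class (label 0) to the size of the minority class (label 1)."""
    minority = [s for s in samples if s[1] == 1]
    majority = [s for s in samples if s[1] == 0]
    return [minority + rng.sample(majority, len(minority)) for _ in range(n_subsets)]

def train_ensemble(samples, n_subsets=5, seed=0):
    """Train one base classifier of the same type on each balanced subset."""
    rng = random.Random(seed)
    return [train_stump(sub) for sub in balanced_subsets(samples, n_subsets, rng)]

def predict(ensemble, x, cost_fn=5.0, cost_fp=1.0):
    """Pick the label with the lower expected misclassification cost.

    cost_fn and cost_fp are hypothetical constants; the paper derives the
    cost from cognitive complexity, which is not reproduced here.
    """
    p_long = sum(clf(x) for clf in ensemble) / len(ensemble)
    # Predicting 0 risks a missed long method; predicting 1 risks a false alarm.
    return 1 if p_long * cost_fn >= (1 - p_long) * cost_fp else 0

# Toy data (invented): features are (lines of code, complexity score).
methods = [((10 + i, 2 + i % 3), 0) for i in range(20)]   # 20 ordinary methods
methods += [((80 + 10 * i, 15 + i), 1) for i in range(4)]  # 4 long methods

ensemble = train_ensemble(methods)
print(predict(ensemble, (95, 18)), predict(ensemble, (15, 3)))
```

Setting `cost_fn` higher than `cost_fp` means a method is flagged as long even when only a minority of base classifiers vote for it, which is the sense in which the cost tilts the ensemble toward the minority class.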

Key words: Code smell, Cognitive complexity, Cost-sensitive, Long method

CLC Number: TP311