Survey for Performance Measure Index of Classification Learning Algorithm

doi:10.11896/jsjkx.200900216

Abstract

In research on machine-learning classification tasks, correctly evaluating the performance of a learning algorithm is essential. In practice, many performance measures have been proposed from different perspectives. This paper introduces three kinds of performance measures, based on the error rate, the confusion matrix, and statistical tests, and discusses the background, significance, and scope of application of each. The differences among these methods are analyzed, and open research problems and future directions are put forward. In addition, the measures are compared on experimental data both vertically and horizontally, and their consistency in classification algorithm selection is analyzed.

Key words: Confusion matrix, Error rate, Performance measure, Statistical test

CLC Number: TP181

YANG Xing-li. Survey for Performance Measure Index of Classification Learning Algorithm[J].Computer Science, 2021, 48(8): 209-219.
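As an illustration of the error-rate and confusion-matrix measures named in the abstract, the following is a minimal sketch for a binary task. It is not taken from the paper: the function names `confusion_counts` and `basic_measures`, the choice of `1` as the positive label, and the convention of returning 0.0 for an undefined ratio are all assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary task; `positive` is an assumed label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn


def basic_measures(tp, fp, fn, tn):
    """Error rate, precision, recall and F1 from the four confusion-matrix cells."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # assumed convention: 0.0 when undefined
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "error_rate": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(basic_measures(*confusion_counts(y_true, y_pred)))
```

The error rate uses only the off-diagonal cells of the matrix, whereas precision, recall and F1 depend on how the errors split between false positives and false negatives. This is one reason the two families of measures can rank classifiers differently.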

