Computer Science, 2019, Vol. 46, Issue (1): 78-85. doi: 10.11896/j.issn.1002-137X.2019.01.012

Confidence Interval Method for Classification Usability Evaluation of Data Sets

TAN Xun-tao, GU Yi-yi, RUAN Tong, YUAN Yu-bo   

  1. (Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2018-06-08 Online: 2019-01-15 Published: 2019-02-25

Abstract: Effectively evaluating the usability of training data sets remains a difficult problem, which hinders the application of intelligent classification systems. Addressing data classification in machine learning, this paper proposes a method, based on interval analysis and information granulation, for evaluating classification usability, which measures the separability of a data set. In this method, a data set is defined as a classification information system, the concept of a classification confidence interval is introduced, and information granulation is carried out by interval analysis. Under this granulation strategy, the paper defines a mathematical model of classification usability and gives methods for computing the classification usability of a single attribute and of the whole data set. Eighteen UCI standard data sets are selected as evaluation objects, their classification usability is evaluated, and three classifiers are used to classify them. Analysis of the experimental results verifies the effectiveness and feasibility of the evaluation method.
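
The abstract does not state the paper's formulas, so the following is only a rough sketch of the general idea of scoring separability with per-class intervals, not the authors' model. Everything in it is an assumption made for illustration: the interval form (class mean plus or minus k standard deviations), the per-attribute score (the fraction of samples covered by their own class interval and by no other), the unweighted average over attributes, and the hypothetical function names.

```python
from statistics import mean, pstdev


def class_intervals(values, labels, k=2.0):
    """Per-class interval [mean - k*std, mean + k*std] for one attribute.

    Assumed interval form; the paper's classification confidence
    interval may be defined differently.
    """
    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)
    return {
        y: (mean(vs) - k * pstdev(vs), mean(vs) + k * pstdev(vs))
        for y, vs in by_class.items()
    }


def attribute_usability(values, labels, k=2.0):
    """Fraction of samples covered by their own class interval only."""
    intervals = class_intervals(values, labels, k)
    hits = 0
    for v, y in zip(values, labels):
        covering = [c for c, (lo, hi) in intervals.items() if lo <= v <= hi]
        if covering == [y]:
            hits += 1
    return hits / len(values)


def dataset_usability(rows, labels, k=2.0):
    """Unweighted mean of the per-attribute scores (assumed aggregation)."""
    columns = list(zip(*rows))
    return mean(attribute_usability(col, labels, k) for col in columns)


# Toy data: attribute 0 separates the classes, attribute 1 barely does.
rows = [(1.0, 5.0), (1.2, 5.1), (0.9, 4.9), (3.0, 5.0), (3.1, 5.2), (2.9, 4.8)]
labels = ["a", "a", "a", "b", "b", "b"]
print(attribute_usability([r[0] for r in rows], labels))
print(dataset_usability(rows, labels))
```

On this toy data the first attribute scores 1.0 because the two class intervals do not overlap, while the second scores much lower because one class interval lies inside the other. A score of this kind could then be compared against classifier accuracy, as the paper does with its own measure.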

Key words: Classification system, Classification usability, Data usability, Information granulation, Interval analysis

CLC Number: 

  • TP391
