Computer Science, 2019, Vol. 46, Issue (6A): 423-426.

• Big Data & Data Mining •

### Linear Discriminant Analysis of High-dimensional Data Using Random Matrix Theory

LIU Peng, YE Bin

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
• Online: 2019-06-14 Published: 2019-07-02

Abstract: Linear discriminant analysis (LDA) is an important theoretical and analytic tool for many machine learning and data mining tasks. As a parametric classification method, it performs well in many applications. However, LDA is impractical for the high-dimensional data sets that are now routinely generated. A primary reason LDA performs poorly on high-dimensional data is that the sample covariance matrix is no longer a good estimator of the population covariance matrix when the dimension of the feature vector is close to, or even larger than, the sample size. This paper therefore proposes a regularization method for high-dimensional data classifiers based on random matrix theory. First, a consistent estimate of the high-dimensional covariance matrix is obtained through rotation-invariant estimation and eigenvalue truncation. Second, the estimated covariance matrix is used to compute the value of the discriminant function. Numerical experiments on artificial data sets, as well as on real-world data sets such as microarray data sets, demonstrate that the proposed discriminant analysis method applies more widely and yields higher accuracy than existing competing methods.
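
The two steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' estimator, whose formulas the abstract does not give. It assumes the simplest form of eigenvalue truncation: the eigenvectors of the pooled sample covariance are kept, and every eigenvalue at or below the Marchenko-Pastur upper edge is replaced by the average of those eigenvalues. The noise variance is approximated by the mean eigenvalue, which is also an assumption. The function names `clip_eigenvalues`, `fit_lda`, and `predict_lda` are hypothetical.

```python
import numpy as np


def clip_eigenvalues(S, n):
    """Clean a p x p sample covariance S estimated from n samples.

    Assumption: eigenvalues at or below the Marchenko-Pastur upper edge
    are treated as noise and replaced by their average (trace-preserving).
    The eigenvectors of S are left unchanged.
    """
    p = S.shape[0]
    evals, evecs = np.linalg.eigh(S)
    q = p / n
    sigma2 = evals.mean()  # crude noise-variance proxy (assumption)
    lam_plus = sigma2 * (1.0 + np.sqrt(q)) ** 2
    noise = evals <= lam_plus
    cleaned = evals.copy()
    if noise.any():
        cleaned[noise] = evals[noise].mean()
    # Rebuild the matrix as V diag(cleaned) V^T.
    return (evecs * cleaned) @ evecs.T


def fit_lda(X, y):
    """Fit LDA with a cleaned pooled within-class covariance."""
    classes = np.unique(y)
    n = X.shape[0]
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Centre every sample by the mean of its own class.
    centered = X - means[np.searchsorted(classes, y)]
    S = centered.T @ centered / (n - len(classes))
    sigma_inv = np.linalg.inv(clip_eigenvalues(S, n))
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, sigma_inv, priors


def predict_lda(model, X):
    """Assign each row of X to the class with the largest discriminant score."""
    classes, means, sigma_inv, priors = model
    # delta_k(x) = x^T S^-1 mu_k - 0.5 * mu_k^T S^-1 mu_k + log(pi_k)
    W = means @ sigma_inv
    b = -0.5 * np.sum(W * means, axis=1) + np.log(priors)
    scores = X @ W.T + b
    return classes[np.argmax(scores, axis=1)]
```

Because the replaced eigenvalues are averaged rather than zeroed, the cleaned matrix stays invertible even when the dimension exceeds the sample size, which is the setting where the plain sample covariance cannot be inverted.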

CLC Number:

• TP181
