Computer Science ›› 2011, Vol. 38 ›› Issue (5): 149-153.
CHEN Huan, HUANG Der-cai
Abstract: Missing data are inevitable in data collection, and how to restore them has become one of the hottest issues in data mining. Like most algorithms, missing data imputation algorithms based on Mahalanobis distance make full use of the relationships between data. Though the results are acceptable, the covariance matrices are not always invertible, which greatly limits these algorithms. This paper improves a traditional principal component analysis (PCA) method and proposes a new distance, named Generalized Mahalanobis Distance, based on SVD and the Moore-Penrose pseudoinverse. Combining it with the SOFM neural network and entropy, we design the GS missing data imputation algorithms. Theoretical analysis and simulation show that Generalized Mahalanobis Distance inherits the advantages of Mahalanobis distance in dealing with correlated data. The new algorithm not only has good accuracy and stability but is also suitable for any dataset.
Key words: PCA, Moore-Penrose pseudoinverse, Generalized Mahalanobis distance, SOFM neural network, Entropy
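The abstract does not give the exact definition of Generalized Mahalanobis Distance, so the following is only a minimal sketch of the general idea it describes: replacing the inverse covariance matrix in the Mahalanobis distance with its Moore-Penrose pseudoinverse, which NumPy computes via SVD. The function name `generalized_mahalanobis`, the `rcond` cutoff, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def generalized_mahalanobis(x, y, data, rcond=1e-10):
    """Distance between x and y under the pseudoinverse of the sample covariance.

    Hypothetical helper: the paper's own definition may differ in detail.
    rcond is an assumed cutoff for treating small singular values as zero.
    """
    data = np.asarray(data, dtype=float)
    # Sample covariance with one column per variable.
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    # Moore-Penrose pseudoinverse (computed via SVD); defined even if cov is singular.
    cov_pinv = np.linalg.pinv(cov, rcond=rcond)
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    d2 = diff @ cov_pinv @ diff
    # Clamp tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(d2, 0.0)))

# Toy data (assumed): the third column is the sum of the first two,
# so the covariance matrix is singular and has no ordinary inverse.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 2))
data = np.column_stack([base, base.sum(axis=1)])

rank = np.linalg.matrix_rank(np.cov(data, rowvar=False), tol=1e-10)
d = generalized_mahalanobis(data[0], data[1], data)
print("covariance rank:", rank)
print("distance between first two rows:", d)
```

When the covariance matrix is invertible, the pseudoinverse coincides with the ordinary inverse, so this sketch reduces to the standard Mahalanobis distance in that case.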
CHEN Huan, HUANG Der-cai. Missing Data Imputation Based on Generalized Mahalanobis Distance[J]. Computer Science, 2011, 38(5): 149-153.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2011/V38/I5/149