Computer Science ›› 2018, Vol. 45 ›› Issue (6A): 16-21.

• Review •

Research and Development of Feature Dimensionality Reduction

HUANG Xuan   

  1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China
  • Online: 2018-06-20 Published: 2018-08-03

Abstract: The quality of data features directly affects model accuracy. In pattern recognition, dimensionality reduction has long been a focus of research. In the era of big data, massive volumes of data must be processed while the dimensionality of the data keeps rising. Traditional data mining methods degrade in performance or fail altogether when processing high-dimensional data. Studies show that dimensionality reduction can effectively avoid the "curse of dimensionality" in data analysis, and it is therefore widely applied. This paper describes in detail two classes of dimensionality reduction methods, feature selection and feature extraction, and thoroughly compares their characteristics. Feature selection algorithms are summarized and analyzed in terms of their two key steps: the search strategy and the evaluation criterion. Finally, directions for future research on dimensionality reduction are discussed in light of its practical applications.

Key words: Dimensionality reduction, Feature extraction, Feature selection, Research progress

CLC Number: 

  • TP391
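
The distinction the abstract draws between feature selection and feature extraction can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy data, the function names, the choice of absolute Pearson correlation as the evaluation criterion, individual ranking as the search strategy, and power iteration for the leading principal component are all assumptions made for illustration.

```python
import random


def make_toy_data(n=200, seed=0):
    """Hypothetical toy data: x0 is informative, x1 is noise, x2 is redundant with x0."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        x1 = rng.gauss(0, 1)
        x2 = 2 * x0 + 0.1 * rng.gauss(0, 1)
        X.append([x0, x1, x2])
        y.append(x0 + 0.1 * rng.gauss(0, 1))
    return X, y


def column(X, j):
    return [row[j] for row in X]


def mean(v):
    return sum(v) / len(v)


def abs_correlation(x, y):
    """Evaluation criterion (assumed): |Pearson correlation| of a feature with the target."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_features(X, y, k):
    """Feature selection: keep k of the ORIGINAL columns.

    Search strategy (assumed): score each feature individually and keep the top k.
    """
    scores = [abs_correlation(column(X, j), y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def first_principal_component(X, iters=200):
    """Feature extraction: build a NEW feature as a linear combination of all columns.

    Uses power iteration on the sample covariance matrix to approximate the
    leading PCA direction; assumes the start vector is not orthogonal to it.
    """
    n, d = len(X), len(X[0])
    means = [mean(column(X, j)) for j in range(d)]
    Z = [[row[j] - means[j] for j in range(d)] for row in X]  # centered data
    cov = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(d)]
           for a in range(d)]
    w = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    projected = [sum(z[j] * w[j] for j in range(d)) for z in Z]
    return w, projected


if __name__ == "__main__":
    X, y = make_toy_data()
    print("selected original columns:", select_features(X, y, k=2))
    w, scores = first_principal_component(X)
    print("weights of the extracted feature:", [round(v, 3) for v in w])
```

In this sketch, selection returns indices of existing columns, so the retained features keep their original meaning, whereas extraction returns a weight vector that mixes all columns into a new feature. A per-feature filter of this kind does not detect redundancy between features, which is one reason more elaborate search strategies and criteria exist.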