Sparse Feature Selection Algorithm Based on Kernel Function

doi:10.11896/j.issn.1002-137X.2019.02.010

Abstract

Traditional feature selection algorithms cannot capture the relationships between features. To address this, a nonlinear feature selection method is proposed. By introducing a kernel function, the method projects the original data set into a high-dimensional kernel space and accounts for the relationships between sample features by performing its operations in that space. Owing to the properties of the kernel function, the computational complexity remains small even when the data are projected into an infinite-dimensional space through the Gaussian kernel. To address the limitation of the regularization factor, two norms are used as a double constraint. This not only improves the accuracy of the algorithm but also reduces its variance to only 0.74, which is much smaller than that of the other comparison algorithms, making it more stable. Six similar algorithms were compared on eight common data sets, and an SVM classifier was used to test their effect. The results demonstrate that the proposed algorithm achieves an improvement of at least 1.84%, at most 3.27%, and 2.75% on average.
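The abstract names two ingredients without giving formulas: a Gaussian kernel, and a double constraint built from two norms (the key words list the L₁-norm and the L₂,₁-norm). The sketch below only illustrates those ingredients using their standard textbook definitions; it is not the paper's objective function or solver. The function names, the bandwidth parameter `sigma`, and the toy data are assumptions made for illustration.

```python
import math

# Illustrative sketch only: standard definitions, not the paper's algorithm.
# All names, the bandwidth `sigma`, and the toy data are assumptions.

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(samples, sigma=1.0):
    """n x n kernel matrix. Only pairwise kernel values are evaluated,
    so the infinite-dimensional feature map is never formed explicitly."""
    return [[gaussian_kernel(xi, xj, sigma) for xj in samples]
            for xi in samples]

def l1_norm(W):
    """Sum of absolute values of all entries (encourages element-wise sparsity)."""
    return sum(abs(v) for row in W for v in row)

def l21_norm(W):
    """Sum of the Euclidean norms of the rows (encourages row-wise sparsity)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)

# Toy data: three samples with two features each.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(samples, sigma=1.0)

# Toy coefficient matrix; the all-zero row is what a row-sparse penalty
# drives towards, marking the corresponding row as unselected.
W = [[3.0, 4.0], [0.0, 0.0], [-1.0, 0.0]]
penalty_l1 = l1_norm(W)
penalty_l21 = l21_norm(W)
```

Because the kernel matrix costs one scalar evaluation per pair of inputs, the cost depends on the number of inputs rather than on the dimension of the kernel space, which is consistent with the abstract's remark about complexity under the Gaussian kernel.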

Key words: L₁-norm, L₂,₁-norm, Feature selection, Kernel function, Sparse

CLC Number: TP181

ZHANG Shan-wen, WEN Guo-qiu, ZHANG Le-yuan, LI Jia-ye. Sparse Feature Selection Algorithm Based on Kernel Function [J]. Computer Science, 2019, 46(2): 62-67.

