Computer Science ›› 2021, Vol. 48 ›› Issue (8): 53-59.doi: 10.11896/jsjkx.200700211

• Database & Big Data & Data Science • Previous Articles     Next Articles

Structure Preserving Unsupervised Feature Selection Based on Autoencoder and Manifold Regularization

YANG Lei, JIANG Ai-lian, QIANG Yan   

  1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  • Received: 2020-07-31  Revised: 2020-09-03  Published: 2021-08-10
  • About author: YANG Lei, born in 1996, postgraduate. Her main research interests include machine learning and feature selection (yanglei_l@163.com).
    JIANG Ai-lian, born in 1969, Ph.D., associate professor, is a member of China Computer Federation. Her main research interests include big data analysis and processing, feature selection, artificial intelligence and computer vision.
  • Supported by:
    National Natural Science Foundation of China (61872261) and Scientific Research Funding Project for Returned Overseas Scholars in Shanxi Province (2017-051).

Abstract: High-dimensional data contain many redundant and irrelevant features, which seriously degrade the efficiency and quality of data mining and the generalization performance of machine learning. Feature selection has therefore become an important research direction in computer science. This paper proposes an unsupervised feature selection algorithm that exploits the non-linear learning ability of the autoencoder. First, based on the reconstruction error of the autoencoder, individual features that are important for data reconstruction are identified. Second, feature weights are learned so that the finally selected feature subset is the one that contributes most to reconstructing the remaining features. Manifold learning is introduced to capture the local and non-local structure of the original data space, and L2,1 sparse regularization is imposed on the feature weights to improve their sparsity so that more discriminative features can be selected. Finally, a new objective function is constructed and optimized with a gradient descent algorithm. Experiments are conducted on six typical data sets of different types, and the proposed algorithm is compared with five commonly used unsupervised feature selection algorithms. The experimental results verify that the proposed algorithm can effectively select important features and significantly improve both classification accuracy and clustering accuracy.
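The sketch below illustrates, under stated assumptions, how the ingredients named in the abstract can be combined: an autoencoder reconstruction loss, a graph-Laplacian (manifold) regularizer built from a kNN similarity graph, an L2,1 sparsity penalty on feature weights, and plain gradient descent. It is not the paper's exact objective function; the hyper-parameter names (alpha, beta, k, sigma), the hidden size, the tanh activation, and the scoring rule (ranking features by the L2 norms of their encoder weights) are illustrative assumptions.

# Hedged sketch: autoencoder-based unsupervised feature selection with an
# l2,1-sparse encoder layer and a graph-Laplacian (manifold) regularizer.
# All names and hyper-parameters are assumptions for illustration only.
import torch
import torch.nn as nn

def knn_laplacian(X, k=5, sigma=1.0):
    """Unnormalized graph Laplacian L = D - S from a kNN RBF similarity graph."""
    d2 = torch.cdist(X, X) ** 2
    S = torch.exp(-d2 / (2 * sigma ** 2))
    # keep only the k nearest neighbours of each sample (excluding itself)
    _, idx = torch.topk(S - torch.eye(len(X)) * 1e9, k, dim=1)
    mask = torch.zeros_like(S).scatter_(1, idx, 1.0)
    S = S * torch.max(mask, mask.t())          # symmetrize the graph
    return torch.diag(S.sum(dim=1)) - S

class AEFeatureSelector(nn.Module):
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_in)
    def forward(self, x):
        h = torch.tanh(self.encoder(x))
        return self.decoder(h), h

def l21_norm(W, eps=1e-8):
    """Sum of row-wise l2 norms: pushes whole input features toward zero weight."""
    return torch.sqrt((W ** 2).sum(dim=1) + eps).sum()

def select_features(X, d_hidden=64, alpha=1e-3, beta=1e-2, epochs=500, lr=1e-2):
    L = knn_laplacian(X)
    model = AEFeatureSelector(X.shape[1], d_hidden)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        X_hat, H = model(X)
        rec = ((X - X_hat) ** 2).mean()                 # reconstruction error
        manifold = torch.trace(H.t() @ L @ H) / len(X)  # structure preservation
        sparsity = l21_norm(model.encoder.weight.t())   # rows = input features
        (rec + alpha * manifold + beta * sparsity).backward()
        opt.step()
    # score each original feature by the l2 norm of its encoder weight column
    scores = model.encoder.weight.t().norm(dim=1)
    return torch.argsort(scores, descending=True)

A typical use would be X = torch.randn(200, 50); ranking = select_features(X); ranking[:10] then gives the indices of the ten highest-scoring features, which can be fed to a downstream classifier or clustering algorithm for evaluation.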

Key words: Autoencoder, Feature selection, Manifold regularization, Structure preservation, Subspace learning

CLC Number: 

  • TP181