Computer Science ›› 2023, Vol. 50 ›› Issue (7): 72-81.doi: 10.11896/jsjkx.221000143

• Database & Big Data & Data Science • Previous Articles     Next Articles

Unsupervised Feature Selection Algorithm Based on Dual Manifold Re-ranking

LIANG Yunhui1,2, GAN Jianwen1,2, CHEN Yan3, ZHOU Peng4, DU Liang1,2   

  1. 1 College of Computer and Information Technology,Shanxi University,Taiyuan 030006,China
    2 Institute of Big Data Science and Industry,Shanxi University,Taiyuan 030006,China
    3 College of Computer,Sichuan University,Chendu 610065,China
    4 College of Computer Science and Technology,Anhui University,Hefei 230601,China
  • Received:2022-10-18 Revised:2022-11-12 Online:2023-07-15 Published:2023-07-05
  • About author:LIANG Yunhui,born in 1998,postgra-duate,is a member of China Computer Federation.His main research interests include data mining and feature selection.DU Liang,born in 1985,Ph.D,associate professor,is a member of China Computer Federation.His main research interests include data mining,feature selection and clustering analysis.
  • Supported by:
    National Natural Science Foundation of China(61976129,62176001).

Abstract: High dimensional data is often encountered in many data analysis tasks.Feature selection techniques aim to find the most representative features from the original high-dimensional data.Due to the lack of class label information,it is much more difficult to select suitable features in unsupervised learning scenarios than in supervised scenarios.Traditional unsupervised feature selection methods usually score the features of samples according to certain criteria in which samples are treated indiscriminately.However,these approaches cannot capture the internal structure of data completely.The importance of different samples should vary.There is a dual relationship between weight of sample and feature that will influence each other.Therefore,an unsupervised feature selection algorithm based on dual manifold re-ranking(DMRR) is proposed in this paper.Different similarity matrices are constructed to depict the manifold structures on samples and samples,features and features,and samples and features respectively.Then manifold re-ranking is carried out by combining the initial scores of samples and features.By comparing DMRR with three original unsupervised feature selection algorithms and two unsupervised feature selection post-processing algorithms,experimental results verify that importance information of different samples and the dual relationship between sample and feature are helpful to achieve better feature selection.

Key words: Dual, Manifold learning, Re-ranking, Feature selection, Unsupervised learning

CLC Number: 

  • TP181
[1]WU X,XU X,LIU J,et al.Supervised Feature Selection withOrthogonal Regression and Feature Weighting[J].IEEE Tran-sactions on Neural Networks and Learning Systems,2020,32(5):1831-1838.
[2]ZHANG H,GONG M,NIE F,et al.Unified Dual-Label Semi-Supervised Learning with Top-k Feature Selection[J].Neurocomputing,2022,501:875-888.
[3]CHANDRASHEKAR G,SAHIN F.A Survey on Feature Selection Methods[J].Computers & Electrical Engineering,2014,40(1):16-28.
[4]ZHU P,HOU X,WANG Z,et al.Compactness Score:A Fast Filter Method for Unsupervised Feature Selection[J].arXiv:2201.13194,2022.
[5]KOHAVI R,JOHN G H.Wrappers for Feature Subset Selec-tion[J].Artificial Intelligence,1997,97(1/2):273-324.
[6]ZHAO H,DU L,WEI J,et al.Local Sensitive Dual Concept Factorization for Unsupervised Feature Selection[J].IEEE Access,2020,8:133128-133143.
[7]ZHANG Y,ZHANG Z,LI S,et al.Unsupervised NonnegativeAdaptive Feature Extraction for Data Representation[J].IEEE Transactions on Knowledge and Data Engineering,2018,31(12):2423-2440.
[8]LI J,CHENG K,WANG S,et al.Feature Selection:A Data Pers-pective[J].ACM Computing Surveys(CSUR),2017,50(6):1-45.
[9]HE X,CAI D,NIYOGI P.Laplacian Score for Feature Selection[C]//Advances in Neural Information Processing Systems.2005:507-514.
[10]YAO C,LIU Y F,JIANG B,et al.LLE Score:A New Filter-Based Unsupervised Feature Selection Method Based on Nonli-near Manifold Embedding and Its Application to Image Recognition[J].IEEE Transactions on Image Processing,2017,26(11):5257-5269.
[11]LIM H,KIM D W.Pairwise Dependence-Based UnsupervisedFeature Selection[J].Pattern Recognition,2021,111:107663.
[12]CAI D,ZHANG C,HE X.Unsupervised Feature Selection for Multi-Cluster Data[C]//Proceedings of The 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2010:333-342.
[13]HE X,JI M,ZHANG C,et al.A Variance Minimization Crite-rion to Feature Selection Using Laplacian Regularization[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2011,33(10):2013-2025.
[14]WANG D,NIE F,HUANG H.Feature Selection via Global Redundancy Minimization[J].IEEE Transactions on Knowledge and Data Engineering,2015,27(10):2743-2755.
[15]NIE F,YANG S,ZHANG R,et al.A General Framework for Auto-Weighted Feature Selection via Global Redundancy Minimization[J].IEEE Transactions on Image Processing,2018,28(5):2428-2438.
[16]KRIEGEL H P,KRÖGER P,SANDER J,et al.Density-Based Clustering[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2011,1(3):231-240.
[17]LI C,WANG X,DONG W,et al.Joint Active Learning with Feature Selection via Cur Matrix Decomposition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2018,41(6):1382-1396.
[18]XIA R,PAN Z,XU F.Instance Weighting for Domain Adaptation via Trading off Sample Selection Bias and Variance[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence.2018:13-19.
[19]GU Q,ZHOU J.Co-Clustering on Manifolds[C]//Proceedings of The 15th ACM SIGKDD International Conference on Know-ledge Discovery and Data Mining.2009:359-368.
[20]SONG K,YAO X,NIE F,et al.Weighted Bilateral K-means Algorithm for Fast Co-Clustering and Fast Spectral Clustering[J].Pattern Recognition,2021,109:107560.
[21]LUO M,NIE F,CHANG X,et al.Adaptive Unsupervised Feature Selection with Structure Regularization[J].IEEE Transactions on Neural Networks and Learning Systems,2017,29(4):944-956.
[22]LI J,CHEN J,QI F,et al.Two-Dimensional Unsupervised Feature Selection via Sparse Feature Filter[J/OL].IEEE Transactions on Cybernetics,2022.https://www.researchgate.net/publication/359887581_Two-Dimensional_Unsupervised_Feature_Selection_via_Sparse_Feature_Filter.
[23]LI W,CHEN H,LI T,et al.Unsupervised Feature Selection via Self-Paced Learning and Low-Redundant Regularization[J].Knowledge-Based Systems,2022,240:108150.
[24]WAHID A,KHAN D M,HUSSAIN I,et al.Unsupervised Feature Selection with Robust Data Reconstruction(UFS-RDR) and Outlier Detection[J].Expert Systems with Applications,2022,201:117008.
[25]XIE J,WANG M,XU S,et al.The Unsupervised Feature Selection Algorithms Based on Standard Deviation and Cosine Similarity for Genomic Data Analysis[J].Frontiers in Genetics,2021,12:684100.
[26]BAO F,LIN Y,LI Y,et al.An Online Multi Tag Stream Feature Selection Algorithm Combining Neighborhood Information and Tag Correlation [J].Journal of Chongqing University of Posts and Telecommunications(Natural Science Edition),2023,35(1):79-89.
[27]LV X L,DU L,ZHOU P,et al.Unsupervised Feature Selection Algorithm Framework Based on Neighborhood Interval Distur-bance Fusion [J].Journal of Nanjing University of Science and Technology,2021,45(4):420-428.
[28]HAN K,WANG Y,ZHANG C,et al.Autoencoder Inspired Unsupervised Feature Selection[C]//2018 IEEE International Conference on Acoustics,Speech and Signal Processing(IC-ASSP).IEEE,2018:2941-2945.
[29]BEIRANVAND F,MEHRDAD V,DOWLATSHAHI M B.Unsupervised Feature Selection for Image Classification:A Bipartite Matching-Based Principal Component Analysis Approach[J].Knowledge-Based Systems,2022,250:109085.
[30]ZHAO H,LI Q,WANG Z,et al.Joint Adaptive Graph Learning and Discriminative Analysis for Unsupervised Feature Selection[J].Cognitive Computation,2022,14(3):1211-1221.
[31]MIAO J,YANG T,SUN L,et al.Graph Regularized LocallyLinear Embedding for Unsupervised Feature Selection[J].Pattern Recognition,2022,122:108299.
[32]MACQUEEN J.Classification and Analysis of Multivariate Observations[C]//5th Berkeley Symposium on Mathematical,Statistics and Probability.1967:281-297.
[33]LI X,ZHANG H,WANG R,et al.Multiview Clustering:AScalable and Parameter-Free Bipartite Graph Fusion Method[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2020,44(1):330-344.
[1] HUANG Yuhang, SONG You, WANG Baohui. Improved Forest Optimization Feature Selection Algorithm for Credit Evaluation [J]. Computer Science, 2023, 50(6A): 220600241-6.
[2] LEI Ying, LIU Feng. LDM-EEG:A Lightweight EEG Emotion Recognition Method Based on Dual-stream Structure Scaling and Multiple Attention Mechanisms [J]. Computer Science, 2023, 50(6A): 220300262-9.
[3] LIANG Meiyan, ZHANG Yu, LIANG Jianan, CHEN Qinghui, WANG Ru, WANG Lin. Pathological Image Super-resolution Reconstruction Based on Sparse Coding Non-local AttentionDual Network [J]. Computer Science, 2023, 50(6A): 220700016-8.
[4] YU Jiuyang, ZHANG Dean, DAI Yaonan, HU Tianhao, XIA Wenfeng. Image Super-resolution Reconstruction Based on Structured Fusion Attention Network [J]. Computer Science, 2023, 50(6A): 220600240-5.
[5] LI Fan, JIA Dongli, YAO Yumin, TU Jun. Graph Neural Network Few Shot Image Classification Network Based on Residual and Self-attention Mechanism [J]. Computer Science, 2023, 50(6A): 220500104-5.
[6] ZHANG Changfan, MA Yuanyuan, LIU Jianhua, HE Jing. Dual Gating-Residual Feature Fusion for Image-Text Cross-modal Retrieval [J]. Computer Science, 2023, 50(6A): 220700030-7.
[7] LIU Jin, MI Jusheng, LI Zhongling, LI Meizheng. Dual Three-way Concept Lattice Based on Composition of Concepts and Its Concept Reduction [J]. Computer Science, 2023, 50(6): 122-130.
[8] YIN Xingzi, PENG Ningning, ZHAN Xueyan. Filtered Feature Selection Algorithm Based on Persistent Homology [J]. Computer Science, 2023, 50(6): 159-166.
[9] WANG Xianwang, ZHOU Hao, ZHANG Minghui, ZHU Youwei. Hyperspectral Image Classification Based on Swin Transformer and 3D Residual Multilayer Fusion Network [J]. Computer Science, 2023, 50(5): 155-160.
[10] JIA Tianhao, PENG Li. SSD Object Detection Algorithm with Residual Learning and Cyclic Attention [J]. Computer Science, 2023, 50(5): 170-176.
[11] SUN Lin, LI Mengmeng, XU Jiucheng. Binary Harris Hawk Optimization and Its Feature Selection Algorithm [J]. Computer Science, 2023, 50(5): 277-291.
[12] YIN Haitao, WANG Tianyou. Image Denoising Algorithm Based on Deep Multi-scale Convolution Sparse Coding [J]. Computer Science, 2023, 50(4): 133-140.
[13] CHEN Jinjie, HE Chao, XIAO Xiao, LEI Yinjie. Optical Performance Monitoring Method Based on Fine-grained Constellation Diagram Recognition [J]. Computer Science, 2023, 50(4): 220-225.
[14] HE Chao, CHEN Jinjie, JIN Zhao, LEI Yinjie. Automatic Modulation Recognition Method Based on Multimodal Time-Frequency Feature Fusion [J]. Computer Science, 2023, 50(4): 226-232.
[15] CHEN Yijun, GAO Haoran, DING Zhijun. Credit Evaluation Model Based on Dynamic Machine Learning [J]. Computer Science, 2023, 50(1): 59-68.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!