计算机科学 ›› 2023, Vol. 50 ›› Issue (7): 72-81.doi: 10.11896/jsjkx.221000143
梁云辉1,2, 甘舰文1,2, 陈艳3, 周芃4, 杜亮1,2
LIANG Yunhui1,2, GAN Jianwen1,2, CHEN Yan3, ZHOU Peng4, DU Liang1,2
摘要: 在许多数据分析任务中,经常会遇到高维数据。特征选择技术旨在从原始高维数据中找到最具代表性的特征,但由于缺乏类标签信息,相比有监督场景,在无监督学习场景中选择合适的特征困难得多。传统的无监督特征选择方法通常依据某些准则对样本的特征进行评分,在这个过程中样本是被无差别看待的。然而这样做并不能完全捕捉数据的内在结构,不同样本的重要性应该是有差异的,并且样本权重与特征权重之间存在一种对偶关系,它们会互相影响。为此,提出了一种基于对偶流形重排序的无监督特征选择算法(Unsupervised Feature Selection Algorithm based on Dual Manifold Re-Ranking,DMRR),分别构建不同的相似性矩阵来刻画样本与样本、特征与特征、样本与特征的流形结构,并结合样本与特征的初始得分进行流形上的重排序。将DMRR与3种原始无监督特征选择算法以及2种无监督特征选择后处理算法进行比较,实验结果表明样本重要性信息、样本与特征之间的对偶关系有助于实现更优的特征选择。
中图分类号:
[1]WU X,XU X,LIU J,et al.Supervised Feature Selection withOrthogonal Regression and Feature Weighting[J].IEEE Tran-sactions on Neural Networks and Learning Systems,2020,32(5):1831-1838. [2]ZHANG H,GONG M,NIE F,et al.Unified Dual-Label Semi-Supervised Learning with Top-k Feature Selection[J].Neurocomputing,2022,501:875-888. [3]CHANDRASHEKAR G,SAHIN F.A Survey on Feature Selection Methods[J].Computers & Electrical Engineering,2014,40(1):16-28. [4]ZHU P,HOU X,WANG Z,et al.Compactness Score:A Fast Filter Method for Unsupervised Feature Selection[J].arXiv:2201.13194,2022. [5]KOHAVI R,JOHN G H.Wrappers for Feature Subset Selec-tion[J].Artificial Intelligence,1997,97(1/2):273-324. [6]ZHAO H,DU L,WEI J,et al.Local Sensitive Dual Concept Factorization for Unsupervised Feature Selection[J].IEEE Access,2020,8:133128-133143. [7]ZHANG Y,ZHANG Z,LI S,et al.Unsupervised NonnegativeAdaptive Feature Extraction for Data Representation[J].IEEE Transactions on Knowledge and Data Engineering,2018,31(12):2423-2440. [8]LI J,CHENG K,WANG S,et al.Feature Selection:A Data Pers-pective[J].ACM Computing Surveys(CSUR),2017,50(6):1-45. [9]HE X,CAI D,NIYOGI P.Laplacian Score for Feature Selection[C]//Advances in Neural Information Processing Systems.2005:507-514. [10]YAO C,LIU Y F,JIANG B,et al.LLE Score:A New Filter-Based Unsupervised Feature Selection Method Based on Nonli-near Manifold Embedding and Its Application to Image Recognition[J].IEEE Transactions on Image Processing,2017,26(11):5257-5269. [11]LIM H,KIM D W.Pairwise Dependence-Based UnsupervisedFeature Selection[J].Pattern Recognition,2021,111:107663. [12]CAI D,ZHANG C,HE X.Unsupervised Feature Selection for Multi-Cluster Data[C]//Proceedings of The 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2010:333-342. [13]HE X,JI M,ZHANG C,et al.A Variance Minimization Crite-rion to Feature Selection Using Laplacian Regularization[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2011,33(10):2013-2025. [14]WANG D,NIE F,HUANG H.Feature Selection via Global Redundancy Minimization[J].IEEE Transactions on Knowledge and Data Engineering,2015,27(10):2743-2755. [15]NIE F,YANG S,ZHANG R,et al.A General Framework for Auto-Weighted Feature Selection via Global Redundancy Minimization[J].IEEE Transactions on Image Processing,2018,28(5):2428-2438. [16]KRIEGEL H P,KRÖGER P,SANDER J,et al.Density-Based Clustering[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2011,1(3):231-240. [17]LI C,WANG X,DONG W,et al.Joint Active Learning with Feature Selection via Cur Matrix Decomposition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2018,41(6):1382-1396. [18]XIA R,PAN Z,XU F.Instance Weighting for Domain Adaptation via Trading off Sample Selection Bias and Variance[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence.2018:13-19. [19]GU Q,ZHOU J.Co-Clustering on Manifolds[C]//Proceedings of The 15th ACM SIGKDD International Conference on Know-ledge Discovery and Data Mining.2009:359-368. [20]SONG K,YAO X,NIE F,et al.Weighted Bilateral K-means Algorithm for Fast Co-Clustering and Fast Spectral Clustering[J].Pattern Recognition,2021,109:107560. [21]LUO M,NIE F,CHANG X,et al.Adaptive Unsupervised Feature Selection with Structure Regularization[J].IEEE Transactions on Neural Networks and Learning Systems,2017,29(4):944-956. [22]LI J,CHEN J,QI F,et al.Two-Dimensional Unsupervised Feature Selection via Sparse Feature Filter[J/OL].IEEE Transactions on Cybernetics,2022.https://www.researchgate.net/publication/359887581_Two-Dimensional_Unsupervised_Feature_Selection_via_Sparse_Feature_Filter. [23]LI W,CHEN H,LI T,et al.Unsupervised Feature Selection via Self-Paced Learning and Low-Redundant Regularization[J].Knowledge-Based Systems,2022,240:108150. [24]WAHID A,KHAN D M,HUSSAIN I,et al.Unsupervised Feature Selection with Robust Data Reconstruction(UFS-RDR) and Outlier Detection[J].Expert Systems with Applications,2022,201:117008. [25]XIE J,WANG M,XU S,et al.The Unsupervised Feature Selection Algorithms Based on Standard Deviation and Cosine Similarity for Genomic Data Analysis[J].Frontiers in Genetics,2021,12:684100. [26]BAO F,LIN Y,LI Y,et al.An Online Multi Tag Stream Feature Selection Algorithm Combining Neighborhood Information and Tag Correlation [J].Journal of Chongqing University of Posts and Telecommunications(Natural Science Edition),2023,35(1):79-89. [27]LV X L,DU L,ZHOU P,et al.Unsupervised Feature Selection Algorithm Framework Based on Neighborhood Interval Distur-bance Fusion [J].Journal of Nanjing University of Science and Technology,2021,45(4):420-428. [28]HAN K,WANG Y,ZHANG C,et al.Autoencoder Inspired Unsupervised Feature Selection[C]//2018 IEEE International Conference on Acoustics,Speech and Signal Processing(IC-ASSP).IEEE,2018:2941-2945. [29]BEIRANVAND F,MEHRDAD V,DOWLATSHAHI M B.Unsupervised Feature Selection for Image Classification:A Bipartite Matching-Based Principal Component Analysis Approach[J].Knowledge-Based Systems,2022,250:109085. [30]ZHAO H,LI Q,WANG Z,et al.Joint Adaptive Graph Learning and Discriminative Analysis for Unsupervised Feature Selection[J].Cognitive Computation,2022,14(3):1211-1221. [31]MIAO J,YANG T,SUN L,et al.Graph Regularized LocallyLinear Embedding for Unsupervised Feature Selection[J].Pattern Recognition,2022,122:108299. [32]MACQUEEN J.Classification and Analysis of Multivariate Observations[C]//5th Berkeley Symposium on Mathematical,Statistics and Probability.1967:281-297. [33]LI X,ZHANG H,WANG R,et al.Multiview Clustering:AScalable and Parameter-Free Bipartite Graph Fusion Method[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2020,44(1):330-344. |
|