计算机科学 ›› 2020, Vol. 47 ›› Issue (6): 92-97.doi: 10.11896/jsjkx.190500074
邢毓华, 李明星
XING Yu-hua, LI Ming-xing
摘要: 随着大数据时代的来临,如何对海量高维数据进行有效的聚类分析并充分利用,已成为当下的热门研究课题。传统的聚类算法在处理高维数据时,聚类结果的精确度和稳定性较低,而子空间聚类算法通过分割原始数据的特征空间来得到不同的特征子集,可以大幅减小数据之间不相关特征对聚类结果的影响,挖掘出高维数据中不易展现的信息,在处理高维数据时具有显著的优势。针对现有基于图的子空间聚类算法在处理未知类型噪声以及复杂的凸问题时存在局限性的问题,在子空间聚类算法的基础上,结合空间投影理论,提出了一种基于投影的鲁棒低秩子空间聚类算法。首先对原始数据进行投影,利用编码消除投影空间的噪声,并对缺失的数据进行弥补;然后利用一种新的方法l2图来构造稀疏相似图;最后在l2图的基础上进行子空间聚类。该算法不需要对噪声的类型具有先验知识,且l2图能够很好地描述高维数据稀疏性和空间分散的特征。选取3种人脸数据集作为实验数据集,首先确定影响聚类效果的最优参数,然后从准确度、鲁棒性、时间复杂度3个方面对算法进行验证。实验结果表明,在3种人脸数据集中混入未知类型的噪声时,该算法具有较高的准确率和较低的时间复杂度,并且具有好的鲁棒性。
中图分类号:
[1]LU L.Combined central and subspace clustering for computer vision applications[C]//International Conference on Machine Learning.ACM,2006:593-600. [2]LEE M,CHO J,CHOI C H,et al.Procrustean Normal Distribution for Non-rigid Structure from Motion[C]//Computer Vision and Pattern Recognition.IEEE,2013:1280-1287. [3]BOULT T E,BROWN L G.Factorization-based segmentation of motions[C]//Proceedings of the IEEE Workshop on Visual Motion.IEEE,2002:179-186. [4]LIU D,JIANG M H,YANG X F,et al.Analyzing documents with Quantum Clustering:A novel pattern recognition algorithm based on quantum mechanics[J].Pattern Recognition Letters,2016,77:8-13. [5]LU S,CHANG D.An image segmentation method based on dynamic artificial fish swarm algorithm[C]//2012 IEEE 11th International Conference on Signal Processing.IEEE,2012:980-983. [6]ELHAMIFAR E,VIDAL R.Clustering disjoint subspaces via sparse representation[C]//IEEE International Conference on Acoustics Speech and Signal Processing.IEEE,2010:1926-1929. [7]WU X,CHEN X M,LI X,et al.Adaptive subspace learning:an iterative approach for document clustering[J].Neural Computing and Applications,2014,25(2):333-342. [8]BAI J C,LI J C,DAI P F,et al.General parameterized proximal point algorithm with applications in statistical learning[J].International Journal of Computer Mathematics,2019,96(1):199-215. [9]WU Y,ZHANG Z,HUANG T S,et al.Multibody Grouping via Orthogonal Subspace Decomposition[C]//Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR 2001).IEEE,2001:II-252-II-257 vol.2. [10]YAN J,POLLEFEYS M.A General Framework for Motion Segmentation:Independent,Articulated,Rigid,Non-rigid,Degenerate and Non-degenerate[C]//European Conference on Computer Vision.Springer Berlin Heidelberg,2006:94-106. [11]LU C Y,MIN H,ZHAO Z Q,et al.Robust and Efficient Subspace Segmentation via Least Squares Regression[C]//European Conference on Compater Vision.Berlin:Springer,2014:347-360. [12]LIU G,LIN Z,YAN S,et al.Robust recovery of subspace structures by low-rank representation[J].IEEE Transactions on Pattern Analysis & Machine Intelligence,2012,35(1):171-184. [13]ELHAMIFAR E,VIDAL R.Sparse Subspace Clustering:Algorithm,Theory,and Applications[J].IEEE transactions on pattern analysis and machine intelligence,2013,35(11):2765-2781. [14]CHEN J,ZHANG H,MAO H,et al.Symmetric low-rank representation for subspace clustering[J].Neurocomputing,2014,173(P3):1192-1202. [15]XIA C Q,HAN K,QI Y,et al.A Self-Training Subspace Clustering Algorithm under Low-Rank Representation for Cancer Classification on Gene Expression Data[J].IEEE/ACM Tran-sactions on Computational Biology and Bioinformatics,2018,15(4):1315-1324. [16]LUXBURG,ULRIKE.A tutorial on spectral clustering[J].Statistics & Computing,2007,17(4):395-416. [17]PAPADAKIS N,BUGEAU A.Tracking with occlusions via graph cuts[J].IEEE Transactions on Pattern Analysis & Machine Intelligence,2010,33(1):144-157. [18]HE X,YAN S,HU Y,et al.Face recognition using laplacian faces[C]//IEEE Transactions on Pattern Analysis and Machine Intelligence.2005:328-340. [19]WANG J,YANG J,YU K,et al.Locality-constrained linear coding for image classification[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.IEEE,2010:3360-3367. [20]FAVARO P,VIDAL R,RAVICHANDRAN A.A closed form solution to robust subspace estimation and clustering[C]//Computer Vision and Pattern Recognition.IEEE,2011:1801-1807. |
[1] | 周慧, 施皓晨, 屠要峰, 黄圣君. 基于主动采样的深度鲁棒神经网络学习 Robust Deep Neural Network Learning Based on Active Sampling 计算机科学, 2022, 49(7): 164-169. https://doi.org/10.11896/jsjkx.210600044 |
[2] | 徐鸣珂, 张帆. Head Fusion:一种提高语音情绪识别的准确性和鲁棒性的方法 Head Fusion:A Method to Improve Accuracy and Robustness of Speech Emotion Recognition 计算机科学, 2022, 49(7): 132-141. https://doi.org/10.11896/jsjkx.210100085 |
[3] | 唐超尘, 仇洪冰, 刘鑫, 唐清华. 非均匀白噪声条件下的相干MIMO雷达角度估计 Angle Estimation of Coherent MIMO Radar Under the Condition of Non-uniform Noise 计算机科学, 2022, 49(5): 262-265. https://doi.org/10.11896/jsjkx.210300162 |
[4] | 赵亮, 张洁, 陈志奎. 基于双图正则化的自适应多模态鲁棒特征学习 Adaptive Multimodal Robust Feature Learning Based on Dual Graph-regularization 计算机科学, 2022, 49(4): 124-133. https://doi.org/10.11896/jsjkx.210300078 |
[5] | 刘意, 毛莺池, 程杨堃, 高建, 王龙宝. 基于邻域一致性的异常检测序列集成方法 Locality and Consistency Based Sequential Ensemble Method for Outlier Detection 计算机科学, 2022, 49(1): 146-152. https://doi.org/10.11896/jsjkx.201000156 |
[6] | 郑建炜, 黄娟娟, 秦梦洁, 徐宏辉, 刘志. 基于非局部相似及加权截断核范数的高光谱图像去噪 Hyperspectral Image Denoising Based on Non-local Similarity and Weighted-truncated NuclearNorm 计算机科学, 2021, 48(9): 160-167. https://doi.org/10.11896/jsjkx.200600135 |
[7] | 陶星朋, 徐宏辉, 郑建炜, 陈婉君. 基于非凸低秩矩阵逼近和全变分正则化的高光谱图像去噪 Hyperspectral Image Denoising Based on Nonconvex Low Rank Matrix Approximation and TotalVariation Regularization 计算机科学, 2021, 48(8): 125-133. https://doi.org/10.11896/jsjkx.200400143 |
[8] | 赵敏, 刘惊雷. 基于高斯场和自适应图正则的半监督聚类 Semi-supervised Clustering Based on Gaussian Fields and Adaptive Graph Regularization 计算机科学, 2021, 48(7): 137-144. https://doi.org/10.11896/jsjkx.200800190 |
[9] | 周钢, 郭福亮. 基于特征选择的高维数据集成学习方法研究 Research on Ensemble Learning Method Based on Feature Selection for High-dimensional Data 计算机科学, 2021, 48(6A): 250-254. https://doi.org/10.11896/jsjkx.200700102 |
[10] | 黄雪冰, 魏佳艺, 沈文宇, 凌力. 基于自适应加权重复值滤波和同态滤波的MR图像增强 MR Image Enhancement Based on Adaptive Weighted Duplicate Filtering and Homomorphic Filtering 计算机科学, 2021, 48(6A): 21-27. https://doi.org/10.11896/jsjkx.200800183 |
[11] | 王中元, 刘惊雷. 基于二阶近邻的核子空间聚类 Kernel Subspace Clustering Based on Second-order Neighbors 计算机科学, 2021, 48(6): 86-95. https://doi.org/10.11896/jsjkx.200800180 |
[12] | 林云, 黄桢航, 高凡. 扩散式变阶数最大相关熵准则算法 Diffusion Variable Tap-length Maximum Correntropy Criterion Algorithm 计算机科学, 2021, 48(5): 263-269. https://doi.org/10.11896/jsjkx.200300043 |
[13] | 石琳姗, 马创, 杨云, 靳敏. 基于SSC-BP神经网络的异常检测算法 Anomaly Detection Algorithm Based on SSC-BP Neural Network 计算机科学, 2021, 48(12): 357-363. https://doi.org/10.11896/jsjkx.201000086 |
[14] | 滕俊元, 高猛, 郑小萌, 江云松. 噪声可容忍的软件缺陷预测特征选择方法 Noise Tolerable Feature Selection Method for Software Defect Prediction 计算机科学, 2021, 48(12): 131-139. https://doi.org/10.11896/jsjkx.201000168 |
[15] | 巫勇, 刘永坚, 唐瑭, 王洪林, 郑建成. 基于鲁棒低秩张量恢复的高光谱图像去噪 Hyperspectral Image Denoising Based on Robust Low Rank Tensor Restoration 计算机科学, 2021, 48(11A): 303-307. https://doi.org/10.11896/jsjkx.210200103 |
|