计算机科学 ›› 2020, Vol. 47 ›› Issue (7): 66-70.doi: 10.11896/jsjkx.190600155
高方远1, 王秀美2
GAO Fang-yuan1, WANG Xiu-mei2
摘要: 聚类分析是机器学习与数据挖掘中的重要工具,而子空间聚类是高维数据分析中常用的聚类方法。基于谱图的子空间聚类方法首先学习数据在子空间中的自表示系数矩阵,然后基于此进行谱聚类分析。通过研究子空间聚类的过程和模型设计,发现基于子空间的聚类方法存在难以保持数据非线性和局部几何结构的问题。为此,文中提出了一种可以提取非线性结构的子空间聚类方法。首先,使用非线性映射函数将原始数据空间映射到高维的线性空间,利用块对角表示保持子空间的独立性。此外,为了对聚类过程中数据的局部结构进行约束,文中引入了基于拉普拉斯矩阵的流形正则项。然后,采用3种计算拉普拉斯矩阵的方法设计不同的基于流形正则和块对角约束的非线性子空间聚类模型。最后,在不同数据集上的实验结果验证了所提算法的有效性。
中图分类号:
[1]MAC Q J.Some methods for classification and analysis of multivariate observations[C]//Proceedings of Berkeley Symposium on Mathematical Statistics and Probability.1967:281-297. [2]NG A Y,JORDAN M I,WEISS Y.On spectral clustering:analysis and an algorithm[C]//Proceedings of the International Conference on Neural Information Processing Systems.Cambridge:MIT Press,2001:849-856. [3]VIDAL R.Subspace clustering[J].IEEE Signal ProcessingMagazine,2011,28(2):52-68. [4]FISCHLER M,BOLLES R R.Random sample consensus:a para-digm for model fitting with applications to image analysis and automated cartography[J].Journal of ACM,1981,24(6):381-395. [5]MA Y,DERKSEN H,HONG W,et al.Segmentation of multivariate mixed data via lossy coding and compression[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2007,29(9):1546-1562. [6]LU L,VIDAL R.Combined central and subspace clustering for computer vision applications[C]//Proceedings of International Conference on Machine Learning.New York:ACM Press,2006:593-600. [7]VIDAL R,MA Y,SASTRY S.Generalized principal component analysis[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(12):1-15. [8]ELHAMIFAR E,VIDAL R.Sparse subspace clustering[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.New York:IEEE Press,2009:2790-2797. [9]LIU G,LIN Z,YAN S,et al.Robust recovery of subspace structures by low-rank representation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(1):171-184. [10]GAO X,ZHANG K,TAO D,et al.Image super-resolution with sparse neighbor embedding[J].IEEE Transactions on Image Processing,2012,21(7):3194-3205. [11]LU C,FENG J,LIN Z,et al.Subspace clustering by block diagonal representation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,41(2):487-501. [12]SHI J,MALIK J.Normalized cuts and image segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2000,22(8):888-905. [13]ZELNIK-MANOR L,PERONA P.Self-tuning spectral cluste-ring[C]//Proceedings of the International Conference on Neural Information Processing Systems.Cambridge:MIT Press,2004:1601-1608. [14]LIU X Y,LI J W,YU H,et al.Adaptive spectral clustering based on shared nearest neighbors[J].Journal of Chinese Computer Systems,2011,32:1876-1880. [15]TOLIC'D,ANTULOV-FANTULIN N,KOPRIVA I.A nonli-near orthogonal non-negative matrix factorization approach to subspace clustering[J].Pattern Recognition,2018,82:40-55. [16]SORENSEN D C,ANTOULAS A C.The sylvester equation and approximate balanced reduction[J].Linear Algebra and its Applications,2002,351/352:671-700. [17]https://archive.ics.uci.edu/ml/datasets.html. [18]LEE K C,HO J,KRIEGMAN D J.Acquiring linear subspaces for face recognition under variable lighting[J].IEEE Transactions on Pattern Recognition and Machine Intelligence,2005,27(5):684-698. [19]LECUN Y,BOTTOU L,BENGIO Y,et al.Gradient-basedlearning applied to document recognition[J].Proceeding of IEEE,1998,86(11):2278-2324. [20]WANG F,ZHAO B,ZHANG C.Linear time maximum margin clustering[J].IEEE Transactions on Neural Networks,2010,21(2):319-332. |
[1] | 杨蕾, 降爱莲, 强彦. 基于自编码器和流形正则的结构保持无监督特征选择 Structure Preserving Unsupervised Feature Selection Based on Autoencoder and Manifold Regularization 计算机科学, 2021, 48(8): 53-59. https://doi.org/10.11896/jsjkx.200700211 |
[2] | 王中元, 刘惊雷. 基于二阶近邻的核子空间聚类 Kernel Subspace Clustering Based on Second-order Neighbors 计算机科学, 2021, 48(6): 86-95. https://doi.org/10.11896/jsjkx.200800180 |
[3] | 石琳姗, 马创, 杨云, 靳敏. 基于SSC-BP神经网络的异常检测算法 Anomaly Detection Algorithm Based on SSC-BP Neural Network 计算机科学, 2021, 48(12): 357-363. https://doi.org/10.11896/jsjkx.201000086 |
[4] | 邢毓华, 李明星. 基于投影的鲁棒低秩子空间聚类算法 Robust Low Rank Subspace Clustering Algorithm Based on Projection 计算机科学, 2020, 47(6): 92-97. https://doi.org/10.11896/jsjkx.190500074 |
[5] | 刘淑君, 魏莱. 基于分块集成的图像聚类算法 Block Integration Based Image Clustering Algorithm 计算机科学, 2020, 47(6): 170-175. https://doi.org/10.11896/jsjkx.190400052 |
[6] | 黄梦婷, 张灵, 姜文超. 基于流形正则化的多类型关系数据联合聚类方法 Multi-type Relational Data Co-clustering Approach Based on Manifold Regularization 计算机科学, 2019, 46(6): 64-68. https://doi.org/10.11896/j.issn.1002-137X.2019.06.008 |
[7] | 张恒巍,何嘉婧,韩继红,王晋东. 基于智能优化算法的模糊软子空间聚类方法 Fuzzy Soft Subspace Clustering Method Based on Intelligent Optimization Algorithm 计算机科学, 2016, 43(3): 256-261. https://doi.org/10.11896/j.issn.1002-137X.2016.03.047 |
[8] | 陈丽萍,郭躬德. 一种基于顺序特性的子空间聚类方法 Subspace Clustering Based on Sequential Feature 计算机科学, 2016, 43(3): 72-74. https://doi.org/10.11896/j.issn.1002-137X.2016.03.014 |
[9] | 许学研,王苏南,吴春明. 基于子空间聚类算法的流量分类方法研究 Network Traffic Classification Method Research Based on Subspace Clustering Algorithm 计算机科学, 2014, 41(Z11): 301-306. |
[10] | 姜伟,陈耀,杨炳儒. 基于流形正则化的非光滑非负矩阵分解 Manifold Regularized-based Nonsmooth Nonnegative Matrix Factorization 计算机科学, 2014, 41(3): 272-275. |
[11] | 王丽娟,郝志峰,蔡瑞初,温雯. 基于多扰动的局部自适应软子空间聚类融合算法 Multiple Local Adaptive Soft Subspace Clustering Ensemble Based on Multimodal Perturbation 计算机科学, 2014, 41(2): 240-244. |
[12] | 徐海瑞,张文生,吴双. 基于流形正则化的文档分类算法研究 Document Classification Algorithm Based on Manifold Regularization 计算机科学, 2012, 39(3): 196-199. |
[13] | 孙玉芬 卢炎生. 一种基于网格方法的高维数据流子空间聚类算法 计算机科学, 2007, 34(4): 199-203. |
[14] | . 高维Turnstile型数据流聚类算法 计算机科学, 2006, 33(11): 14-17. |
|