计算机科学 ›› 2021, Vol. 48 ›› Issue (6A): 220-225.doi: 10.11896/jsjkx.210100167
李鹏, 刘力军, 黄永东
LI Peng, LIU Li-jun, HUANG Yong-dong
摘要: 经典的谱聚类算法包含两个步骤。(1)谱嵌入过程:求解Laplacian矩阵的特征值分解,得到分类指示矩阵的连续松弛解。(2)后处理过程:对谱嵌入连续松弛矩阵应用k-means或者谱旋转,得到最终的二值指示矩阵。由于有用信息的丢失,这种单独求解步骤不能保证最佳聚类结果。同时,谱聚类算法在处理大规模数据集时,存在聚类精度低、数据相似度矩阵存储开销大和 Laplacian 矩阵特征值分解计算复杂度高的问题。已有的联合谱聚类算法使用标准正交矩阵逼近非标准正交簇指示矩阵,这会导致较大的逼近误差。为了克服这一缺点,提出用一个改进的标准正交簇指示矩阵代替非正交指示矩阵,得到一个新的联合谱嵌入和谱旋转的谱聚类算法。因为两个标准正交矩阵更容易最小化,所以提出的算法可以取得更好的性能。进一步通过地标点方法对原始数据集进行稀疏特征表示,提出一种基于地标表示的联合谱嵌入和谱旋转算法(LJSESR),解决了大规模数据谱聚类的高效求解问题。实验结果表明,提出的LJSESR 算法具有可行性和有效性。
中图分类号:
[1] JAIN A K,MURTY M N,FLYNN P J.Data clustering:a review[J].ACM Computing Surveys,1999,31(3):264-323. [2] MACQUEEN J.Some methods for classification and analysis of multivariate observations[J].Berkeley Symposium on Mathematical Statistics and Probability,1967,1(5):281-297. [3] FU G Y.Optimization methods for fuzzy clustering[J].Fuzzy Sets and Systems,1998,93(3):301-309. [4] KAUFMAN L,ROUSSEEUW P J.Finding Groups in Data:An Introduction to Cluster Analysis[J].John Wiley & Sons,New York,USA,1990. [5] LUXBURG U V.A tutorial on spectral clustering[J].Statistics and Computing,2007,17(4):395-416. [6] JIN J G.Review of Clustering Method[J].Computer Science,2014,41(S2):288-293. [7] LI P Q,LI Y D,DENG X L,et al.Spectral Clustering Algorithm Based on SimRank Score[J].Computer Science,2018,45(S2):458-461. [8] LI X Y,GUO L J.Constructing affinity matrix in spectral clustering based on neighbor propagation[J].Neurocomputing,2012,97:125-130. [9] TAO X M,WANG R T,CHANG R,et al.Low Density Separation Density Sensitive Distance-based Spectral Clustering Algorithm[J].Acta Automatica Sinica,2020,46(7):1479-1495. [10] CHEN J F,ZHANG M,HE Q.Heuristically Determining Cluster Numbers Based NJW Spectral Clustering Algorithm[J].Computer Science,2018,45(S2):474-479. [11] ZHANG R G,YAO X L,ZHAO J,et al.Manifold Spectral Clustering Image Segmentation Algorithm Based on Local Geometry Features[J].Pattern Recognition and Artificial Intelligence,2020,33(4):313-324. [12] CHARLESS F,SERGE B,FAN C,et al.Spectral grouping using the Nyström method[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2004,26(2):214-225. [13] QIU Y F,LIU C.Spectral Clustering Algorithm Based onWeighted Ensemble Nyström Sampling[J].Pattern Recognition and Artificial Intelligence,2019,32(5):420-428. [14] CHEN X L,CAI D.Large scale spectral clustering with landmark-based representation[C]//AAAI Conference on Artificial Intelligence.San Francisco,USA,2011:313-318. [15] CAI D,CHEN X L.Large scale spectral clustering via landmark-based sparse representation[J].IEEE Transactions on Cybernetics,2015,45(8):1669-1680. [16] YE M,LIU W F.Large Scale Spectral Clustering Based on Fast Landmark Sampling[J].Journal of Electronics & Information Technology,2017,39(2):278-284. [17] ZHANG X C,ZONG L L,YOU Q Z,et al.Sampling forNyström extension based spectral clustering:incremental perspective and novel analysis[J].ACM Transactions on Know-ledge Discovery from Data,2016,11(1):1-25. [18] CHEN X J,NIE F P,HUANG Z X,et al.Scalable normalized cut with improved spectral rotation[C]//Twenty-Sixth International Joint Conference on Artificial Intelligence.2017:1518-1524. [19] CHU D R,ZHOU Z P.An autoencoder spectral clustering algorithm for improving landmark representation by weighted PageRank[J].CAAI Transactions on Intelligent Systems,2020,15(2):302-309. [20] ZHANG M,ZHOU Z P.An autoencoder-based spectral clustering algorithm combined with metric fusion and landmark representation[J].CAAI Transactions on Intelligent Systems,2020,15(4):687-696. [21] YU S X,SHI J B.Multiclass spectral clustering[C]//Proceedings of the 9th IEEE International Conference on Computer Vision(ICCV'03).USA,2003:313-319. [22] HUANG J,NIE F P,HUANG H.Spectral rotation versus k-means in spectral clustering[C]//Proceedings of the AAAI Confe rence on Artificial Intelligence(AAAI'13).AAAI Press,2013:431-437. [23] YANG Y,SHEN F M,HUANG Z,et al.A unified framework for discretespectral clustering[C]//International Joint Confe-rence on Artificial Intelligence.2016:2273-2279. [24] PANG Y W,XIE J,NIE F P,et al.Spectral clustering by joint spectral embedding and spectral rotation[J].IEEE Transactions on Cybernetics,2020,50(1):247-258. [25] ZHU J T,JANG J L,LIU T,et al.Joint spectral clustering based on optimal graph and feature selection[J].Neural Processing Letters,2020(11):1-17, [26] ZHOU L,PING X J,XU S,et al.Cluster Ensemble Based on Spectral Clustering[J].Acta Automatica Sinica,2012,38(8):1335-1342. [27] STOER M,WAGNER F.A simple min-cut algorithm[J].Journal of the ACM,1997,44(4):585-591. [28] SHI J B,JITENDRA.Normalized cuts and image segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2000,11(9):888-905. [29] NG A Y,JORDAN M I,WEISS Y.On spectral clustering:Analysis and an algorithm[C]//Proceedings of the 14th International Conference on Neural Information Processing Systems:Natural and Synthetic.Cambridge,MA,USA,2002:849-856. [30] LIU W,HE J F,CHANG S F.Large graph construction forscalable semi-supervised learning[C]//Proceeding,Twenty-Se-venth International Conference on Machine Learning.Haifa,Israel,2010:679-686. [31] JOURNÉE M,NESTEROV Y,RICHTÁRIK P,et al.Genera-lized power method for sparse principal component analysis[J].Journal of Machine Learning Research,2010,11(2):517-553. [32] DHEERU D,TANISKIDOU E K.UCI Machine Learning Repository[EB/OL].http://archive.ics.uci.edu/ml. |
[1] | 郭奕杉, 刘漫丹. 基于时空轨迹数据的异常检测 Anomaly Detection Based on Spatial-temporal Trajectory Data 计算机科学, 2021, 48(6A): 213-219. https://doi.org/10.11896/jsjkx.201100193 |
[2] | 张晓琴, 安晓丹, 曹付元. 基于谱聚类的二分网络社区发现算法 Detecting Community from Bipartite Network Based on Spectral Clustering 计算机科学, 2019, 46(4): 216-221. https://doi.org/10.11896/j.issn.1002-137X.2019.04.034 |
[3] | 李长镜,赵书良,池云仙. 一种基于谱嵌入和局部密度的离群点检测算法 Outlier Detection Algorithm Based on Spectral Embedding and Local Density 计算机科学, 2019, 46(3): 260-266. https://doi.org/10.11896/j.issn.1002-137X.2019.03.039 |
[4] | 刘树栋, 魏嘉敏. 基于谱聚类和成对数据表示的多层感知机分类算法 Multilayer Perceptron Classification Algorithm Based on Spectral Clusteringand Simultaneous Two Sample Representation 计算机科学, 2019, 46(11A): 194-198. |
[5] | 王颖,杨余旺. 基于堆和邻域共存信息的KNN相似图算法 KNN Similarity Graph Algorithm Based on Heap and Neighborhood Coexistence 计算机科学, 2018, 45(5): 196-200. https://doi.org/10.11896/j.issn.1002-137X.2018.05.033 |
[6] | 王亮,田萱. 单幅散焦图像的局部特征模糊分割算法 Local Feature Fuzzy Segmentation Algorithm for Single Defocused Image 计算机科学, 2018, 45(2): 318-321. https://doi.org/10.11896/j.issn.1002-137X.2018.02.055 |
[7] | 常家伟, 戴牡红. 基于PageRank和谱方法的个性化推荐算法 Personalized Recommendation Algorithm Based on PageRank and Spectral Method 计算机科学, 2018, 45(11A): 398-401. |
[8] | 李鹏清, 李扬定, 邓雪莲, 李永钢, 方月. 一种基于SimRank得分的谱聚类算法 Spectral Clustering Algorithm Based on SimRank Score 计算机科学, 2018, 45(11A): 458-461. |
[9] | 陈俊芬, 张明, 何强. 基于启发式确定类数的NJW谱聚类算法 Heuristically Determining Cluster Numbers Based NJW Spectral Clustering Algorithm 计算机科学, 2018, 45(11A): 474-479. |
[10] | 王平心,刘强,杨习贝,米据生. 基于动态邻域的三支聚类分析 Three-way Clustering Analysis Based on Dynamic Neighborhood 计算机科学, 2018, 45(1): 62-66. https://doi.org/10.11896/j.issn.1002-137X.2018.01.009 |
[11] | 李金泽,徐喜荣,潘子琦,李晓杰. 改进的自适应谱聚类NJW算法 Improved Adaptive Spectral Clustering NJW Algorithm 计算机科学, 2017, 44(Z6): 424-427. https://doi.org/10.11896/j.issn.1002-137X.2017.6A.095 |
[12] | 李效伦,丁志军. LBSNs中的群体行程推荐方法 Group Travel Trip Recommendation Method in LBSNs 计算机科学, 2017, 44(6): 199-205. https://doi.org/10.11896/j.issn.1002-137X.2017.06.033 |
[13] | 柯祖福,易本顺,谢秋莹. 基于非局部自相似性的谱聚类图像去噪算法 Image Denoising Method of Spectrum Clustering Based on Non-local Similarity 计算机科学, 2017, 44(5): 299-303. https://doi.org/10.11896/j.issn.1002-137X.2017.05.055 |
[14] | 覃晓,梁伟,元昌安,唐涛. 基于遗传优化谱聚类的图形分割方法 Image Segmentation Algorithm of Spectral Clustering Optimized by Genetic 计算机科学, 2017, 44(1): 100-102. https://doi.org/10.11896/j.issn.1002-137X.2017.01.019 |
[15] | 范菁,阮体洪,吴佳敏,董天阳. 基于二次谱聚类和HMM-RF混合模型的车辆行为识别方法研究 Vehicle Behavior Recognition Method Based on Quadratic Spectral Clustering and HMM-RF Hybrid Model 计算机科学, 2016, 43(5): 288-293. https://doi.org/10.11896/j.issn.1002-137X.2016.05.055 |
|