计算机科学 ›› 2025, Vol. 52 ›› Issue (10): 79-89.doi: 10.11896/jsjkx.240800102
原泽菲, 张正军, 姜国林
YUAN Zefei, ZHANG Zhengjun, JIANG Guolin
摘要: 针对以高斯核函数为相似性度量的传统谱聚类算法需人为设置尺度参数,相似度与样本分布结构无关的问题,定义了在自然k近邻基础上的共享邻居,结合数据点的近邻信息构造了能反映区域密度的多尺度参数,以新的尺度参数重新定义了相似性度量,提出了一种基于相对邻近度的自适应谱聚类算法(Adaptive Spectral Clustering based on Relative Proximity,RPASC)。改进的尺度参数结合了间隔尺度、顺序尺度及比例尺度等特性,体现了数据点之间的相对位置关系,反映了不同密度簇的分布特征和空间结构,提高了算法对不同分布数据集的适应性。新的相似性度量通过灵活调整局部尺度参数的大小,自适应地缩小不同密度簇边界上数据点的相似度,使聚类的簇边界更明确,有利于发现真实的簇形态。通过在人工合成数据集和UCI真实数据集上进行的实验,验证了RPASC算法在多个聚类性能指标上的有效性。
中图分类号:
[1]ZHANG Y L,ZHOU Y J.Review of clustering algorithms [J].Journal of Computer Applications,2019,39(7):1869-1882. [2]SHI J,MALIK J.Normalized Cuts and Image Segmentation[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2000,22(8):888-905. [3]LI P,LIU L J,HUANG Y D.Landmark-based Spectral Clustering by Joint Spectral Embedding and Spectral Rotation[J].Computer Science,2021,48(S1):220-225. [4]XU X,ZHANG H,YANG C M,et al.Fair Method for Spectral Clustering to Improve Intra-cluster Fairness[J].Computer Science,2023,50(2):158-165. [5]LI L L.A survey of spectral clustering algorithms and their applications[J].Software Guide,2016,15(7):54-56. [6]LIAO L C,JIANG X H,ZOU F M,et al.A Spectral Clustering Method for Big Trajectory Data Mining with Latent Semantic Correlation[J].Acta Automatica Sinica,2015,43(5):956-964. [7]HE L,LI Y,ZHANG X,et al.Incremental spectral clusteringvia fastfood features and its application to stream image segmentation[J].Symmetry,2018,10(7):272. [8]ALSHAMMARI M,TAKATSUKA M.Approximate spectralclustering with eigenvector selection and self-tuned k[J].Pattern Recognition Letters,2019,122:31-37. [9]XIA K J,GU X Q,ZHANG Y D.Oriented groupingconstrained spectral clustering for medical imaging segmentation[J].Multimedia Systems,2020,26(1):27-36. [10]XU D H,LI C,CHEN T,et al.A novel low rank spectral clustering method for face identification[J].Recent Patents on Engineering,2019,13(4):387-394. [11]WANG L,BO L F,JIAO L C.Density-Sensitive Semi-Super-vised Spectral Clustering[J].Journal of Software,2007,18(10):2412-2422. [12]TAO X M,WANG R T,CHANG R,et al.Low Density Separation Density Sensitive Distance-based Spectral Clustering Algorithm[J].Acta Automatica Sinica,2020,46(7):1479-1495. [13]MANOR L Z,PERONA P.Self-Tuning Spectral Clustering[C]//Proceeding of NIPS.2005:1601-1608. [14]KONG W Z,SUN C S H,ZHANG J H,et al.Spectral clustering based onneighboringadaptive local scale[J].Journal of Image and Graphics,2012,17(4):523-529. [15]ZHANG X,LI J,YU H.Local density adaptive similarity measurement for spectral clustering[J].Pattern Recognition Letters,2011,32(2):352-358. [16]ZHAO X Q,LIU X L.Improved adaptive spectral clustering algorithm based on density sensitivity[J].Journal of Lanzhou University of Technology,2018,44(6):102-106. [17]ZHANG T,GE H W.Spectral Clustering Based on Density Coefficient and Shared Nearest Neighbors[J].Journal of Chinese Computer Systems,2017,38(8):1829-1833. [18]ZHAO Y L,CHE W G,JIN R Z.A self-adaptive spectral clustering algorithm based on an improved coefficient of variation between samples[J].Journal of Lanzhou University(Natural Sciences),2022,58(6):812-818. [19]GE J W,YANG G X.Spectral Clustering Algorithm for Density Adaptive Neighborhood Based on Shared Nearest Neighbors[J].Computer Engineering,2021,47(8):116-123. [20]BAI L,ZHAO X,KONG Y T,et al.Survey of Spectral Cluste-ring Algorithms[J].Computer Engineering and Applications,2021,57(14):15-26. [21]REBAGLIATI N,VERRI A.Spectral clustering with more than K eigenvectors[J].Neurocomputing,2011,74(9):1391-1401. [22]STEVENS S S.Mathematics,measurement,and psychophysics[M]//Handbook of Experimental Psychology.London:Wiley,1951:1-49. [23]ZHANG L P,ZHAO J Q,LI S,et al.Research on Methods of Construction of Voronoi Diagram and Nearest Neighbor Query in Constrained Regions[J].Computer Science,2014,41(9):220-224. [24]VINH N X,EPPS J,BAILEY J.Information theoretic measures for clusterings comparison:Variants,properties,normalization and correction for chance[J].Journal of Machine Learning Research,2010,11:2837-2854. [25]FOWLKES E B,MALLOWS C L.A method for comparing two hierarchical clusterings[J].Journal of the American Statistical Association,1983,78(383):553-569. [26]YIN S Z,WANG T,CHEN Q C,et al.A Class of Classification Algorithm for Binary Protocol Based on Adjusting Mutual Information[J].Ordnance Industry Automation,2020,39(6):37-41. [27]WEI Y,ZHANG Z J,HE K L,et al.Density Peak Clustering Algorithm Based on Relative Density[J].Computer Enginee-ring,2023,49(6):53-61. [28]FISHER R A.The UCI Machine Learning Repository[EB/OL].[1988-06-30].https://archive.ics.uci.edu/dataset/53/iris. [29]ALCOCK R.The UCI Machine Learning Repository[EB/OL].[1999-06-07].https://archive.ics.uci.edu/dataset/139/synthetic+control+chart+time+series. [30]ROSS Q.The UCI Machine Learning Repository[EB/OL].[1986-12-31].https://archive.ics.uci.edu/dataset/102/thyroid+disease. [31]STREET W, WOLBERG W,MANGASARIAN O.The UCIMachine Learning Repository[EB/OL].[1995-10-31].https://archive.ics.uci.edu/dataset/17/breast+cancer+wisconsin+diagnostic. [32]ALPAYDIN E,KAYNAK C.The UCI Machine Learning Repository[EB/OL].[1998-06-30].https://archive.ics.uci.edu/dataset/80/optical+recognition+of+handwritten+digits. [33]CHAPMAN D,JAIN A.The UCI Machine Learning Repository[EB/OL].[1994-09-11].https://archive.ics.uci.edu/dataset/75/musk+version+2. [34]MARKELLE K,RACHEL L,KOLBY N.The UCI MachineLearning Repository[EB/OL].[2019-10-05].https://archive.ics.uci.edu/dataset/545/rice+cammeo+and+osmancik. [35]ALPAYDIN E,ALIMOGLU F.The UCI Machine Learning Repository[EB/OL].[1998-06-30].https://archive.ics.uci.edu/dataset/81/pen+based+recognition+of+handwritten+digits. |
|