计算机科学 ›› 2025, Vol. 52 ›› Issue (11A): 241200068-11.doi: 10.11896/jsjkx.241200068
刘翘铭1,2,3, 魏千然2, 李智2, 王健4, 李远方2,3
LIU Qiaoming1,2,3, WEI Qianran2, LI Zhi2, WANG Jian4, LI Yuanfang2,3
摘要: 密度峰值聚类(Density Peak Clustering,DPC)是一种基于密度划分思想的聚类分析方法。在处理高维数据时,DPC算法在相似度计算过程与聚类分配过程中分别存在“聚集”效应问题和“多米诺”效应问题,限制了DPC在实际应用中的分析效率。针对以上问题,提出基于张量图扩散的共享近邻密度峰值聚类算法TGD-SNN-DPC,该算法首先基于张量图理论设计张量图自适应构建模块,挖掘数据点间多样性局部邻域信息。在此基础上,提出高效张量图扩散学习模块,引入张量图高效更新策略,在不增加模型计算负担的前提下,利用该模块挖掘数据全局高阶拓扑信息,利用以上两个模块获得合理的鲁棒性更强的样本间邻接相似度信息。设计自适应共享邻域聚类模块,以张量图扩散高阶邻接矩阵为基础,引入基于共享近邻信息的样本局部密度与相对距离,利用自适应邻域非聚类中心样本分配策略,提升模型矩阵的准确性。在6个合成数据集和12个真实UCI数据集上的实验表明TGD-SNN-DPC算法在准确度(ACC)、调整兰德系数(ARI)和标准互信息(NMI)方面均优于基准算法。
中图分类号:
| [1]DENG Z,CHOI K S,JIANG Y,et al.A survey on soft subspaceclustering [J].Information Sciences,2016,348:84-106. [2]JAIN A K.Data clustering:50 years beyond K-means [J].Pattern Recognition Letters,2010,31(8):651-666. [3]PARK H S,JUN C H.A simple and fast algorithm for K-me-doids clustering [J].Expert Systems with Applications,2009,36(2):3336-3341. [4]YANG M S,LAI C Y,LIN C Y.A robust EM clustering algorithm for Gaussian mixture models [J].Pattern Recognition,2012,45(11):3950-3961. [5]CHEN X,CAI D.Large scale spectral clustering with landmark-based representation[C]//Proceedings of the Twenty-fifth AAAI Conference on Artificial Intelligence.2011. [6]GUHA S,RASTOGI R,SHIM K.CURE:An efficient cluste-ring algorithm for large databases [J].ACM Sigmod Record,1998,27(2):73-84. [7]KARYPIS G,HAN E H,KUMAR V.Chameleon:Hierarchical clustering using dynamic modeling [J].Computer,1999,32(8):68-75. [8]KHAN K,REHMAN S U,AZIZ K,et al.DBSCAN:Past,present and future[C]//Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies(ICADIWT 2014).IEEE,2014. [9]ANKERST M,BREUNIG M M,KRIEGEL H P,et al.OP-TICS:Ordering points to identify the clustering structure [J].ACM Sigmod Record,1999,28(2):49-60. [10]FORSTER A,MURPHY A L.CLIQUE:Role-free clusteringwith Q-learning for wireless sensor networks[C]//Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems.IEEE,2009. [11]TAO Z,LI J,FU H,et al.From Ensemble Clustering to Subspace Clustering:Cluster Structure Encoding [J].IEEE Transactions on Neural Networks and Learning Systems,2021,34(5):2670-2681. [12]LIKAS A,VLASSIS N,VERBEEK J J.The global k-meansclustering algorithm [J].Pattern Recognition,2003,36(2):451-461. [13]RODRIGUEZ A,LAIO A.Clustering by fast search and find of density peaks [J].Science,2014,344(6191):1492-1496. [14]ZHANG Z,ZHU Q,ZHU F,et al.Density decay graph-baseddensity peak clustering [J].Knowledge-Based Systems,2021,224:107075. [15]GUAN J,LI S,HE X,et al.Peak-graph-based fast density peak clustering for image segmentation [J].IEEE Signal Processing Letters,2021,28:897-901. [16]SHI Y,CHEN Z,QI Z,et al.A novel clustering-based imagesegmentation via density peaks algorithm with mid-level feature [J].Neural Computing and Applications,2017,28:29-39. [17]ANANDARAO S,CHELLASAMY S H.Detection of Hot To-pic in Tweets Using Modified Density Peak Clustering [J].Ingénierie Des Systèmes D’Information,2021,26(6):523-531. [18]PENG D,GUI Z,WANG D,et al.Clustering by measuring localdirection centrality for data with heterogeneous density and weak connectivity [J].Nature Communications,2022,13(1):5455. [19]KAUSAR S,MEHMOOD R,IQBAL M S,et al.Density peaks based clustering for single-cell interpretation via multikernel learning [J].Procedia Computer Science,2019,147:71-76. [20]SUN L,QIN X,DING W,et al.Nearest neighbors-based adaptive density peaks clustering with optimized allocation strategy [J].Neurocomputing,2022,473:159-181. [21]GUO W,WANG W,ZHAO S,et al.Density peak clusteringwith connectivity estimation [J].Knowledge-Based Systems,2022,243:108501. [22]SUN L,QIN X,DING W,et al.Density peaks clustering based on k-nearest neighbors and self-recommendation [J].International Journal of Machine Learning and Cybernetics,2021,12:1913-1938. [23]LAOHAKIAT S,SA-ING V.An incremental density-basedclustering framework using fuzzy local clustering [J].Information Sciences,2021,547:404-426. [24]YU H,CHEN L,YAO J.A three-way density peak clustering method based on evidence theory [J].Knowledge-Based Systems,2021,211:106532. [25]HOU J,ZHANG A,QI N.Density peak clustering based on re-lative density relationship [J].Pattern Recognition,2020,108:107554. [26]XU X,DING S,WANG L,et al.A robust density peaks clustering algorithm with density-sensitive similarity [J].Knowledge-Based Systems,2020,200:106028. [27]TAO X,GUO W,REN C,et al.Density peak clustering using global and local consistency adjustable manifold distance [J].Information Sciences,2021,577:769-804. [28]DE LATHAUWER L.A survey of tensor methods[C]//Proceedings of the 2009 IEEE International Symposium on Circuits and Systems.IEEE,2009. [29]PANAGAKIS Y,KOSSAIFI J,CHRYSOS G G,et al.Tensor methods in computer vision and deep learning [C]//Proceedings of the IEEE.2021:863-890. [30]SAHA P K.Tensor scale:A local morphometric parameter with applications to computer vision and image processing [J].Computer Vision and Image Understanding,2005,99(3):384-413. [31]BOUCHARD G,NARADOWSKY J,RIEDEL S,et al.Matrix and tensor factorization methods for natural language processing[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing:Tutorial Abstracts.2015. [32]CHANG K W,YIH W T,YANG B,et al.Typed tensor decomposition of knowledge bases for relation extraction[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing(EMNLP).2014. [33]PAPALEXAKIS E E,FALOUTSOS C,SIDIROPOULOS N D.Tensors for data mining and data fusion:Models,applications,and scalable algorithms [J].ACM Transactions on Intelligent Systems and Technology(TIST),2016,8(2):1-44. [34]KOLDA T G,SUN J.Scalable tensor decompositions for multi-aspect data mining[C]//Proceedings of the 2008 Eighth IEEE International Conference on Data Mining.IEEE,2008. [35]WANG J,XU C,ZHANG J,et al.Big data analytics for intelligent manufacturing systems:A review [J].Journal of Manufacturing Systems,2022,62:738-752. [36]WANG X,YANG L T,SONG L,et al.A tensor-based multiattributes visual feature recognition method for industrial intelligence [J].IEEE Transactions on Industrial Informatics,2020,17(3):2231-2241. [37]COIFMAN R R,LAFON S.Diffusion maps [J].Applied andComputational Harmonic Analysis,2006,21(1):5-30. [38]VAN DIJK D,SHARMA R,NAINYS J,et al.Recovering Gene Interactions from Single-Cell Data Using Data Diffusion [J].Cell,2018,174(3):716-729. [39]TALMON R,COHEN I,GANNOT S,et al.Diffusion Maps for Signal Processing:A Deeper Look at Manifold-Learning Techniques Based on Kernels and Graphs [J].IEEE Signal Proces-sing Magazine,2013,30(4):75-86. [40]YANG X,PRASAD L,LATECKI L J.Affinity learning with diffusion on tensor product graph [J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,35(1):28-38. [41]LIU R,WANG H,YU X.Shared-nearest-neighbor-based clustering by fast search and find of density peaks [J].Information Sciences,2018,450:200-226. [42]WANG H,WANG W,YANG J,et al.Clustering by patternsimilarity in large data sets[C]//Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data.2002. [43]WANG Y,WANG D,PANG W,et al.A systematic density-based clustering method using anchor points [J].Neurocompu-ting,2020,400:352-370. [44]DU M,DING S,JIA H.Study on density peaks clustering based on k-nearest neighbors and principal component analysis [J].Knowledge-Based Systems,2016,99:135-145. [45]XIE J,GAO H,XIE W,et al.Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K-nearest neighbors [J].Information Sciences,2016,354:19-40. |
|
||