Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200068-11.doi: 10.11896/jsjkx.241200068

• Big Data & Data Science • Previous Articles     Next Articles

Tensor Graph Diffusion Share Nearest Neighbor Density Peaks Clustering

LIU Qiaoming1,2,3, WEI Qianran2, LI Zhi2, WANG Jian4, LI Yuanfang2,3   

  1. 1 School of Artifical Intelligence,Henan University,Zhengzhou 450001,China
    2 Zhengzhou Research Institute,Harbin Institute of Technology,Zhengzhou 450001,China
    3 School of Medicine and Health,Harbin Institute of Technology,Harbin 150001,China
    4 School of Computer and Information Engineering,Henan University of Economics and Law,Zhengzhou 450001,China
  • Online:2025-11-15 Published:2025-11-10
  • Supported by:
    National Natural Science Foundation of China(62402145),China Postdoctoral Science Foundation under General Program(GZC20233467),Postdoctoral Fellowship Program of CPSF(2023M740932) and Natural Science Foundation of Henan Province,China(242300421705).

Abstract: Density Peak Clustering(DPC) is based on the idea of density clustering.When dealing with high-dimensional data,the DPC algorithm has the problem of the “clustering” effect and “domino” effect in the process of similarity calculation and cluster distribution respectively,which limits the analysis efficiency of DPC in practical application.To solve these problems,a Tensor Graph Diffusion Share Nearest Neighbor Density Peaks Clustering(TGD-SNN-DPC) is proposed.Firstly,an adaptive tensor graph construction module is designed based on tensor graph theory to mine diverse local neighborhood information among data points.On this basis,an efficient tensor graph diffusion learning module is proposed and an efficient update strategy is introduced.On the premise of not increasing the computing burden of the model,this module is used to mine the global high-level topological information of the data,and the adjacency similarity information between samples with stronger robustness is obtained by using the above two modules.Finally,the adaptive shared neighborhood clustering module is designed.Based on the high-order adjacency matrix generated by tensor graph diffusion,the local density and relative distance based on shared neighbor information are introduced,and the adaptive neighborhood non-center sample allocation strategy is designed to improve the accuracy of the model matrix.Experimental results on 6 synthetic datasets and 12 real UCI datasets show that TGD-SNN-DPC algorithm outperforms the benchmark algorithm in clustering.

Key words: Clustering analysis, Density peak clustering, Tensor graph, Diffusion process, Shared nearest neighbors

CLC Number: 

  • TP391.41
[1]DENG Z,CHOI K S,JIANG Y,et al.A survey on soft subspaceclustering [J].Information Sciences,2016,348:84-106.
[2]JAIN A K.Data clustering:50 years beyond K-means [J].Pattern Recognition Letters,2010,31(8):651-666.
[3]PARK H S,JUN C H.A simple and fast algorithm for K-me-doids clustering [J].Expert Systems with Applications,2009,36(2):3336-3341.
[4]YANG M S,LAI C Y,LIN C Y.A robust EM clustering algorithm for Gaussian mixture models [J].Pattern Recognition,2012,45(11):3950-3961.
[5]CHEN X,CAI D.Large scale spectral clustering with landmark-based representation[C]//Proceedings of the Twenty-fifth AAAI Conference on Artificial Intelligence.2011.
[6]GUHA S,RASTOGI R,SHIM K.CURE:An efficient cluste-ring algorithm for large databases [J].ACM Sigmod Record,1998,27(2):73-84.
[7]KARYPIS G,HAN E H,KUMAR V.Chameleon:Hierarchical clustering using dynamic modeling [J].Computer,1999,32(8):68-75.
[8]KHAN K,REHMAN S U,AZIZ K,et al.DBSCAN:Past,present and future[C]//Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies(ICADIWT 2014).IEEE,2014.
[9]ANKERST M,BREUNIG M M,KRIEGEL H P,et al.OP-TICS:Ordering points to identify the clustering structure [J].ACM Sigmod Record,1999,28(2):49-60.
[10]FORSTER A,MURPHY A L.CLIQUE:Role-free clusteringwith Q-learning for wireless sensor networks[C]//Proceedings of the 2009 29th IEEE International Conference on Distributed Computing Systems.IEEE,2009.
[11]TAO Z,LI J,FU H,et al.From Ensemble Clustering to Subspace Clustering:Cluster Structure Encoding [J].IEEE Transactions on Neural Networks and Learning Systems,2021,34(5):2670-2681.
[12]LIKAS A,VLASSIS N,VERBEEK J J.The global k-meansclustering algorithm [J].Pattern Recognition,2003,36(2):451-461.
[13]RODRIGUEZ A,LAIO A.Clustering by fast search and find of density peaks [J].Science,2014,344(6191):1492-1496.
[14]ZHANG Z,ZHU Q,ZHU F,et al.Density decay graph-baseddensity peak clustering [J].Knowledge-Based Systems,2021,224:107075.
[15]GUAN J,LI S,HE X,et al.Peak-graph-based fast density peak clustering for image segmentation [J].IEEE Signal Processing Letters,2021,28:897-901.
[16]SHI Y,CHEN Z,QI Z,et al.A novel clustering-based imagesegmentation via density peaks algorithm with mid-level feature [J].Neural Computing and Applications,2017,28:29-39.
[17]ANANDARAO S,CHELLASAMY S H.Detection of Hot To-pic in Tweets Using Modified Density Peak Clustering [J].Ingénierie Des Systèmes D’Information,2021,26(6):523-531.
[18]PENG D,GUI Z,WANG D,et al.Clustering by measuring localdirection centrality for data with heterogeneous density and weak connectivity [J].Nature Communications,2022,13(1):5455.
[19]KAUSAR S,MEHMOOD R,IQBAL M S,et al.Density peaks based clustering for single-cell interpretation via multikernel learning [J].Procedia Computer Science,2019,147:71-76.
[20]SUN L,QIN X,DING W,et al.Nearest neighbors-based adaptive density peaks clustering with optimized allocation strategy [J].Neurocomputing,2022,473:159-181.
[21]GUO W,WANG W,ZHAO S,et al.Density peak clusteringwith connectivity estimation [J].Knowledge-Based Systems,2022,243:108501.
[22]SUN L,QIN X,DING W,et al.Density peaks clustering based on k-nearest neighbors and self-recommendation [J].International Journal of Machine Learning and Cybernetics,2021,12:1913-1938.
[23]LAOHAKIAT S,SA-ING V.An incremental density-basedclustering framework using fuzzy local clustering [J].Information Sciences,2021,547:404-426.
[24]YU H,CHEN L,YAO J.A three-way density peak clustering method based on evidence theory [J].Knowledge-Based Systems,2021,211:106532.
[25]HOU J,ZHANG A,QI N.Density peak clustering based on re-lative density relationship [J].Pattern Recognition,2020,108:107554.
[26]XU X,DING S,WANG L,et al.A robust density peaks clustering algorithm with density-sensitive similarity [J].Knowledge-Based Systems,2020,200:106028.
[27]TAO X,GUO W,REN C,et al.Density peak clustering using global and local consistency adjustable manifold distance [J].Information Sciences,2021,577:769-804.
[28]DE LATHAUWER L.A survey of tensor methods[C]//Proceedings of the 2009 IEEE International Symposium on Circuits and Systems.IEEE,2009.
[29]PANAGAKIS Y,KOSSAIFI J,CHRYSOS G G,et al.Tensor methods in computer vision and deep learning [C]//Proceedings of the IEEE.2021:863-890.
[30]SAHA P K.Tensor scale:A local morphometric parameter with applications to computer vision and image processing [J].Computer Vision and Image Understanding,2005,99(3):384-413.
[31]BOUCHARD G,NARADOWSKY J,RIEDEL S,et al.Matrix and tensor factorization methods for natural language processing[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing:Tutorial Abstracts.2015.
[32]CHANG K W,YIH W T,YANG B,et al.Typed tensor decomposition of knowledge bases for relation extraction[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing(EMNLP).2014.
[33]PAPALEXAKIS E E,FALOUTSOS C,SIDIROPOULOS N D.Tensors for data mining and data fusion:Models,applications,and scalable algorithms [J].ACM Transactions on Intelligent Systems and Technology(TIST),2016,8(2):1-44.
[34]KOLDA T G,SUN J.Scalable tensor decompositions for multi-aspect data mining[C]//Proceedings of the 2008 Eighth IEEE International Conference on Data Mining.IEEE,2008.
[35]WANG J,XU C,ZHANG J,et al.Big data analytics for intelligent manufacturing systems:A review [J].Journal of Manufacturing Systems,2022,62:738-752.
[36]WANG X,YANG L T,SONG L,et al.A tensor-based multiattributes visual feature recognition method for industrial intelligence [J].IEEE Transactions on Industrial Informatics,2020,17(3):2231-2241.
[37]COIFMAN R R,LAFON S.Diffusion maps [J].Applied andComputational Harmonic Analysis,2006,21(1):5-30.
[38]VAN DIJK D,SHARMA R,NAINYS J,et al.Recovering Gene Interactions from Single-Cell Data Using Data Diffusion [J].Cell,2018,174(3):716-729.
[39]TALMON R,COHEN I,GANNOT S,et al.Diffusion Maps for Signal Processing:A Deeper Look at Manifold-Learning Techniques Based on Kernels and Graphs [J].IEEE Signal Proces-sing Magazine,2013,30(4):75-86.
[40]YANG X,PRASAD L,LATECKI L J.Affinity learning with diffusion on tensor product graph [J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,35(1):28-38.
[41]LIU R,WANG H,YU X.Shared-nearest-neighbor-based clustering by fast search and find of density peaks [J].Information Sciences,2018,450:200-226.
[42]WANG H,WANG W,YANG J,et al.Clustering by patternsimilarity in large data sets[C]//Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data.2002.
[43]WANG Y,WANG D,PANG W,et al.A systematic density-based clustering method using anchor points [J].Neurocompu-ting,2020,400:352-370.
[44]DU M,DING S,JIA H.Study on density peaks clustering based on k-nearest neighbors and principal component analysis [J].Knowledge-Based Systems,2016,99:135-145.
[45]XIE J,GAO H,XIE W,et al.Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K-nearest neighbors [J].Information Sciences,2016,354:19-40.
[1] WANG Xingeng, DU Tao, ZHOU Jin, CHEN Di, WU Yunzheng. Adaptive Density Peak Clustering Algorithm Based on Shared Nearest Neighbor [J]. Computer Science, 2024, 51(8): 97-105.
[2] CAO Dongtao, SHU Wenhao, QIAN Jin. Feature Selection Algorithm Based on Rough Set and Density Peak Clustering [J]. Computer Science, 2023, 50(10): 37-47.
[3] CAO Su-e, YANG Ze-min. Prediction of Wireless Network Traffic Based on Clustering Analysis and Optimized Support Vector Machine [J]. Computer Science, 2020, 47(8): 319-322.
[4] HUANG Hai-yan, LIU Xiao-ming, SUN Hua-yong, YANG Zhi-cai. Application of Clustering Analysis Algorithm in Uncertainty Decision Making [J]. Computer Science, 2019, 46(6A): 593-597.
[5] UO Min-xia,XUE Bo-huan. Research on Splicing Recovery of Broken Files Based on Intelligent Algorithms
H
[J]. Computer Science, 2018, 45(6A): 174-178.
[6] HAO Jun-sheng,LI Bing-feng,CHEN Xi,GAO Wen-juan. Design and Implementation of Network Subscription System Based on Android Platform [J]. Computer Science, 2018, 45(6A): 591-594.
[7] WANG Zhen-chao, HOU Huan-huan and LIAN Rui. Path Optimization Scheme for Restraining Degree of Disorder in CMT [J]. Computer Science, 2018, 45(4): 122-125.
[8] LIU Yi-zhi, CHENG Ru-feng and LIANG Yong-quan. Clustering Algorithm Based on Shared Nearest Neighbors and Density Peaks [J]. Computer Science, 2018, 45(2): 125-129.
[9] XIA Qing-ya. Automatically Selecting Clustering Centers Algorithm Based on Density Peak and Grid [J]. Computer Science, 2017, 44(Z11): 403-406.
[10] SUN Shanwu, WANG Nan and OUYANG Dantong. Business Process Model Abstraction Based on Cluster Analysis [J]. Computer Science, 2016, 43(5): 193-197.
[11] ZHANG Heng-wei, HE Jia-jing, HAN Ji-hong and WANG Jin-dong. Fuzzy Soft Subspace Clustering Method Based on Intelligent Optimization Algorithm [J]. Computer Science, 2016, 43(3): 256-261.
[12] PANG Tian-jie and ZHAO Xing-wang. Algorithm to Determine Number of Clusters for Mixed Data Based on Prior Information [J]. Computer Science, 2016, 43(2): 101-104.
[13] ZHANG Hao, HE Jie and LI Hui-zong. Social Tagging Recommendation Model Based on Improved Artificial Fish Swarm Algorithm and Tensor Decomposition [J]. Computer Science, 2016, 43(12): 168-172.
[14] FANG Ling and CHEN Song-can. Research on Clustering with Feature Preferences [J]. Computer Science, 2015, 42(5): 57-61.
[15] SHI Zhao and FU Zhuo. Distribution Location-routing Problem of Heterotypic Vehicles and its Algorithms [J]. Computer Science, 2015, 42(5): 245-250.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!