Computer Science, 2024, Vol. 51, Issue (8): 97-105. doi: 10.11896/jsjkx.230500226

• Database & Big Data & Data Science •

Adaptive Density Peak Clustering Algorithm Based on Shared Nearest Neighbor

WANG Xingeng1, DU Tao1,2, ZHOU Jin1,2, CHEN Di1, WU Yunzheng1   

  1 College of Information Science and Engineering,University of Jinan,Jinan 250024,China
  2 Shandong Provincial Key Laboratory of Network Based Intelligent Computing,Jinan 250024,China
  • Received:2023-05-31 Revised:2023-10-18 Online:2024-08-15 Published:2024-08-13
  • About author:WANG Xingeng,born in 1999,postgraduate.His main research interests include data clustering and data mining.
    DU Tao,born in 1979,Ph.D,associate professor.His main research interests include data clustering and data mining.
  • Supported by:
    National Natural Science Foundation of China(62273164) and Joint Fund of Natural Science Foundation of Shandong Province,China(ZR2020LZH009).

Abstract: The density peak clustering (DPC) algorithm is a simple and efficient unsupervised clustering algorithm. Although it can automatically discover cluster centers and efficiently cluster data of arbitrary shape, it still has some defects. Three of them are addressed here: DPC does not consider the location information of the data when defining its measures, the number of cluster centers must be set manually in advance, and a chain reaction easily occurs when sample points are assigned. To overcome these defects, an adaptive density peak clustering algorithm based on shared nearest neighbors is proposed. Firstly, shared nearest neighbors are used to redefine the local density and the other measures, so that the local characteristics of the data distribution are fully considered and the spatial distribution of the sample points is better reflected. Secondly, by introducing the phenomenon of density decay, the sample points are automatically gathered into micro-clusters, which makes both the number of clusters and the selection of cluster centers adaptive. Finally, a two-stage assignment method is proposed: the micro-clusters are first merged to form the backbone of each cluster, and the backbones then guide the assignment of the remaining points, which avoids chain reactions. Experiments on two-dimensional synthetic datasets and UCI datasets show that, in most cases, the proposed algorithm performs better than the classical density peak clustering algorithm and its recent improved variants.
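The quantities the abstract refers to can be illustrated with a minimal Python sketch. It does not reproduce the proposed algorithm, whose redefined measures are not given in the abstract. It assumes the classical DPC definitions (local density as the number of points within a cutoff distance, relative distance as the distance to the nearest denser point) and a plain shared-neighbor count between k-nearest-neighbor sets. The function names, the cutoff `d_c`, and the neighborhood size `k` are illustrative choices, not taken from the paper.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def knn_sets(dist, k):
    """Set of the k nearest neighbors of each point, excluding the point itself."""
    n = len(dist)
    result = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        result.append(set(order[:k]))
    return result


def shared_neighbor_counts(knn):
    """Number of neighbors that points i and j have in common (a simple SNN similarity)."""
    n = len(knn)
    return [[len(knn[i] & knn[j]) for j in range(n)] for i in range(n)]


def dpc_quantities(dist, d_c):
    """Classical DPC measures (assumed here, not the paper's redefined ones).

    rho[i]   = number of other points closer than the cutoff d_c
    delta[i] = distance to the nearest point with higher rho, or the largest
               distance from i when no denser point exists
    """
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta


if __name__ == "__main__":
    # Toy data: a dense group around (0.5, 0.5) and a small distant group.
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (10, 11), (11, 10)]
    dist = pairwise_distances(points)
    rho, delta = dpc_quantities(dist, d_c=0.9)
    snn = shared_neighbor_counts(knn_sets(dist, k=2))
    # Points with both high rho and high delta are candidate cluster centers.
    for i, p in enumerate(points):
        print(p, rho[i], round(delta[i], 3))
```

In classical DPC, the centers are picked by hand from the points with large `rho` and large `delta`, which is the manual step the abstract aims to remove. The shared-neighbor counts show the kind of local information that a plain distance cutoff ignores.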

Key words: Shared nearest neighbor, Density peak clustering, Allocation strategy, Cluster center, Density decay

CLC Number: TP391