基于影响空间的稳健密度峰值聚类算法

doi:10.11896/jsjkx.181001846

Abstract

Abstract: DP (Clustering by Fast Search and Find of Density Peaks) is a novel clustering algorithm based on local density and distance between data,which has the advantages of being able to discover the clusters of arbitrary shapes,easy to accept logic principle and efficiently partitioning data into data sets.However,this algorithm has some disadvantages,for instance,it can’t deal with the case when multiple density peaks co-exist in a single cluster,and it is characterized with the unstable class label allocation.At the same time,it can’t accurately identify the clusters of sparse data when the density difference between clusters is big enough.In order to overcome the above deficiencies,this paper proposed a robust density peak clustering algorithm based on influence space.The proposed algorithm is capable of calculating local density by neighboring data and enhances the ability to recognize small-scale clusters.In order to improve the robustness of data partition,the data influence space is introduced,a new symmetric relationship is defined,and a new allocation strategy is proposed.By calculating the weighted density ratio of the target data to the adjacent data,the algorithm is able to process data sets with multiple density distribution features.The local density is obtained by calculating the density ratio of the target data to the neighbor data,and the ratio is weighted through using the size of the influence space.In order to verify the availability of the proposed algorithm,simulation experiments were conducted on synthetic data sets and UCI data sets,and NMI and Acc indexs were used to evaluate clustering algorithms.Simulation experiment results show that the proposed algorithm can reduce the influence of the cutoff distance on the algorithm and classify the data more stably.Compared with other algorithms,the proposed algorithm achieves better experimental results on both NMI and Acc indexs.

Key words: Allocation strategy, Clustering algorithm, Influence space, Local density, Robustness

CLC Number:

TP391

CHEN Chun-tao, CHEN You-guang. Influence Space Based Robust Fast Search and Density Peak Clustering Algorithm[J].Computer Science, 2019, 46(11): 216-221.

References

[1]MACQUEEN J.Some Methods for Classification and Analysisof MultiVariate Observations[C]∥Berkeley Symposium on Mathematical Statistics and Probability.Berkeley:University of California Press,1966:281-297.
[2]KARYPIS G,HAN E H,KUMAR V.CHAMELEON.A hierarchical clustering algorithm using dynamic modeling[J].Computer,1999,32(8):68-75.
[3]ESTER M,KRIEGEL H P,XU X.A Density-based Algorithmfor Discovering Clusters in Large Spatial Databases with Noise[C]∥International Conference on Knowledge Discovery and Data Mining.Palo Alto:AAAI Press,1996:226-231.
[4]WANG W,YANG J,MUNTZ R R.STING:A Statis-tical Information Grid Approach to Spatial Data Mining[C]∥Interna-tional Conference onVery Large Data Bases.San Mateo:Morgan Kaufmann.1997:186-195.
[5]HU Q,WU J,BAI L,et al.Fast K-means for Large Scale Clustering[C]∥Conference on Information and Knowledge Management.New York:ACM,2017:2099-2102.
[6]PARK H S,JUN C H.A Simple and Fast Algo-rithm for K-medoids Clustering[J].Expert Systems with Applications,2009,36(2):3336-3341.
[7]HE Y,TAN H,LUO W,et al.MR-DBSCAN:An Efficient Parallel Density-Based Clustering Algorithm Using MapReduce[C]∥International Conference on Parallel and Distributed Systems.New York:IEEE,2012:473-480.
[8]JAHIRABADKAR S,KULKARNI P.Algorithm to Determine $\ell$-distance Parameter in Density based Clustering[J].Expert Systems with Ap-plications,2014,41(6):2939-2946.
[9]NANDA S J,PANDA G.Design of Computationally Efficient Density-based Clustering Algorithms[J].Data & Knowledge Engineering,2015,95:23-38.
[10]BAI X,YANG P,SHI X.An Overlapping Community Detection Algorithm Based on Density Peaks[J].Neurocomputing,2016,226(22):7-15.
[11]WANG H Y,ZHANG T F,MA F M.Rough K-means Algorithm with Self-adaptive Weights Measurement Based on Space Distance[J].Computer Science,2018,45(7):190-196.(in Chinese)
王慧研,张腾飞,马福民.基于空间距离自适应权重度量的粗糙K-means算法[J].计算机科学,2018,45(7):190-196.
[12]RODRIGUEZ A,LAIO A.Machine Learning.Clustering byFast Search and Find of Density Peaks[J].Science,2014,344(6191):1492-1496.
[13]ZHANG Y,XIA Y,LIU Y,et al.Clustering Sentences withDensity Peaks for Multidocument Summarization[C]∥Confe-rence of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2015:1262-1267.
[14]ZHANG Y,CHEN S.Efficient distributed density peaks forclustering large data sets in mapreduce[J].IEEE Transactions on Knowledge and Data Engineering,2016,28(12):3218-3230.
[15]XU J,WANG G,LI T,et al.Fat node leading tree for datastream clustering with density peaks[J].Knowledge-Based Systems,2017,120:99-117.
[16]CHEN Y,MA S,CHEN X,et al.Hyperspectral data clustering based on density analysis-ensemble[J].Remote Sensing Letters,2017,8(2):194-203.
[17]LI Z J,TANG Y C.Comparative density Peaks Clustering[J].Expert Systemswith Applications,2018,95:236-247.
[18]DING J,CHEN Z,HE X,et al.Clustering by Finding Density Peaks Based on Chebyshev’s inequality[C]∥ControlConfe-rence.New York:IEEE,2016:7169-7172.
[19]GAO S Y,ZHOU X F,LI S.Clustering by Fast Search and Find of Density Peaks Based on Density Ratio[J].Computer Engineering and Applications,2017,53(16):10-17.(in Chinese)
高诗莹,周晓锋,李帅.基于密度比例的密度峰值聚类算法[J].计算机工程与应用,2017,53(16):10-17.
[20]WANG S L,WANG D K,LI C Y,et al.Clustering by Fast Search and Find of Density Peaks with Data Field[J].Chinese Journal of Electronics,2016,25(3):397-402.
[21]LIU Y,MA Z,YU F.Adaptive Density Peak Clustering Based on K-nearest Neighbors with Aggregating Strategy[J].Know-ledge Based Systems,2017,133:208-220.
[22]JIN W,TUNG A K H,HAN J,et al.Ranking Outliers Using Symmetric Neighborhood Relationship[C]∥Pacific-Asia Conference on Knowledge Discovery and Data Mining.Berlin Heidelberg:Springer,2006:577-593.

Related Articles 15

[1]	CHAI Hui-min, ZHANG Yong, FANG Min. Aerial Target Grouping Method Based on Feature Similarity Clustering [J]. Computer Science, 2022, 49(9): 70-75.
[2]	ZHOU Hui, SHI Hao-chen, TU Yao-feng, HUANG Sheng-jun. Robust Deep Neural Network Learning Based on Active Sampling [J]. Computer Science, 2022, 49(7): 164-169.
[3]	YAN Meng, LIN Ying, NIE Zhi-shen, CAO Yi-fan, PI Huan, ZHANG Lan. Training Method to Improve Robustness of Federated Learning [J]. Computer Science, 2022, 49(6A): 496-501.
[4]	ZHANG Cheng-rui, CHEN Jun-jie, GUO Hao. Comparative Analysis of Robustness of Resting Human Brain Functional Hypernetwork Model [J]. Computer Science, 2022, 49(2): 241-247.
[5]	ZHANG Ya-di, SUN Yue, LIU Feng, ZHU Er-zhou. Study on Density Parameter and Center-Replacement Combined K-means and New Clustering Validity Index [J]. Computer Science, 2022, 49(1): 121-132.
[6]	MU Jun-fang, ZHENG Wen-ping, WANG Jie, LIANG Ji-ye. Robustness Analysis of Complex Network Based on Rewiring Mechanism [J]. Computer Science, 2021, 48(7): 130-136.
[7]	LI Shan, XU Xin-zheng. Parallel Pruning from Two Aspects for VGG16 Optimization [J]. Computer Science, 2021, 48(6): 227-233.
[8]	WANG Xue-guang, ZHANG Ai-xin, DOU Bing-lin. Non-linear Load Capacity Model of Complex Networks [J]. Computer Science, 2021, 48(6): 282-287.
[9]	TANG Xin-yao, ZHANG Zheng-jun, CHU Jie, YAN Tao. Density Peaks Clustering Algorithm Based on Natural Nearest Neighbor [J]. Computer Science, 2021, 48(3): 151-157.
[10]	WANG Mao-guang, YANG Hang. Risk Control Model and Algorithm Based on AP-Entropy Selection Ensemble [J]. Computer Science, 2021, 48(11A): 71-76.
[11]	WANG Wei-dong, XU Jin-hui, ZHANG Zhi-feng, YANG Xi-bei. Gaussian Mixture Models Algorithm Based on Density Peaks Clustering [J]. Computer Science, 2021, 48(10): 191-196.
[12]	ZHANG Yu, LU Yi-hong, HUANG De-cai. Weighted Hesitant Fuzzy Clustering Based on Density Peaks [J]. Computer Science, 2021, 48(1): 145-151.
[13]	TONG Xin, WANG Bin-jun, WANG Run-zheng, PAN Xiao-qin. Survey on Adversarial Sample of Deep Learning Towards Natural Language Processing [J]. Computer Science, 2021, 48(1): 258-267.
[14]	XU Shou-kun, NI Chu-han, JI Chen-chen, LI Ning. Image Caption of Safety Helmets Wearing in Construction Scene Based on YOLOv3 [J]. Computer Science, 2020, 47(8): 233-240.
[15]	WU Qing-hong, GAO Xiao-dong. Face Recognition in Non-ideal Environment Based on Sparse Representation and Support Vector Machine [J]. Computer Science, 2020, 47(6): 121-125.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Influence Space Based Robust Fast Search and Density Peak Clustering Algorithm

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 15

Metrics

Comments

Recommended 0