Computer Science ›› 2014, Vol. 41 ›› Issue (Z6): 406-408.

Previous Articles     Next Articles

Novel Anti-noise K-means Algorithm Based on Spatial Distance Difference

MA Shi-yu,LI Yi-cai and LAN Zhang-li   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Due to the inherent initial clustering center sensitivity of K-means algorithm,it exists problems including result instability and being easy to fall into local optimum.The current improvement schemes can reduce the number of iteration and obtain an approximate global optimal solution when deal with noise-free data sets.But for noisy data sets,it would be easy to fall into local optimum,and the clustering result is lower than traditional K-means algorithm.Based on the algorithm that can find initial clustering centers according to the farthest spatial distance,the paper proposed a novel algorithm to select initial centers based on spatial distance difference.The main idea of the algorithm is calculating the sum distances between non-clustering center and all selected centers,then sort them.Choose the point which is the closer to the given centers as the new selected cluster center.Experimental results show that under the quite condition of iteration,when deal with noise-free data sets,the clustering accuracy of the proposed algorithm is improved about 1%.For noisy data sets,the classified accuracy is above 90%.

Key words: K-means algorithm,Initial centroid,Spatial distance difference,Noisy data

[1] 欧陈委.K均值聚类算法的研究与改进[D].长沙:长沙理工大学,2011
[2] 张俊生.数据挖掘中的聚类方法及其应用研究[D].天津:天津理工大学,2012
[3] Anil K J.Data clustering:50year beyond K-Means[J].Pattern Recognition Letters,2010,31(08):651-666
[4] 吴晓蓉.K-均值聚类算法初始质心选取相关问题的研究[D].长沙:湖南大学,2008
[5] Fayyad U,Reina C,Bradley P S.Initialization of Iterative Refinement Clustering Algorithms[C]∥Proc of the Fourth International Conference on Knowledge Discovery and Data Mining.1998:194-198
[6] Adil M,Bafirov,Julien U.Fast modified global k-means algorithm for incremental cluster construction[J].Pattern Recognition,2011,44(4):866-876
[7] 曹志宇,张忠林,李元韬.快速查找初始聚类质心的K_means算法 [J].兰州交通大学学报,2009,28(6):15-18
[8] 张真,任贺宇.一种基于动态网格技术的K-means初始质心选取算法[J].微电子学与计算机,2013,30(6):101-104
[9] 谢娟英,蒋帅,王春霞.一种改进的全局K-均值聚类算法[J].陕西师范大学学报,2010,38(2):18-22
[10] 谢娟英,郭文娟,谢维信.基于样本空间分布密度的初始聚类中心优化K-均值算法[J].计算机应用研究,2012,29(3):888-892
[11] 杨燕,靳蕃,Kamel M.聚类有效性评价综述[J].计算机应用研究,2008,25(6):1631-1632
[12] 张惟皎,刘春煌,李芳玉.聚类质量的评价方法[J].计算机工程,2005,31(20):10-12
[13] Hubert L,Arabie P.Comparing partitions [J].Journal of Classification,1985,2(1):193-218
[14] 王纵虎,刘志镜子,陈东辉.基于粒子群优化的模糊C-均值聚类算法研究[J].计算机科学,2012,39(9):166-169

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!