Computer Science, 2015, Vol. 42, Issue (12): 247-250.

Novel Global Kmeans Clustering Algorithm for Big Data

LI Bin, WANG Jin-song and HUANG Wei   

  • Online: 2018-11-14  Published: 2018-11-14

Abstract: Clustering methods for big data have attracted considerable interest in recent years. This paper proposes a novel global k-means clustering algorithm (NGKCA). The proposed method comprises four phases: a row dimension reduction phase, a line dimension reduction phase, a global k-means clustering phase, and an adjustment of the cluster center points. The row dimension reduction phase is realized by means of spectral clustering, while the line dimension reduction phase is realized with the aid of particle swarm optimization (PSO). Once both the row and line dimension reduction phases are complete, the global k-means clustering phase and the PSO phase proceed. Experiments were carried out on several well-known machine learning data sets and on the standard network security data set KDDCUP99. The experimental results show that NGKCA achieves superior performance compared with some common algorithms reported in the literature, and that its time complexity is better than that of the global k-means algorithm.

Key words: Global Kmeans, Spectral clustering, PSO, Clustering, KDDCUP99

