Computer Science ›› 2019, Vol. 46 ›› Issue (11A): 216-219.

• Data Science •

### Nearest Neighbor Optimization k-means Clustering Algorithm

LIN Tao, ZHAO Can

1. (School of Computer Science and Engineering, Hebei University of Technology, Tianjin 300401, China)
• Online: 2019-11-10 Published: 2019-11-20

Abstract: The traditional k-means algorithm usually ignores the distribution of the data samples. Following the minimum-distance principle, it assigns every sample, whether it lies at a cluster edge, at a cluster center, or is an outlier, to the cluster of its nearest cluster center, without considering the relationship between the sample and the other clusters. If the distance between a sample and another cluster is close to its minimum distance, the sample lies almost equally close to two clusters, and assigning it directly to the nearest one is not reasonable. To address this problem, this paper presents a nearest-neighbor-optimized k-means clustering algorithm (1NN-kmeans). Borrowing the nearest-neighbor idea, the algorithm assigns samples that do not firmly belong to any single cluster to the cluster of their nearest neighbor sample. Experimental results show that 1NN-kmeans effectively reduces the number of iterations and improves clustering accuracy, yielding better clustering results.
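
The assignment rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how "not firmly belonging" is measured, so the ratio test, the `ratio` threshold, the function name `assign_1nn`, and the choice to search for the nearest neighbor only among firmly assigned samples are all assumptions made for illustration. The sketch covers a single assignment step, not the full iterative algorithm.

```python
import math

def assign_1nn(points, centers, ratio=0.8):
    """One assignment step of a 1NN-style k-means variant (illustrative only).

    `ratio` is a hypothetical threshold: a sample is treated as ambiguous
    when d_nearest / d_second_nearest exceeds it. Requires >= 2 centers.
    """
    labels = [None] * len(points)
    ambiguous = []
    for i, p in enumerate(points):
        # Rank the centers by Euclidean distance to this sample.
        order = sorted(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        d1 = math.dist(p, centers[order[0]])
        d2 = math.dist(p, centers[order[1]])
        if d2 > 0 and d1 / d2 > ratio:
            ambiguous.append(i)      # almost equally close to two clusters
        else:
            labels[i] = order[0]     # firm: plain minimum-distance rule

    firm = [i for i, lab in enumerate(labels) if lab is not None]
    for i in ambiguous:
        if firm:
            # Assumption: inherit the label of the nearest firmly assigned sample.
            nn = min(firm, key=lambda j: math.dist(points[i], points[j]))
            labels[i] = labels[nn]
        else:
            # No firm sample exists, so fall back to the nearest center.
            labels[i] = min(range(len(centers)),
                            key=lambda c: math.dist(points[i], centers[c]))
    return labels

# The last sample is 4.9 from the first center and 5.1 from the second.
points = [(0, 0), (1, 0), (7.5, 0), (10, 0), (11, 0), (5.4, 0)]
centers = [(0.5, 0), (10.5, 0)]
labels = assign_1nn(points, centers)
print(labels)
```

In this toy example the minimum-distance rule alone would place the last sample in the first cluster, whereas its nearest firmly assigned neighbor, (7.5, 0), belongs to the second cluster, so the sketch assigns it there.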

CLC Number:

• TP181