Bisecting K-means Clustering Method Based on Cohesion and Coupling

Abstract: Cluster analysis is one of the most important techniques in data mining and is widely applied across social and economic fields. K-means is a simple and widely used clustering method, but its results depend on the initial conditions and the number of clusters is difficult to determine. This paper introduces the cohesion and coupling of clusters and presents measures for both. Following the principle of "high cohesion and low coupling", clusters are repeatedly split and merged during the bisecting K-means procedure. Checking whether the clustering result meets the requirements determines the number of clusters, which improves the bisecting K-means algorithm. Experimental results on the Iris data set show that the improved algorithm is more stable and achieves higher clustering accuracy.

Key words: Bisecting K-means, Clustering, Cohesion, Coupling

CLC Number: TP391

YU Yong, KANG Qing-yi, CHEN Chang-geng, KAN Shi-lin, LUO Yong-jun. Bisecting K-means Clustering Method Based on Cohesion and Coupling [J]. Computer Science, 2018, 45(6A): 460-464.
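
The split-and-check idea described in the abstract can be sketched roughly as follows. The abstract does not give the paper's cohesion and coupling formulas, so the two measures used here (mean distance to the centroid, and distance between centroids), the `ratio` threshold, and all function names are illustrative assumptions, not the authors' definitions. The merge step mentioned in the abstract is omitted for brevity.

```python
import numpy as np


def two_means(points, rng, iters=20):
    """Split points into two groups with plain 2-means (Lloyd iterations)."""
    centers = points[rng.choice(len(points), size=2, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        if labels.min() == labels.max():  # degenerate split: one side is empty
            break
        centers = np.array([points[labels == j].mean(axis=0) for j in (0, 1)])
    return labels


def cohesion(points):
    """Assumed measure: mean distance to the centroid (smaller = more cohesive)."""
    return float(np.linalg.norm(points - points.mean(axis=0), axis=1).mean())


def coupling(a, b):
    """Assumed measure: distance between centroids (larger = less coupled)."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))


def bisecting_kmeans(points, ratio=2.0, seed=0):
    """Keep bisecting while the two halves are well separated.

    The number of clusters is not supplied by the caller; it falls out of
    the (assumed) acceptance test on each candidate split.
    """
    rng = np.random.default_rng(seed)
    pending, done = [np.asarray(points, dtype=float)], []
    while pending:
        cluster = pending.pop()
        if len(cluster) < 4:  # too small to split meaningfully
            done.append(cluster)
            continue
        labels = two_means(cluster, rng)
        a, b = cluster[labels == 0], cluster[labels == 1]
        if len(a) == 0 or len(b) == 0:
            done.append(cluster)
            continue
        # Accept the split only if the halves are far apart relative to their spread.
        if coupling(a, b) > ratio * (cohesion(a) + cohesion(b)):
            pending.extend([a, b])
        else:
            done.append(cluster)
    return done


if __name__ == "__main__":
    square = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    data = np.vstack([square, square + 10.0])
    print(len(bisecting_kmeans(data)))
```

Under these assumptions, raising `ratio` makes the procedure more reluctant to split and so yields fewer clusters; the paper's own criterion may behave differently.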

