Computer Science, 2023, Vol. 50, Issue (6): 116-121. doi: 10.11896/jsjkx.220800150

• Granular Computing and Knowledge Discovery •

Three-way k-means Clustering Based on Artificial Bee Colony

XU Tianjie1, WANG Pingxin2, YANG Xibei1

  1. School of Computer, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  2. School of Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  • Received: 2022-08-15  Revised: 2022-11-25  Online: 2023-06-15  Published: 2023-06-06
  • Corresponding author: WANG Pingxin (pingxin_wang@hotmail.com)
  • About author: XU Tianjie (tianjie_xu@163.com), born in 1996, postgraduate. His main research interests include rough sets and three-way decision. WANG Pingxin, born in 1980, Ph.D., associate professor, master's supervisor. His main research interests include matrix analysis, three-way decision, and rough sets.
  • Supported by: National Natural Science Foundation of China (62076111, 61773012) and Natural Science Fund for Colleges and Universities in Jiangsu Province (15KJB110004).

Abstract: Clustering plays a vital role in data mining. Traditional clustering algorithms are hard clustering algorithms: each object either belongs to a cluster or does not. When the data are uncertain, this forced assignment can cause decision errors. The three-way k-means clustering algorithm handles objects with uncertain boundaries more reasonably, but it remains sensitive to the choice of initial cluster centers. To address this problem, this paper combines the artificial bee colony algorithm with the three-way k-means clustering algorithm and proposes a three-way k-means clustering algorithm based on artificial bee colony. The fitness function of a honey source is constructed from an intra-class cohesion function and an inter-class dispersion function, which guides the colony to search globally for high-quality honey sources. Through the cooperation and role exchange among the different kinds of bees, the data set is clustered over multiple iterations to find the optimal honey source position, which serves as the initial cluster centers; alternating iterative clustering then proceeds from these centers. Experiments on UCI data sets show that the method improves the performance indices of the clustering results and verify the effectiveness of the algorithm.
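The search for initial cluster centers described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the paper's implementation: the abstract does not give the fitness formula, so the ratio of inter-class dispersion to intra-class cohesion used here is an assumption, and the names `fitness`, `abc_initial_centers`, `n_sources`, and `limit` are hypothetical. The employed, onlooker, and scout phases follow the generic artificial bee colony scheme, and the sketch assumes `k >= 2`.

```python
import math
import random

def fitness(centers, X):
    """Hypothetical fitness: inter-class dispersion / intra-class cohesion."""
    # Cohesion: mean distance from each point to its nearest center (smaller is tighter).
    cohesion = sum(min(math.dist(x, c) for c in centers) for x in X) / len(X)
    # Dispersion: mean pairwise distance between centers (needs k >= 2).
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    dispersion = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return dispersion / (cohesion + 1e-12)

def abc_initial_centers(X, k, n_sources=10, n_iter=50, limit=10, seed=0):
    """Simplified ABC search; returns (best centers, best-fitness history)."""
    rng = random.Random(seed)
    dims = len(X[0])
    lo = [min(x[d] for x in X) for d in range(dims)]
    hi = [max(x[d] for x in X) for d in range(dims)]

    def random_source():
        # A honey source is one candidate set of k centers sampled from the data.
        return [list(x) for x in rng.sample(X, k)]

    sources = [random_source() for _ in range(n_sources)]
    fits = [fitness(s, X) for s in sources]
    trials = [0] * n_sources
    best_i = max(range(n_sources), key=fits.__getitem__)
    best, best_fit = [c[:] for c in sources[best_i]], fits[best_i]
    history = []

    def try_neighbor(i):
        # Standard ABC move: shift one coordinate relative to another random source.
        j = rng.choice([s for s in range(n_sources) if s != i])
        c, d = rng.randrange(k), rng.randrange(dims)
        cand = [row[:] for row in sources[i]]
        step = rng.uniform(-1, 1) * (sources[i][c][d] - sources[j][c][d])
        cand[c][d] = min(max(cand[c][d] + step, lo[d]), hi[d])  # stay inside data bounds
        f = fitness(cand, X)
        if f > fits[i]:  # greedy selection keeps the better source
            sources[i], fits[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):  # employed bees: one move per source
            try_neighbor(i)
        # Onlooker bees: revisit sources with probability proportional to fitness.
        for i in rng.choices(range(n_sources), weights=fits, k=n_sources):
            try_neighbor(i)
        for i in range(n_sources):  # scout bees: abandon sources that stopped improving
            if trials[i] > limit:
                sources[i] = random_source()
                fits[i], trials[i] = fitness(sources[i], X), 0
        best_i = max(range(n_sources), key=fits.__getitem__)
        if fits[best_i] > best_fit:
            best, best_fit = [c[:] for c in sources[best_i]], fits[best_i]
        history.append(best_fit)
    return best, history
```

Here `X` is a list of equal-length numeric tuples. In the scheme the abstract describes, the returned centers would then seed the three-way k-means iterations in place of a random initialization.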

Key words: Three-way k-means clustering algorithm, Artificial bee colony algorithm, Fitness function, Initial cluster center, Honey source

CLC Number:

  • TP391