计算机科学 ›› 2021, Vol. 48 ›› Issue (10): 197-203.doi: 10.11896/jsjkx.200900061
崔国楠1, 王立松1, 康介祥2, 高忠杰2, 王辉2, 尹伟2
CUI Guo-nan1, WANG Li-song1, KANG Jie-xiang2, GAO Zhong-jie2, WANG Hui2, YIN Wei2
摘要: 模糊聚类方法可以更有效地对复杂数据集进行分析,由于模糊聚类算法的种类繁多且聚类结果会随着输入的聚类个数的不同而改变,使得模糊聚类算法产生的结果不准确,因此,要获得准确的聚类结果必须确定模糊聚类个数k。目前已有的研究主要是利用多种模糊聚类有效性指标来确定最优聚类个数k,但是诸如SSD,PBM等模糊聚类指标会随着划分的聚类个数k的增加而单调递减,导致聚类个数k不准确。为此,文中提出了一种结合多目标优化算法的模糊聚类有效性指标(A Validity Index of Fuzzy Clustering Combined with Multi-objective Optimization Algorithm,OSACF),将模糊聚类度量指标与多目标优化算法(Multi-Objective Optimization Algorithm,MOEA)相结合来解决聚类最优个数k的问题。与使用聚类有效性指标不同,OSACF通过建立聚类个数k与聚类度量指标之间的双目标模型并使用MOEA优化该双目标模型来确定最优聚类个数k,避免了聚类有效性指标趋于单调递减的影响。另一方面,OSACF使用形态形似距离替代传统的欧氏距离度量,避免了聚类形状对计算聚类k值的影响。实验结果表明,OSACF结合MOEA得到的最优模糊聚类个数k比已有的聚类有效性指标获得的结果更准确。
中图分类号:
[1]BEZDEK J C,EHRLICH R,FULL W.FCM:The fuzzy c-means clustering algorithm[J].Computers & Geosciences,1984,10(2/3):191-203. [2]ZHANG P Z,ZHANG H Y.A Review of Features and Labels Dimensionality Reduction Methods of Multi Label Data[J].Journal of Chongqing Technology and Business University(Na-tural Science Edition),2020,37(5):23-29. [3]WANG Z H,WANG S Y,DU H.Improved Fuzzy C-meansClustering Algorithm Based on Density-Sensitive Distance[J].Computer Engineering,2021,47(5):88-96,103. [4]GAN G,MA C,WU J.Data clustering:theory,algorithms,and applications[M].Society for Industrial and Applied Mathema-tics,2020. [5]MATHER P,TSO B.Classification methods for remotely sensed data[M].CRC Press,2016. [6]CUI H,ZHANG K,FANG Y,et al.A clustering validity index based on pairing frequency[J].IEEE Access,2017,5:24884-24894. [7]VAIDYA J,SHAFIQ B,BASU A,et al.Differentially privatenaive bayes classification[C]//2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT).IEEE,2013,1:571-576. [8]BEZDEK J C.Numerical taxonomy with fuzzy sets[J].Journal of Mathematical Biology,1974,1(1):57-71. [9]BEZDEK J C.Cluster validity with fuzzy sets[J].Journal of Cybernetics,1973,3:58-73. [10]XIE X L,BENI G.A validity measure for fuzzy clustering[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,1991,13(8):841-847. [11]FUKUYAMA Y.A new method of choosing the number ofclusters for the fuzzy c-mean method[C]//Proc.5th Fuzzy Syst.Symp.,1989.1989:247-250. [12]ZAHID N,LIMOURI M,ESSAID A.A new cluster-validity for fuzzy clustering[J].Pattern Recognition,1999,32(7):1089-1097. [13]PAKHIRA M K,BANDYOPADHYAY S,MAULIK U.Vali-dity index for crisp and fuzzy clusters[J].Pattern Recognition,2004,37(3):487-501. [14]PAKHIRA M K,BANDYOPADHYAY S,MAULIK U.Astudy of some fuzzy cluster validity indices,genetic clustering and application to pixel classification[J].Fuzzy Sets and Systems,2005,155(2):191-214. [15]WANG R,LAI S,WU G,et al.Multi-clustering via evolutionary multi-objective optimization[J].Information Sciences,2018,450:128-140. [16]ZHANG Y,WANG W,ZHANG X,et al.A cluster validity index for fuzzy clustering[J].Information Sciences,2008,178(4):1205-1218. [17]WU K L,YANG M S.A cluster validity index for fuzzy clustering[J].Pattern Recognition Letters,2005,26(9):1275-1291. [18]MIRJALILI S,JANGIR P,SAREMI S.Multi-objective ant lion optimizer:a multi-objective optimization algorithm for solving engineering problems[J].Applied Intelligence,2017,46(1):79-95. [19]YANG X S.Nature-inspired optimization algorithms[M].Academic Press,2020. [20]GUO Y,WENG G.K-means++ clustering-based active contour model for fast image segmentation[J].Journal of Electronic Imaging,2018,27(6):063013. [21]LI Z,YUAN J,ZHANG W.Fuzzy C-mean algorithm with morphology similarity distance[C]//2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery.IEEE,2009,3:90-94. [22]YANG S,LI K,LIANG Z,et al.A novel cluster validity index for fuzzy c-means algorithm[J].Soft Computing,2018,22(6):1921-1931. [23]CAI X,MEI Z,FAN Z.A decomposition-based many-objective evolutionary algorithm with two types of adjustments for direction vectors[J].IEEE Transactions on Cybernetics,2017,48(8):2335-2348. [24]WANG L,CUI G,ZHOU Q,et al.A multi-clustering method based on evolutionary multiobjective optimization with grid decomposition[J].Swarm and Evolutionary Computation,2020,55:100691. [25]REZAEE B.A cluster validity index for fuzzy clustering[J].Fuzzy Sets and Systems,2010,161(23):3014-3025. [26]DAVIES D L,BOULDIN D W.A cluster separation measure[J].IEEE transactions on Pattern Analysis and Machine Intelligence,1979(2):224-227. |
[1] | 张亚迪, 孙悦, 刘锋, 朱二周. 结合密度参数与中心替换的改进K-means算法及新聚类有效性指标研究 Study on Density Parameter and Center-Replacement Combined K-means and New Clustering Validity Index 计算机科学, 2022, 49(1): 121-132. https://doi.org/10.11896/jsjkx.201100148 |
[2] | 乔颖婧, 高保禄, 史瑞雪, 刘璇, 王朝辉. 融合Tamura纹理特征的改进FCM脑MRI图像分割算法 Improved FCM Brain MRI Image Segmentation Algorithm Based on Tamura Texture Feature 计算机科学, 2021, 48(8): 111-117. https://doi.org/10.11896/jsjkx.200700003 |
[3] | 张天柱, 邹承明. 使用模糊聚类的胶囊网络在图像分类上的研究 Study on Image Classification of Capsule Network Using Fuzzy Clustering 计算机科学, 2019, 46(12): 279-285. https://doi.org/10.11896/jsjkx.190200315 |
[4] | 冀进朝,赵晓威,何飞,胡英慧,白天,李在荣. 基于模糊质心的混合属性数据模糊加权聚类算法 Fuzzy Weighted Clustering Algorithm with Fuzzy Centroid for Mixed Data 计算机科学, 2018, 45(2): 109-113. https://doi.org/10.11896/j.issn.1002-137X.2018.02.019 |
[5] | 邢瑞康, 李成海. 基于直觉模糊集理论的IDS方法研究 Research on Intrusion Detection System Method Based on Intuitionistic Fuzzy Sets 计算机科学, 2018, 45(11A): 344-348. |
[6] | 郑奇斌,刁兴春,曹建军. 结合缺失模式的不完整数据模糊聚类 Fuzzy Clustering Algorithm for Incomplete Data Considering Missing Pattern 计算机科学, 2017, 44(12): 58-63. https://doi.org/10.11896/j.issn.1002-137X.2017.12.011 |
[7] | 耿宗科,王长宾,张振国. 基于模糊c-means与自适应粒子群优化的模糊聚类算法 Fuzzy c-means and Adaptive PSO Based Fuzzy Clustering Algorithm 计算机科学, 2016, 43(8): 267-272. https://doi.org/10.11896/j.issn.1002-137X.2016.08.054 |
[8] | 陈小东,孙力娟,韩崇,郭剑. 基于模糊聚类的数据流概念漂移检测算法 Detecting Concept Drift of Data Stream Based on Fuzzy Clustering 计算机科学, 2016, 43(4): 219-223. https://doi.org/10.11896/j.issn.1002-137X.2016.04.045 |
[9] | 伍大清,郑建国,朱佳俊,孙 莉. 基于人类社交行为的动态多目标优化 Dynamic Multi-objective Particle Swarm Optimization Algorithm Based on Human Social Behavior 计算机科学, 2015, 42(8): 249-252. |
[10] | 冯晨菲,杨燕,王红军,徐英歌,王韬. 一种基于数据相关性的半监督模糊聚类集成方法 Semi-supervised Fuzzy Clustering Ensemble Approach with Data Correlation 计算机科学, 2015, 42(6): 41-45. https://doi.org/10.11896/j.issn.1002-137X.2015.06.009 |
[11] | 郭华峰,赵建民,潘修强. 自适应抑制式模糊C-回归模型算法 Adapting Suppressed Fuzzy C-regression Models Algorithm 计算机科学, 2015, 42(2): 274-276. https://doi.org/10.11896/j.issn.1002-137X.2015.02.057 |
[12] | 董世龙,陈宁江,谭瑛,何子龙,朱莉蓉. 面向云环境的集群资源模糊聚类划分算法的优化 Optimization of Cluster Resource Fuzzy Clustering Partition Algorithm for Cloud Computing 计算机科学, 2014, 41(9): 104-109. https://doi.org/10.11896/j.issn.1002-137X.2014.09.020 |
[13] | 张千,梁鸿,郉永山. 云计算环境下基于模糊聚类的并行调度策略研究 Cloud Parallel Task Scheduling Algorithm Based on Fuzzy Clustering 计算机科学, 2014, 41(8): 75-80. https://doi.org/10.11896/j.issn.1002-137X.2014.08.016 |
[14] | 沈洁,郭莉森. 社会网络的模糊聚类理论及其实践 Social Network’s Fuzzy Cluster Theory and its Applications 计算机科学, 2013, 40(Z11): 63-67. |
[15] | 刘颖,张柏,王爱莲,桑娟,何咏梅. 一种基于半监督集成SVM的土地覆盖分类模型 Ensemble Model with Semisupervised SVM for Remote Sensing Land Cover Classification 计算机科学, 2013, 40(7): 206-210. |
|