计算机科学 ›› 2018, Vol. 45 ›› Issue (7): 190-196.doi: 10.11896/j.issn.1002-137X.2018.07.033
王慧研1,张腾飞1,马福民2
WANG Hui-yan1,ZHANG Teng-fei1,MA Fu-min2
摘要: 粗糙K-means算法中下近似和边界区域权重系数的设置对算法的聚类效果有着重要的影响。传统的粗糙K-means算法及很多改进的粗糙K-means算法对所有类簇的下近似和边界区域设置固定的权重,忽视了簇内数据对象分布差异性的影响。针对这个问题,根据下近似和边界区域的数据对象相对于类簇中心的空间分布情况,提出一种新的基于空间距离自适应权重度量的粗糙K-means算法。该算法在每次迭代过程中,根据每个类簇的下近似和边界区域的数据对象相对于类簇中心的平均距离,综合度量下近似和边界区域对于类簇中心迭代计算的不同重要程度,动态地计算下近似和边界区域的相对权重系数。通过实例验证及实验仿真证明了所提算法的有效性。
中图分类号:
[1]HAN J,KAMBER M.Data mining,concepts and techniques(third edition) [M].California:Morgan Kaufmann Publishers,2011. [2]HARTIGAN J A,WONG M A.A K-Means Clustering Algorithm [J].Journal of the Royal Statistical Society,1979,28(1):100-108. [3]AMORIM R C D.A Survey on Feature Weighting BasedK-Means Algorithms[J].Journal of Classification,2016,33(2):210-242. [4]KHANDARE A,ALVI A S.Survey of Improved k-means Clustering Algorithms:Improvements,Shortcomings and Scope for Further Enhancement and Scalability[C]∥Information Systems Design and Intelligent Applications.New Delhi:Springer,2016:495-503. [5]BEZDEK J C.Fuzzy Mathematics in Pattern Classification[D].Ithaca:Cornell University,1973. [6]BEZDEK J C.Pattern Recognition with Fuzzy Objective function Algorithms [M].New York:Plenum Press,1981. [7]JIANG Z H,LI T T,MIN W F,et al.Fuzzy c-means clustering based on weights and gene expression programming [J].Pattern Recognition Letters,2017,90:1-7. [8]CHEN H P,SHEN X J,LV Y D,et al.A novel automatic fuzzy clustering algorithm based on softpartition and membership information [J].Neurocomputing,2017,236(SI):104-112. [9]KRISHNAPURAM R,KELLER J M.A Possibilistic Approach to Clustering [J].IEEE Transactions on Fuzzy Systems,1993,1(2):98-110. [10]KRISHNAPURAM R,KELLER J M.The Possibilistic C-Means Algorithm:Insights and Recommendations [J].IEEE Transactions on Fuzzy Systems,1996,4(3):385-393. [11]BARNIM,CAPPELLINI V,MECOCCI A.Comments on A Possibilistic Approach to Clustering [J].IEEE Transactions on Fuzzy Systems,1996,4(3):393-396. [12]LINGRAS P,WEST C.Interval Set Clustering of Web Users with Rough K-Means [J].Journal of Intelligent Information Systems,2004,23(1):5-16. [13]PAWLAK Z.Rough Sets[J].International Journal of Computer &Information Sciences,1982,11(5):341-356. [14]MAJI P,PAL S K.RFCM:A Hybrid Clustering AlgorithmUsing Rough and Fuzzy Sets[J].Fundamenta Informaticae,2007,80(4):475-496. [15]MITRA S,BANKA H,PEDRYCZ W.Rough-Fuzzy Collabora-tive Clustering [J].IEEE Transactions on Systems Manand Cybernetics Part B Cybernetics,2006,36(4):795-805. [16]PAUL S,MAJI P.A New Rough-Fuzzy Clustering Algorithmand its Applications [C]∥Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012).New Delhi:Springer,2014:1245-1251. [17]SHI J,LEI Y,ZHOU Y,et al.Enhanced rough-fuzzy c-means algorithm with strict rough setsproperties [J].Applied Soft Computing,2016,46:827-850. [18]ZHANG T F,CHEN L,MA F M.A modified rough c-meansclustering algorithm based on hybrid imbalanced measureof distance and density[J].International Journal of Approximate Reasoning,2014,55(8):1805-1818. [19]WANG H,ZHOU M.A refined rough k-means clustering with hybrid threshold [C]∥International Conference on Rough Sets and Current Trends in Computing.Berlin Heidelberg:Springer,2012:26-35. [20]MALYSZKO D,STEPANIUK J.Rough Entropy Based k-Means Clustering [C]∥Rough Sets,Fuzzy Sets,Data Mining and Granular Computing.Berlin Heidelberg:Springer,2009:406-413. [21]PAL S K,SHANKAR B U,MITRA P.Granular computing,rough entropy and object extraction[J].Pattern Recognition Letters,2005,26(16):2509-2517. [22]PETERS G.Rough Clustering Utilizing the Principle of Indiffe-rence [J].Information Sciences,2014,277(2):358-374. [23]WANG X E,HAN D Q,HAN C Z.Selection method for Para-meters of Rough Fuzzy C-Means Clustering Based on Uncertainty Measurement[J].Journal of Xi’an Jiaotong University,2013,47(6):55-60.(in Chinese) 王学恩,韩德强,韩崇昭.采用不确定性度量的粗糙模糊C均值聚类参数获取方法[J].西安交通大学学报,2013,47(6):55-60. [24]PETERS G.Some refinements of rough k-means clustering [J].Pattern Recognition,2006,39(8):1481-1491. [25]BEZDEK J C,PAL N R.Some New Indexes of Cluster Validity [J].IEEE Transactions on System Manand Cybernetics Part B Cybernetics,1988,28(3):301-315. |
[1] | 柴慧敏, 张勇, 方敏. 基于特征相似度聚类的空中目标分群方法 Aerial Target Grouping Method Based on Feature Similarity Clustering 计算机科学, 2022, 49(9): 70-75. https://doi.org/10.11896/jsjkx.210800203 |
[2] | 程富豪, 徐泰华, 陈建军, 宋晶晶, 杨习贝. 基于顶点粒k步搜索和粗糙集的强连通分量挖掘算法 Strongly Connected Components Mining Algorithm Based on k-step Search of Vertex Granule and Rough Set Theory 计算机科学, 2022, 49(8): 97-107. https://doi.org/10.11896/jsjkx.210700202 |
[3] | 许思雨, 秦克云. 基于剩余格的模糊粗糙集的拓扑性质 Topological Properties of Fuzzy Rough Sets Based on Residuated Lattices 计算机科学, 2022, 49(6A): 140-143. https://doi.org/10.11896/jsjkx.210200123 |
[4] | 方连花, 林玉梅, 吴伟志. 随机多尺度序决策系统的最优尺度选择 Optimal Scale Selection in Random Multi-scale Ordered Decision Systems 计算机科学, 2022, 49(6): 172-179. https://doi.org/10.11896/jsjkx.220200067 |
[5] | 陈于思, 艾志华, 张清华. 基于三角不等式判定和局部策略的高效邻域覆盖模型 Efficient Neighborhood Covering Model Based on Triangle Inequality Checkand Local Strategy 计算机科学, 2022, 49(5): 152-158. https://doi.org/10.11896/jsjkx.210300302 |
[6] | 赵亮, 张洁, 陈志奎. 基于双图正则化的自适应多模态鲁棒特征学习 Adaptive Multimodal Robust Feature Learning Based on Dual Graph-regularization 计算机科学, 2022, 49(4): 124-133. https://doi.org/10.11896/jsjkx.210300078 |
[7] | 孙林, 黄苗苗, 徐久成. 基于邻域粗糙集和Relief的弱标记特征选择方法 Weak Label Feature Selection Method Based on Neighborhood Rough Sets and Relief 计算机科学, 2022, 49(4): 152-160. https://doi.org/10.11896/jsjkx.210300094 |
[8] | 王子茵, 李磊军, 米据生, 李美争, 解滨. 基于误分代价的变精度模糊粗糙集属性约简 Attribute Reduction of Variable Precision Fuzzy Rough Set Based on Misclassification Cost 计算机科学, 2022, 49(4): 161-167. https://doi.org/10.11896/jsjkx.210500211 |
[9] | 王志成, 高灿, 邢金明. 一种基于正域的三支近似约简 Three-way Approximate Reduction Based on Positive Region 计算机科学, 2022, 49(4): 168-173. https://doi.org/10.11896/jsjkx.210500067 |
[10] | 薛占熬, 侯昊东, 孙冰心, 姚守倩. 带标记的不完备双论域模糊概率粗糙集中近似集动态更新方法 Label-based Approach for Dynamic Updating Approximations in Incomplete Fuzzy Probabilistic Rough Sets over Two Universes 计算机科学, 2022, 49(3): 255-262. https://doi.org/10.11896/jsjkx.201200042 |
[11] | 张亚迪, 孙悦, 刘锋, 朱二周. 结合密度参数与中心替换的改进K-means算法及新聚类有效性指标研究 Study on Density Parameter and Center-Replacement Combined K-means and New Clustering Validity Index 计算机科学, 2022, 49(1): 121-132. https://doi.org/10.11896/jsjkx.201100148 |
[12] | 李艳, 范斌, 郭劼, 林梓源, 赵曌. 基于k-原型聚类和粗糙集的属性约简方法 Attribute Reduction Method Based on k-prototypes Clustering and Rough Sets 计算机科学, 2021, 48(6A): 342-348. https://doi.org/10.11896/jsjkx.201000053 |
[13] | 李杉, 许新征. 基于双角度并行剪枝的VGG16优化方法 Parallel Pruning from Two Aspects for VGG16 Optimization 计算机科学, 2021, 48(6): 227-233. https://doi.org/10.11896/jsjkx.200800016 |
[14] | 汤鑫瑶, 张正军, 储杰, 严涛. 基于自然最近邻的密度峰值聚类算法 Density Peaks Clustering Algorithm Based on Natural Nearest Neighbor 计算机科学, 2021, 48(3): 151-157. https://doi.org/10.11896/jsjkx.200100112 |
[15] | 曹林, 于威威. 基于图像分割的自适应窗口双目立体匹配算法研究 Adaptive Window Binocular Stereo Matching Algorithm Based on Image Segmentation 计算机科学, 2021, 48(11A): 314-318. https://doi.org/10.11896/jsjkx.201200264 |
|