Improved K-means Clustering Algorithm Based on DKC in Uncertain Region Environment

Computer Science ›› 2013, Vol. 40 ›› Issue (4): 181-184.

REN Pei-hua, WANG Li-zhen

School of Mathematics and Computer Science, Shanxi Datong University, Datong 037009; School of Education Science and Technology, Shanxi Datong University, Datong 037009

Online: 2018-11-16 Published: 2018-11-16
Funding:
This work was supported by the 2011 Shanxi Province Science and Technology Infrastructure Platform Construction project "Datong Regional Scientific Data Sharing Service Platform" (2011091002-0102).

Abstract

Abstract: This paper proposes U2d-Kmeans, an improved K-means clustering algorithm based on DKC values in an uncertain region environment. First, the algorithm takes the uncertainty of data objects into account and introduces uncertain regions to describe them. Second, it adopts the strengths of the 2d-Kmeans algorithm: the data set is preprocessed to remove isolated points, and the initial cluster centers are determined by a cumulative distance method, which avoids the clustering instability caused by randomly selecting the initial points. Finally, comparative experiments on algorithm effectiveness show that U2d-Kmeans is more objective and effective than the other two algorithms it is compared with.

Key words: Uncertain region, DKC, 2d-distance, Clustering algorithm
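
The pipeline outlined in the abstract (remove isolated points, pick initial centers deterministically from cumulative distances, then run K-means) can be illustrated with a minimal sketch. The abstract does not define the DKC value, the 2d-distance, or the uncertain-region model, so the code below substitutes simple stand-ins that are assumptions of this example and not the authors' method: `remove_isolated` drops points whose mean distance to all other points exceeds a hypothetical `factor` times the overall average, `cumulative_distance_centers` is one plausible reading of a cumulative-distance initialization, and data objects are treated as exact points rather than uncertain regions.

```python
import math

def remove_isolated(points, factor=2.0):
    """Stand-in outlier filter (assumption): drop points whose mean distance
    to all other points exceeds `factor` times the overall average."""
    n = len(points)
    mean_d = [sum(math.dist(p, q) for q in points) / (n - 1) for p in points]
    overall = sum(mean_d) / n
    return [p for p, d in zip(points, mean_d) if d <= factor * overall]

def cumulative_distance_centers(points, k):
    """Stand-in deterministic initialization (assumption): start from the
    point with the smallest cumulative distance to all points, then keep
    adding the point with the largest cumulative distance to the centers
    chosen so far."""
    first = min(points, key=lambda p: sum(math.dist(p, q) for q in points))
    centers = [first]
    while len(centers) < k:
        nxt = max((p for p in points if p not in centers),
                  key=lambda p: sum(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def kmeans(points, k, max_iter=100):
    """Standard K-means iterations started from the deterministic centers."""
    centers = cumulative_distance_centers(points, k)
    clusters = []
    for _ in range(max_iter):
        # Assign every point to its nearest current center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each center as the mean of its cluster.
        new_centers = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Toy data: two tight groups plus one far-away point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11), (100, 100)]
clean = remove_isolated(pts)          # (100, 100) is filtered out
centers, clusters = kmeans(clean, 2)
print(sorted(centers))                # [(0.5, 0.5), (10.5, 10.5)]
```

Because no random seed is involved, repeated runs on the same data give the same clustering, which is the stability property the abstract attributes to the cumulative-distance initialization.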

REN Pei-hua, WANG Li-zhen. Improved K-means Clustering Algorithm Based on DKC in Uncertain Region Environment[J]. Computer Science, 2013, 40(4): 181-184. https://doi.org/

