Computer Science, 2017, Vol. 44, Issue (Z11): 442-447. doi: 10.11896/j.issn.1002-137X.2017.11A.094


UID-DBSCAN Clustering Algorithm of Multi-dimensional Uncertain Data Based on Interval Number

WEI Fang-yuan and HUANG De-cai   

Online: 2018-12-01. Published: 2018-12-01.

Abstract: Research on clustering methods for uncertain data has received increasing attention. Among existing methods, the UIDK-means and U-PAM algorithms inherit the defects of partition-based algorithms: they cannot identify clusters of arbitrary shape and they are sensitive to noise. The FDBSCAN algorithm assumes that the probability distribution function or probability density function of the uncertain data is known, but this information is hard to acquire. To address these shortcomings, a new clustering algorithm for multi-dimensional uncertain data based on interval numbers, named UID-DBSCAN, is proposed. It describes uncertain data with interval data combined with statistical information, and it measures the similarity between uncertain data objects with an interval distance function of low computational complexity. The concepts of interval density, interval density-reachable and interval density-connected are introduced and applied to expand clusters. To make clustering automatic, the density parameters are selected adaptively from the statistical features of the data. Experimental results show that UID-DBSCAN identifies noise effectively, handles clusters of arbitrary shape, and obtains better clustering precision with low computational complexity.

Key words: Uncertain data, Interval number, Clustering algorithm, DBSCAN
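
The general idea in the abstract, density-based clustering over interval-valued objects, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the interval form `[mean - std, mean + std]` in `to_interval`, the endpoint-wise Euclidean distance in `interval_distance`, and the function names themselves. The abstract states that the density parameters are chosen adaptively but does not say how, so the sketch simply takes `eps` and `min_pts` as inputs and runs the standard DBSCAN expansion.

```python
import math
from statistics import mean, pstdev

NOISE = -1


def to_interval(samples):
    """Summarise repeated measurements of one attribute as an interval.

    Assumed form: [mean - std, mean + std]. The paper's exact construction
    is not given in the abstract.
    """
    m, s = mean(samples), pstdev(samples)
    return (m - s, m + s)


def interval_distance(a, b):
    """Endpoint-wise Euclidean distance between two interval vectors.

    An assumed stand-in for the paper's interval distance function.
    Each argument is a sequence of (low, high) pairs, one per dimension.
    """
    return math.sqrt(sum((al - bl) ** 2 + (ah - bh) ** 2
                         for (al, ah), (bl, bh) in zip(a, b)))


def interval_dbscan(objects, eps, min_pts):
    """Standard DBSCAN expansion using interval_distance.

    Returns one label per object: 0, 1, ... for clusters, NOISE for noise.
    """
    n = len(objects)
    labels = [None] * n

    def neighbours(i):
        # Brute-force range query; the object itself counts as a neighbour.
        return [j for j in range(n)
                if interval_distance(objects[i], objects[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be relabelled as a border object
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border object: reachable but not core
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:  # j is a core object, keep expanding
                queue.extend(reach)
        cluster += 1
    return labels


# Two dense groups of 2-D interval objects plus one isolated object.
objects = [
    [(0.9, 1.1), (0.9, 1.1)],
    [(1.0, 1.2), (1.0, 1.2)],
    [(1.1, 1.3), (0.9, 1.1)],
    [(5.0, 5.2), (5.0, 5.2)],
    [(5.1, 5.3), (5.0, 5.2)],
    [(5.0, 5.2), (5.1, 5.3)],
    [(9.0, 9.4), (0.0, 0.4)],
]
labels = interval_dbscan(objects, eps=0.5, min_pts=3)
print(labels)
```

With these hand-picked parameters the first three objects form one cluster, the next three form a second, and the isolated object is labelled as noise. The brute-force neighbour search is quadratic in the number of objects; it is used here only to keep the sketch short.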

