计算机科学 ›› 2017, Vol. 44 ›› Issue (5): 276-279.doi: 10.11896/j.issn.1002-137X.2017.05.050

• 人工智能 • 上一篇    下一篇

基于秩约束密度敏感距离的自适应聚类算法

任永功,刘洋,赵月   

  1. 辽宁师范大学计算机与信息技术学院 大连116029,辽宁师范大学计算机与信息技术学院 大连116029,辽宁师范大学计算机与信息技术学院 大连116029
  • 出版日期:2018-11-13 发布日期:2018-11-13
  • 基金资助:
    本文受国家自然科学基金项目(F020806),辽宁省高等学校优秀人才支持计划项目(LR2015033),辽宁省科技计划项目(2013405003),大连市科技计划项目(2013A16GX116)资助

Adaptive Clustering Algorithm Based on Rank Constraint Density Sensitive Distance

REN Yong-gong, LIU Yang and ZHAO Yue   

  • Online:2018-11-13 Published:2018-11-13

摘要: 传统的聚类算法一般使用欧氏距离获得数据的相似矩阵,在处理一些较复杂的数据时,欧氏距离由于不能反映全局一致性,因此无法有效地描述出数据点的实际分布。提出了一种基于秩约束密度敏感距离(Rank Constraints Density Sensitive Distance,RCDSD) 的自适应聚类算法。该方法首先引入密度敏感距离的相似性度量得到相似矩阵,有效地扩大了不同类数据点之间的距离,缩小了同类数据点间的距离,从而解决了传统聚类算法使用欧氏距离作为相似性度量导致聚类结果出现偏差的弊端;其次,在相似矩阵的拉普拉斯矩阵上施加秩约束,使相似矩阵的连通区域数等于聚类数,直接将数据点划分到正确的类中,得到最终的聚类结果,而不需要执行k-means或其它离散化程序。在人工仿真数据集和真实数据集上进行了大量实验,结果表明,所提算法得到了准确的聚类结果,并提高了聚类性能。

关键词: 密度敏感,相似矩阵,秩约束,聚类

Abstract: The traditional clustering algorithms generally use Euclidean distance to acquire the similar matrix.In some more complex data processing,Euclidean distance doesn’t have the ability of describing the characters of data because it can’t reflect the global consistency.An adaptive clustering algorithm based on rank constraint density sensitive distance (RCDSD) was proposed in this paper.First,a density sensitive distance similarity measure is introduced to acquire the similar matrix which enlarges the distance between the different classes and reduces the distance between the same classes effectively,so as to solve the disadvantages of clustering results deviation of the traditional clustering algorithm based on Euclidean distance.Second,the rank constraint is imposed to the Laplacian matrix of the similarity matrix,thus the number of connected area of the similar matrix is equal to the number of clustering,and the data can be directly divided into the right class and the algorithm can take the final clustering result,while the algorithm does not need to perform k-means or other discrete procedure.Experimental results show that the approach can obtain accurate clustering results and improve the clustering performance on both artificial simulation data sets and real data sets.

Key words: Density sensitive,Similarity matrix,Rank constraints,Clustering

[1] BERKHIN P.Survey of clustering data mining techniques[J].Grouping Multidimensional Data,2002,43(1):25-71 .
[2] HU W Y,SUN Z H,WU Y J.Study of Sampling method on Data Mining and Stream Mining [J].Journal of Computer Research and Development,2011,48(1):45-54.(in Chinese) 胡文瑜,孙志挥,吴英杰.数据挖掘取样方法研究[J].计算机研究与发展,2011,48(1):45-54.
[3] LIU D Y,CHEN H L,QI H,et al.Advance in Spatiotemporal Data Mining[J].Journal of Computer Research and Development,2013,50(2):225-239.(in Chinese) 刘大有,陈慧灵,齐红,等.时空数据挖掘研究进展[J].计算机研究与发展,2013,50(2):225-239.
[4] HUANG Z H,XIANG Y,ZHANG B,et al.An Efficient Method for K-means Clustering[J].Pattern Recognition and Artificial Intelligence,2010,23(4):516-521.(in Chinese) 黄震华,向阳,张波,等.一种进行K-Means聚类的有效方法[J].模式识别与人工智能,2010,23(4):516-521.
[5] SARMA T H,VISWANATH P,REDDY B E.A hybrid ap-proach to speed-up the k-means clustering method[J].International Journal of Machine Learning & Cybernetics,2013,4(2):107-117.
[6] YU H T,JIA M J,WANG H Q,et al.K-means lustering Algorithm Based on Artificial Fish Swarm[J].Computer Science,2012,39(12):60-64.(in Chinese) 于海涛,贾美娟,王慧强,等.基于人工鱼群的优化K-means聚类算法[J].计算机科学,2012,39(12):60-64.
[7] WU J,CUI Z M,SHI Y J,et al.Local Density-based Similarity Matrix Construction for Spectral Clustering[J].Journal on Communication,2013(3):14-22.(in Chinese) 吴健,崔志明,时玉杰,等.基于局部密度构造相似矩阵的谱聚类算法[J].通信学报,2013(3):14-22.
[8] HUANG L,LI R,CHEN H,et al.Detecting network com-munities using regularized spectral clustering algorithm[J].Artificial Intelligence Review,2014,41(4):579-594.
[9] SHI J,MAILIK J.Normalized cuts and image segmentation[J].IEEE transactions on Pattern Analysis and Machine Intelligence,2000,22(8):888-905.
[10] LIANG H,XUEPING G U.Black-Start Network PartitioningBased on Spectral Clustering [J].Power System Technology,2013,37(2):372-377.
[11] PAN X Y,LIU F,JIAO L C.Density Sensitive Based Multi-Agent Evolutionary Clustering algorithm [J].Journal of Software,2010,21(10):2420-2431.(in Chinese) 潘晓英,刘芳,焦李成.密度敏感的多智能体进化聚类算法[J].软件学报,2010,21(10):2420-2431.
[12] LIU Y,CAI D,LI C.Density Sensitive Hashing [J].IEEE Transactions on Cybernetics,2012,44(8):1362-1371.
[13] YANG P,ZHU Q,HUANG B.Spectral clustering with density sensitive similarity function [J].Knowledge-Based Systems,2011,24(5):621-628.
[14] CHUNG F R K.Spectral graph Theory [J].Regional Confe-rence,1997,7(1):158.
[15] MOHAR B.The Laplacian spectrum of graphs[J].Graph Theory,Combinatorics,and Applications,1991,2(7):871-898.
[16] FAK K.On the theorem of weyl concerning eigenvalues of linear transformations[J].Proceedings of National Academy of Scien-ces,1950,35(11):652-655.
[17] BOYD,STEPHEN,VANDENBERGHE,et al.Convex Optimi-zation[M]∥Cambridge University Press.2004:1859.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!