Computer Science ›› 2016, Vol. 43 ›› Issue (12): 209-212. doi: 10.11896/j.issn.1002-137X.2016.12.038

• Data Mining •

Density Self-adaption Semi-supervised Spectral Clustering Algorithm

ZHOU Hai-song and HUANG De-cai

  1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding:
    Supported by the Special Fund for Public Welfare Industry Research of the Ministry of Water Resources (201401044)

Abstract: Spectral clustering is an emerging clustering algorithm, and the definition of similarity between data points plays a crucial role in its clustering results. Traditional spectral clustering algorithms typically use the Gaussian kernel function as the similarity function, but this often fails to give good results on multi-density data. Based on a newly defined similarity function, a density self-adaptive semi-supervised clustering algorithm is proposed. Drawing on the pairwise-constraint theory of semi-supervised clustering, the algorithm uses prior information to adaptively adjust the similarity between sample points, thereby improving clustering accuracy. In simulation experiments, the algorithm achieves good results on both synthetic and real-world datasets.

Key words: Density, Semi-supervised, Spectral clustering
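
The pipeline outlined in the abstract (a density-aware similarity, adjusted by pairwise constraints, then spectral clustering) can be illustrated with a minimal sketch. The abstract does not state the paper's new similarity function, so this sketch substitutes the well-known self-tuning local-scaling similarity as a stand-in, and it assumes the common convention of setting must-link similarities to 1 and cannot-link similarities to 0. All function names, parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def gaussian_affinity(X, sigma=1.0):
    """Classic Gaussian-kernel similarity: one global bandwidth sigma."""
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def local_scaling_affinity(X, k=3):
    """Stand-in density-aware similarity (assumption, not the paper's formula):
    sigma_i is the distance from point i to its k-th nearest neighbour."""
    sq = pairwise_sq_dists(X)
    k = min(k, len(X) - 1)
    sigma = np.sqrt(np.sort(sq, axis=1)[:, k])  # column 0 is the point itself
    W = np.exp(-sq / (np.outer(sigma, sigma) + 1e-12))
    np.fill_diagonal(W, 0.0)
    return W


def apply_pairwise_constraints(W, must_link=(), cannot_link=()):
    """Return a copy of W with must-link pairs set to 1, cannot-link to 0."""
    W = W.copy()
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W


def spectral_clustering(W, n_clusters, n_iter=20):
    """Normalized spectral clustering on a precomputed affinity matrix W."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^-1/2 W D^-1/2
    _, vecs = np.linalg.eigh(L)                        # eigenvalues ascending
    Y = vecs[:, -n_clusters:]                          # top eigenvectors
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    # Plain k-means with deterministic farthest-point initialisation.
    centres = [Y[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(Y[np.argmax(dist)])
    centres = np.array(centres)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # keep the old centre if a cluster empties
                centres[c] = Y[labels == c].mean(axis=0)
    return labels


# Illustrative data: two grid-shaped clusters with very different densities.
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
X = np.vstack([0.1 * grid, 10.0 + grid])
W = local_scaling_affinity(X, k=3)
W = apply_pairwise_constraints(W, must_link=[(0, 8)], cannot_link=[(0, 9)])
labels = spectral_clustering(W, n_clusters=2)
print(labels)
```

Swapping `local_scaling_affinity` for `gaussian_affinity` in the last lines shows the design trade-off: a single global `sigma` has to suit both the dense and the sparse cluster at once, whereas a per-point scale adapts to local density.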

