Computer Science ›› 2015, Vol. 42 ›› Issue (Z11): 55-57.

• Intelligent Computing •

Mixed Data Affinity Propagation Clustering Algorithm Based on Dimensional Attribute Distance

HUANG De-cai and QIAN Chao-kai

  1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the Public Welfare Industry Research Special Fund of the Ministry of Water Resources (201401044)

Abstract: Affinity propagation clustering cannot handle data sets with mixed (numeric and categorical) attributes. To address this, a new distance measure is proposed and applied to affinity propagation, yielding a mixed-attribute affinity propagation clustering algorithm based on dimensional attribute distance. Unlike traditional clustering algorithms, it does not need to compute virtual cluster center points, and it takes into account the influence of the overall distribution of the data set on the clustering result. The algorithm was validated on two mixed-attribute data sets from the UCI repository and compared with the classical K-Prototypes and K-Modes algorithms. The experimental results show that it achieves better clustering quality, measured by clustering entropy, and better execution efficiency than both.

Key words: Attribute distance, Mixed attributes, Affinity propagation, Clustering
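
The abstract does not give the formula of the proposed distance, so the following is only a minimal sketch of the general idea: build pairwise similarities from a per-dimension mixed-attribute distance and pass them to affinity propagation, which selects exemplars among the real data points instead of computing virtual centers. The distance used here (range-normalized difference for numeric attributes, 0/1 mismatch for categorical ones) is an assumed stand-in, not the measure defined in the paper. The names `mixed_distance`, `similarity_matrix`, and `affinity_propagation`, the median preference, and the damping value are likewise illustrative assumptions; the message updates follow Frey and Dueck's standard formulation.

```python
def mixed_distance(a, b, numeric_idx, ranges):
    """Average per-dimension distance (assumed stand-in, not the paper's measure)."""
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in numeric_idx:
            # Numeric attribute: absolute difference scaled by the attribute's range.
            total += abs(x - y) / ranges[j] if ranges[j] else 0.0
        else:
            # Categorical attribute: simple mismatch.
            total += 0.0 if x == y else 1.0
    return total / len(a)


def similarity_matrix(data, numeric_idx):
    """Negative distances, with the median off-diagonal value as the preference."""
    n = len(data)
    ranges = {j: max(r[j] for r in data) - min(r[j] for r in data)
              for j in numeric_idx}
    s = [[-mixed_distance(data[i], data[k], numeric_idx, ranges)
          for k in range(n)] for i in range(n)]
    off_diag = sorted(s[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diag[len(off_diag) // 2]
    for i in range(n):
        s[i][i] = preference
    return s


def affinity_propagation(s, damping=0.5, iters=200):
    """Plain affinity propagation on a similarity matrix (fixed iteration count)."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            max1 = vals[first]
            max2 = max(vals[k] for k in range(n) if k != first)
            for k in range(n):
                # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
                new = s[i][k] - (max2 if k == first else max1)
                r[i][k] = damping * r[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    # Each point is assigned to the candidate k that maximizes a(i,k) + r(i,k).
    labels = [max(range(n), key=lambda k, i=i: a[i][k] + r[i][k])
              for i in range(n)]
    exemplars = sorted(set(labels))
    return exemplars, labels


# Toy mixed-attribute rows: (numeric value, categorical value).
data = [(1.0, "a"), (3.0, "a"), (5.0, "b")]
s = similarity_matrix(data, numeric_idx={0})
exemplars, labels = affinity_propagation(s)
```

Because exemplars are indices into `data`, every cluster representative is an actual record, which is what makes the approach usable when a mean of categorical values is undefined. On data with many tied similarities, which is common with categorical attributes, plain affinity propagation may oscillate, so a higher damping value or a convergence check would be a reasonable refinement.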

