Computer Science ›› 2010, Vol. 37 ›› Issue (11): 234-238.

• Artificial Intelligence •

A Clustering Algorithm for Mixed Data Based on Clustering Ensemble Techniques

罗会兰 (LUO Hui-lan), 危辉 (WEI Hui)

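  1. (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000); (Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433)
  • Publication date: 2018-12-01 Release date: 2018-12-01
  • Funding:
    This work was supported by the National 973 Program (No. 2010CB327900), the National Natural Science Foundation of China (No. 60303007), the Shanghai Science and Technology Development Fund (No. 08511501703), and the Open Project of the Shanghai Key Laboratory of Intelligent Information Processing (No. IIPL-09-009).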

Clustering Algorithm for Mixed Data Based on Clustering Ensemble Technique

LUO Hui-lan,WEI Hui   

  • Online:2018-12-01 Published:2018-12-01

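Abstract (translated from Chinese): This paper proposes CBEST, a clustering algorithm for mixed data based on ensemble and spectral clustering techniques. It uses a clustering ensemble to derive the similarity between mixed-data objects, and this similarity measure makes no assumptions about the distribution of the feature values. A spectral clustering algorithm is then applied to the resulting similarity matrix to obtain the clustering of the mixed data. Experimental results on a large amount of real and artificial data verify the effectiveness of CBEST and its robustness to noise. Comparative studies against other mixed-data clustering algorithms also demonstrate the superior performance of CBEST. CBEST can also incorporate prior knowledge effectively, setting the weights of different attributes in clustering by adjusting parameters.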

Keywords (translated from Chinese): clustering ensemble, mixed data, similarity measure

Abstract: A clustering algorithm named CBEST, based on ensemble and spectral techniques, is presented; it works well for data with mixed numeric and categorical features. A similarity measure based on clustering ensembles is adopted to define the similarity between pairs of objects; it makes no assumptions about the underlying distributions of the feature values. A spectral clustering algorithm is applied to the similarity matrix to extract a partition of the data. The performance of CBEST was studied on artificial and real data sets. The results demonstrate the effectiveness of the algorithm on mixed-data clustering tasks and its robustness to noise. Comparisons with other related clustering schemes illustrate its superior performance. Moreover, CBEST can incorporate prior knowledge effectively to set the weights of different features in clustering.

Key words: Clustering ensemble, Mixed data, Similarity measure
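
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of an ensemble-based similarity for mixed data, not the authors' CBEST procedure. It assumes, hypothetically, that each feature induces one base partition: a categorical feature groups objects by value, and a numeric feature is split by equal-width binning. The similarity of two objects is then the weighted fraction of partitions that place them in the same cluster (a co-association matrix). The function names, the binning rule, the toy data, and the `weights` parameter are illustrative assumptions; the spectral clustering step that would consume the resulting matrix is omitted.

```python
# Illustrative sketch only: an ensemble (co-association) similarity for
# mixed numeric/categorical data. All names and choices here are assumptions,
# not taken from the paper.

def partition_from_categorical(values):
    """Assumed base clustering: each distinct category is one cluster."""
    return list(values)

def partition_from_numeric(values, bins=2):
    """Assumed base clustering: equal-width binning into `bins` clusters."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features
    return [min(int((v - lo) / width), bins - 1) for v in values]

def co_association(partitions, weights=None):
    """Weighted fraction of partitions in which each pair shares a cluster."""
    n = len(partitions[0])
    weights = weights or [1.0] * len(partitions)
    total = sum(weights)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            agree = sum(w for labels, w in zip(partitions, weights)
                        if labels[i] == labels[j])
            sim[i][j] = agree / total
    return sim

# Toy mixed data (hypothetical): (color, size, shape) for four objects.
colors = ["red", "red", "blue", "blue"]
sizes = [1.0, 1.2, 5.0, 4.8]
shapes = ["round", "round", "square", "round"]

partitions = [
    partition_from_categorical(colors),
    partition_from_numeric(sizes, bins=2),
    partition_from_categorical(shapes),
]

sim = co_association(partitions)
for row in sim:
    print([round(s, 2) for s in row])
```

Passing, for example, `weights=[2, 1, 1]` to `co_association` makes the first feature count twice as much, which is one simple way to picture how prior knowledge about attribute importance could enter such a similarity. The matrix `sim` is symmetric with ones on the diagonal, so it could serve as a precomputed affinity for a spectral clustering routine.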
