Computer Science ›› 2013, Vol. 40 ›› Issue (8): 191-195.

• Artificial Intelligence •

A Fast and Robust Clustering Algorithm for Finite Gaussian Mixture Models

HU Qing-hui, DING Li-xin, LU Yu-jing, HE Jin-rong

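  1. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072; State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072; Department of Information Engineering, Guilin University of Aerospace Technology, Guilin 541004; State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072
  • Publication date: 2018-11-16 Release date: 2018-11-16
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60975050), the Fundamental Research Funds for the Central Universities (6081014), and the Wuhan University Graduate Independent Research Project (2012211020209).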

Rapid Robust Clustering Algorithm for Gaussian Finite Mixture Model

HU Qing-hui,DING Li-xin,LU Yu-jing and HE Jin-rong   

  • Online:2018-11-16 Published:2018-11-16

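Abstract: Finite mixture model clustering is an effective clustering method based on a probabilistic model. For the clustering algorithm of the Gaussian mixture model, entropy penalty operators are applied separately to the mixing coefficients of the model components and to the probability coefficients of each sample belonging to a component. This provides two-level control over the number of model components and quickly eliminates invalid components, so that the algorithm converges to a definite solution within very few iterations. Traditional algorithms are very sensitive to the initial settings (the number of components c must be specified in advance), which easily causes the EM algorithm to fall into a local optimum or to converge to the boundary of the solution space. The algorithm in this paper places no special requirements on the initial values, and experiments show that it is highly robust.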

Keywords: Gaussian mixture model, clustering, information entropy, EM algorithm

Abstract: Finite mixture model clustering is an effective clustering method based on a probabilistic model. For clustering with the Gaussian mixture model, this paper imposes entropy penalty operators on the mixing coefficients of the components and on the component labels of the samples, respectively. This yields two-level control over the number of components and rapidly removes the invalid ones, so the algorithm converges to a definite solution in only a few iterations. The traditional algorithm is very sensitive to its initial values (for example, the number of components must be set in advance), which often causes the EM algorithm to fall into a local optimum or to converge to the boundary of the solution space. In contrast, the new algorithm has no special requirements for initialization, and the experiments show that it is very robust.

Key words: Gaussian finite mixture model,Clustering,Entropy,EM algorithm
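
The abstract describes the first level of control, an entropy penalty on the mixing coefficients, only in words. The sketch below illustrates one common way such a penalty can act inside an EM iteration. It is not taken from the paper: it assumes the penalized objective is the log-likelihood plus β·n·Σ_k α_k ln α_k, for which the constrained M-step gives the fixed-point update α_k ← α_k^EM + β·α_k (ln α_k − Σ_s α_s ln α_s). The function names, the parameter `beta`, and the pruning threshold `min_weight` are hypothetical. The second-level penalty on the sample membership probabilities is not sketched, because the abstract does not state its form.

```python
import numpy as np


def penalized_weight_update(resp, alpha_old, beta):
    """One entropy-penalized update of the mixing coefficients (assumed form).

    resp:      (n, c) membership probabilities from the E-step, rows sum to 1
    alpha_old: (c,) current mixing coefficients, all > 0, summing to 1
    beta:      hypothetical penalty strength; beta = 0 gives the plain EM update
    """
    alpha_em = resp.mean(axis=0)                    # standard EM update
    log_alpha = np.log(alpha_old)
    neg_entropy = np.sum(alpha_old * log_alpha)     # sum_s alpha_s * ln(alpha_s)
    # Components whose log-weight is below the weighted average lose mass,
    # the others gain it; the correction terms sum to zero.
    return alpha_em + beta * alpha_old * (log_alpha - neg_entropy)


def prune_components(alpha, resp, min_weight):
    """Drop components with weight <= min_weight and renormalize the rest."""
    keep = alpha > min_weight
    alpha_kept = alpha[keep] / alpha[keep].sum()
    resp_kept = resp[:, keep]
    row_sums = np.maximum(resp_kept.sum(axis=1, keepdims=True), 1e-300)
    return alpha_kept, resp_kept / row_sums, keep


if __name__ == "__main__":
    n = 20
    alpha = np.array([0.6, 0.35, 0.05])
    resp = np.tile(alpha, (n, 1))                   # toy E-step output
    alpha_new = penalized_weight_update(resp, alpha, beta=1.0)
    alpha_kept, resp_kept, keep = prune_components(alpha_new, resp, 1.0 / n)
    print(alpha_new, keep, alpha_kept)
```

In this toy run the smallest component is pushed below the threshold after a single update and is removed, which is the kind of rapid elimination of invalid components the abstract refers to. How the penalty strength is chosen or adapted over iterations is left open here.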

