Computer Science (计算机科学), 2015, Vol. 42, Issue (5): 57-61. doi: 10.11896/j.issn.1002-137X.2015.05.012
方 玲,陈松灿
FANG Ling and CHEN Song-can
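Abstract: Traditional clustering methods, such as k-means and fuzzy c-means, usually do not distinguish how much each data feature contributes to, or matters for, the clustering. On high-dimensional data they therefore often cluster poorly, which is attributed to ignoring the high correlation or redundancy among high-dimensional features. Introducing a weight for each feature and optimizing the clustering objective both yields the corresponding weights automatically and improves clustering performance. However, feature weights learned without supervision do not necessarily match the relative importance (or preference) among features that the user expects. This paper therefore uses preferences actually supplied by the user to design a clustering method that reflects feature preferences. It extends the existing global-weighting preference clustering method, whose weights are independent of the individual clusters, to a cluster-dependent local feature-weighting method, which remedies the shortcomings of the former and improves the performance of preference clustering.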
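To make the idea of cluster-dependent (local) feature weights concrete, the following is a minimal sketch, not the authors' algorithm. It assumes an entropy-regularized weight update, in which each cluster's weight for a feature is proportional to exp(-D/gamma), where D is that feature's within-cluster dispersion. The function name `local_weighted_kmeans`, the parameter `gamma`, the evenly spaced initialization, and the toy data are all illustrative assumptions, and the user-preference constraints described in the abstract are deliberately left out.

```python
import math


def local_weighted_kmeans(X, k, gamma=1.0, iters=20):
    """k-means with one feature-weight vector per cluster (illustrative sketch).

    Assumption: weights come from an entropy-regularized update,
    w[j][f] proportional to exp(-D[j][f] / gamma), where D[j][f] is the
    dispersion of feature f inside cluster j. No preference constraints.
    """
    n, d = len(X), len(X[0])
    # Simplistic deterministic initialization: evenly spaced data points.
    centers = [list(X[(i * n) // k]) for i in range(k)]
    weights = [[1.0 / d] * d for _ in range(k)]
    labels = [0] * n

    for _ in range(iters):
        # Assignment step: nearest center under that cluster's own weights.
        for i, x in enumerate(X):
            dists = [
                sum(w * (a - c) ** 2
                    for w, a, c in zip(weights[j], x, centers[j]))
                for j in range(k)
            ]
            labels[i] = dists.index(min(dists))

        for j in range(k):
            members = [X[i] for i in range(n) if labels[i] == j]
            if not members:
                continue  # empty cluster: keep its previous center and weights
            # Center update: per-feature mean of the cluster members.
            centers[j] = [sum(col) / len(members) for col in zip(*members)]
            # Per-feature dispersion inside this cluster.
            disp = [
                sum((x[f] - centers[j][f]) ** 2 for x in members)
                for f in range(d)
            ]
            # Weight update: softmax of -disp / gamma (shifted for stability).
            m = min(disp)
            e = [math.exp(-(v - m) / gamma) for v in disp]
            s = sum(e)
            weights[j] = [v / s for v in e]

    return labels, centers, weights


# Toy data: feature 0 separates two groups, feature 1 is widely scattered.
X = [
    [0.0, 0.0], [0.1, 5.0], [0.2, -5.0], [0.1, 2.0],
    [10.0, 0.0], [10.1, 5.0], [10.2, -5.0], [10.1, -2.0],
]
labels, centers, weights = local_weighted_kmeans(X, k=2)
```

On this toy data each cluster places almost all of its weight on feature 0, because feature 1 has a much larger within-cluster dispersion. A preference-aware variant would additionally restrict the weight update so that the learned weights respect the user's stated ordering of features; how the paper does this is not specified in the abstract.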