Computer Science ›› 2015, Vol. 42 ›› Issue (1): 232-235,267.doi: 10.11896/j.issn.1002-137X.2015.01.051

Previous Articles     Next Articles

Improved AP Algorithm:M-AP Clustering Algorithm

GAN Yue-song, CHEN Xiu-hong and CHEN Xiao-hui   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Affinity propagation(AP) clustering simultaneously considers all data points as potential exemplars.It takes similarity between pairs of data points as input measures,and clusters gradually during the message-passing procedure.But the result of AP clustering algorithm in the data set of complex structure(non-group)is not very good.Therefore,we proposed a new clustering algorithm by adding a merge process on the basis of AP clustering algorithm,called M-AP algorithm which can effectively solve this kind of problem.When the number of samples is very large, the problem of large sample can be effectively solved by using CVM compression algorithm.

Key words: Clustering algorithm,Affinity propagation,Merge-AP,Merge process,CVM compress

[1] Frey B J,Dueck D.Clustering by passing messages between data points[J].Science,2007,5(5814):972-976
[2] Pollard D.Strong consistency of Kmeans clustering[J].Ailnals of Statistics,1981,9(1):135-140
[3] Zhang T,Ramakrishnan R,Livny M.BIRCH.An efficient data clustering method for very large databases[J].Montreal,1996,6(96):103-114
[4] Pal N R,Bezdek J C.On cluster validity for the fuzzy c-means model[J].IEEE Transactions on Fuzzy Systems,1995,3(3):370-379
[5] Tsang I W,Kwok J T,Cheung P M.Core vector machines:fast SVM training on very large data sets[J].Journal of Machine Learning Research,2005,8(6):363-392
[6] Deng Zhao-hong,Choi K S,Chung F L,et al.Enhanced soft sub-space clustering integrating within cluster and between blusterInformation[J].Pattern Recognition,2010,43(3):767-781
[7] Liu Jun,Mohammed J,Carter J,et al.Distance based clustering of CGH data[J].Bioinformatics,2006,22(16):1971-1978
[8] Chen W Y,Song Y,Bai H,et al.Parallel spectral clustering in distributed systems[J].IEEE Transactions onPattern Analysis and Machine Intelligence,2011,33(3):568-586
[9] Wu M,Schlkopf B.A local learning approach for clustering[J].Proc.Conf.Neural Information Processing Systems,2007(1):1529-1536
[10] Papadimitriou C H,Steiglitz K.Combinatorial Optimization:Algorithms and Complexity[M].Dover Publications,1998
[11] Optdigits数据集.https://archive.ics.uci.edu/ml/machine-learning-databases/optdigits/
[12] Iris数据集.http://archive.ics.uci.edu/ml/datasets/Iris
[13] Clean数据集.http://archive.ics.uci.edu/ml/machine-learning-databases/musk/
[14] Soybean数据集.http://archive.ics.uci.edu/ml/datasets/Soybean+(Small)
[15] 孙吉贵,刘杰,赵连宇.聚类算法研究[J].Journal of Software,2008,19(1):48-61
[16] 肖宇,于剑.基于近邻传播算法的半监督聚类[J].Journal of Software,2008,19(11):2803-2813
[17] 牟廉明,詹德川,黎铭,等.基于结构相似性和压缩变换的聚类方法[J].Pattern Recognition and Artificial Intelligence,2011,24(5):637-644
[18] 于剑,程乾生.模糊聚类方法中的最佳聚类数的搜索范围[J].中国科学E辑,2002,2(2):274-280
[19] 杨善林,李永森,胡笑旋,等.K-means算法中的k值优化问题研究[J].系统工程理论与实践,2006,26(2):97-101
[20] 周水庚,周傲英,曹晶.基于数据分区的DBSCAN算法[J].计算机研究与发展,2000(10):1153-1159
[21] Ester M.A density-based algorithm for discovering clusters in large spatial databases with noise[C]∥ Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining,Portland:AAAI Press.1996:226-239
[22] 计华,张化祥,孙晓燕.基于最近邻原则的半监督聚类算法 [J].计算机工程与设计,2011,2(7):2455-2459
[23] 李昆仑,曹铮,曹丽苹.半监督聚类的若干新进展[J].模式识别与人工智能,2009(5):735-742
[24] DBSCAN算法代码.http://wenku.baidu.com/view/47a26ebbba0d4a7302763a9c.html

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] LEI Li-hui and WANG Jing. Parallelization of LTL Model Checking Based on Possibility Measure[J]. Computer Science, 2018, 45(4): 71 -75, 88 .
[2] XIA Qing-xun and ZHUANG Yi. Remote Attestation Mechanism Based on Locality Principle[J]. Computer Science, 2018, 45(4): 148 -151, 162 .
[3] LI Bai-shen, LI Ling-zhi, SUN Yong and ZHU Yan-qin. Intranet Defense Algorithm Based on Pseudo Boosting Decision Tree[J]. Computer Science, 2018, 45(4): 157 -162 .
[4] WANG Huan, ZHANG Yun-feng and ZHANG Yan. Rapid Decision Method for Repairing Sequence Based on CFDs[J]. Computer Science, 2018, 45(3): 311 -316 .
[5] SUN Qi, JIN Yan, HE Kun and XU Ling-xuan. Hybrid Evolutionary Algorithm for Solving Mixed Capacitated General Routing Problem[J]. Computer Science, 2018, 45(4): 76 -82 .
[6] ZHANG Jia-nan and XIAO Ming-yu. Approximation Algorithm for Weighted Mixed Domination Problem[J]. Computer Science, 2018, 45(4): 83 -88 .
[7] WU Jian-hui, HUANG Zhong-xiang, LI Wu, WU Jian-hui, PENG Xin and ZHANG Sheng. Robustness Optimization of Sequence Decision in Urban Road Construction[J]. Computer Science, 2018, 45(4): 89 -93 .
[8] LIU Qin. Study on Data Quality Based on Constraint in Computer Forensics[J]. Computer Science, 2018, 45(4): 169 -172 .
[9] ZHONG Fei and YANG Bin. License Plate Detection Based on Principal Component Analysis Network[J]. Computer Science, 2018, 45(3): 268 -273 .
[10] SHI Wen-jun, WU Ji-gang and LUO Yu-chun. Fast and Efficient Scheduling Algorithms for Mobile Cloud Offloading[J]. Computer Science, 2018, 45(4): 94 -99, 116 .