Computer Science, 2017, Vol. 44, Issue (1): 65-70. doi: 10.11896/j.issn.1002-137X.2017.01.012


Improved Multi-view Clustering Ensemble Algorithm

DENG Qiang, YANG Yan and WANG Hao   

Online: 2018-11-13 Published: 2018-11-13

Abstract: In recent years, data mining and machine learning algorithms for big data have become increasingly important. With the emergence of multi-view data, multi-view clustering has become an important clustering method. However, many existing multi-view clustering algorithms are sensitive to parameter settings and to the dataset itself, so their clustering results are often unstable. To overcome this problem, we present a new multi-view clustering ensemble algorithm based on the multi-view K-means clustering algorithm. The algorithm uses ensemble techniques to improve the performance of multi-view K-means, increasing the accuracy, robustness, and stability of the clustering results. Because a single computer has limited computational resources, it cannot process very large datasets. To improve the efficiency of multi-view clustering, we implement a distributed multi-view clustering ensemble algorithm based on distributed processing technology. Experimental results show that the proposed approach is more efficient when processing large datasets and is suitable for multi-view clustering in big data environments.

Key words: Multi-view clustering, Clustering ensemble, Distributed computation, Parallelization
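
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of a clustering ensemble over multiple views, not the authors' method. It assumes plain Lloyd's K-means run separately on each view with several seeds, a co-association matrix as the combination step, and a simple threshold-plus-connected-components consensus. The toy data, the function names (`kmeans`, `co_association`, `consensus`), and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import random

def kmeans(points, k, seed, iters=20):
    """Plain Lloyd's K-means on a list of equal-length tuples; returns labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct data points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def co_association(partitions, n):
    """co[i][j] = fraction of base partitions that put points i and j together."""
    co = [[0.0] * n for _ in range(n)]
    for part in partitions:
        for i in range(n):
            for j in range(n):
                if part[i] == part[j]:
                    co[i][j] += 1.0 / len(partitions)
    return co

def consensus(co, threshold=0.5):
    """Connected components of the graph linking pairs with co-association > threshold."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical toy data: six objects described by two views.
view_a = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
view_b = [(5.0,), (5.5,), (6.0,), (-5.0,), (-5.5,), (-6.0,)]

# Base partitions: K-means on each view with several seeds.
# These runs are independent, so they could be distributed across workers.
partitions = [kmeans(view, k=2, seed=s) for view in (view_a, view_b) for s in range(5)]

co = co_association(partitions, n=len(view_a))
labels = consensus(co)
print(labels)
```

Because the co-association matrix only records whether two objects share a cluster, the arbitrary label numbering of each base run does not matter. A few poor base partitions change individual entries only slightly, which is the sense in which an ensemble can stabilize the result.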

