Computer Science ›› 2011, Vol. 38 ›› Issue (10): 166-168.
ZHAO Wei-zhong, MA Hui-fang, FU Yan-xiang, SHI Zhong-zhi
Abstract: Data clustering has been studied extensively over the past decades, and a large body of methods and theory has been developed. However, with the development of database technology and the growing popularity of the Internet, research on data clustering faces new challenges, such as massive datasets and new computing environments. We carried out an in-depth study of a parallel k-means algorithm based on Hadoop, a new cloud computing platform, and show how to design parallel k-means algorithms on Hadoop. Experiments on datasets of different sizes demonstrate that the proposed algorithm performs well in terms of speedup, scaleup, and sizeup, making it suitable for clustering very large datasets.
Key words: Cloud computing, Hadoop, Parallel k-means, MapReduce
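The abstract does not spell out the design, so the following is only a minimal single-process sketch of how one k-means iteration is commonly split into MapReduce-style phases. It is not the authors' implementation and does not use the Hadoop API. The function names (`nearest`, `map_phase`, `combine`, `kmeans_iteration`) and the representation of input splits as plain Python lists are assumptions made for illustration.

```python
def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])))


def map_phase(split, centers):
    """Map: emit (center index, (point, 1)) for every point in one input split."""
    for point in split:
        yield nearest(point, centers), (point, 1)


def combine(pairs):
    """Combine/reduce: add up coordinate sums and counts per center index."""
    acc = {}
    for idx, (vec, n) in pairs:
        if idx in acc:
            s, c = acc[idx]
            acc[idx] = ([a + b for a, b in zip(s, vec)], c + n)
        else:
            acc[idx] = (list(vec), n)
    return acc


def kmeans_iteration(splits, centers):
    """One iteration: map and combine per split, then reduce to new centers."""
    partials = []
    for split in splits:  # each split stands in for one mapper (assumption)
        partials.extend(combine(map_phase(split, centers)).items())
    totals = combine(partials)  # stands in for the reduce step
    new_centers = []
    for i, old in enumerate(centers):
        if i in totals:
            s, n = totals[i]
            new_centers.append(tuple(x / n for x in s))
        else:
            new_centers.append(tuple(old))  # keep a center that attracted no points
    return new_centers
```

In this sketch, `kmeans_iteration` would be called repeatedly, feeding each result back in as `centers`, until the centers stop moving. On a real cluster each iteration would typically be one job, with the per-split combining done locally to reduce the data shuffled between nodes.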
ZHAO Wei-zhong, MA Hui-fang, FU Yan-xiang, SHI Zhong-zhi. Research on Parallel k-means Algorithm Design Based on Hadoop Platform[J]. Computer Science, 2011, 38(10): 166-168.
URL: https://www.jsjkx.com/EN/
https://www.jsjkx.com/EN/Y2011/V38/I10/166