计算机科学 ›› 2017, Vol. 44 ›› Issue (3): 23-26.doi: 10.11896/j.issn.1002-137X.2017.03.006

• 2015全国高性能计算学术年会 • 上一篇    下一篇

一种MapReduce架构下基于遗传算法的K-Medoids聚类

赖向阳,宫秀军,韩来明   

  1. 天津大学计算机科学与技术学院 天津300072天津市认知计算与应用重点实验室 天津300072,天津大学计算机科学与技术学院 天津300072天津市认知计算与应用重点实验室 天津300072,天津大学计算机科学与技术学院 天津300072天津市认知计算与应用重点实验室 天津300072
  • 出版日期:2018-11-13 发布日期:2018-11-13
  • 基金资助:
    本文受国家自然科学基金项目(61170177),国家重点基础研究发展计划项目(2013CB32930X)资助

Genetic Algorithm Based K-Medoids Clustering within MapReduce Framework

LAI Xiang-yang, GONG Xiu-jun and HAN Lai-ming   

  • Online:2018-11-13 Published:2018-11-13

摘要: 由互联网时代快速发展而产生的海量数据给传统聚类方法带来了巨大挑战,如何改进聚类算法从而获取有效信息成为当前的研究热点。K-Medoids是一种常见的基于划分的聚类算法,其优点是可以有效处理孤立、噪声点,但面临着初始中心敏感、容易陷入局部最优值、处理大数据时的CPU和内存瓶颈等问题。为解决上述问题,提出了一种MapReduce架构下基于遗传算法的K-Medoids聚类。利用遗传算法的种群进化特点改进K-Medoids算法的初始中心敏感的问题,在此基础上,利用MapReduce并行遗传K-Medoids算法提高算法效率。通过带标签的数据集进行实验的结果表明,运行在Hadoop集群上的基于MapReduce和遗传算法的K-Medoids算法能有效提高聚类的质量和效率。

关键词: 海量数据,K-Medoids,MapReduce,遗传算法,聚类效率

Abstract: Huge volumes of data are increasing exponentially with the rapid development of Internet,which poses signifi-cant challenges to traditional clustering technologies.Thus,improving the accuracy and computing performance of clustering has become a research hotspot.As one of the partition-based clustering algorithms,K-Medoids can effectively deal with the problems with isolate and noise points.However,it also suffers from problems such as sensitive to initial centers,easily falling into local optimum,CPU and memory bottlenecks with big data sets.We proposed a genetic algorithm based K-Medoids clustering under MapReduce framework.The algorithm solves the center sensitivity problem of the K-Medoids by using the genetic algorithm.Also,it is built on the MapReduce framework to boost the efficiency both for K-Medoids and the genetic algorithm.The experiments demonstrate that the proposed algorithm can effectively improve the quality and efficiency of clustering.

Key words: Big-data,K-medoids,MapReduce,Genetic algorithms,Clustering efficiency

[1] EVERSTINE G C,HENDERSON F M.Coupled finite element/boundary approach for fluid-structure interaction[J].Journal of the Acoustical Society of America,1990,87(5):1938-1947.
[2] JEANS R,MATHEWS I C.A unique coupled boundary element/finite element method for the elastoacoustic analysis of fluid-filled thin shells [J].Journal of the Acoustical Society of America,1993,94(6):3473-3479.
[3] SAAD Y.Iterative Methods for Sparse Linear Systems(2nd Edition)[M].SIAM,2003:171-194.
[4] NACHTIGAL N M,REICHEL L,TREFETHENK L N .A hybrid GMRES algorithm for nonsysmetric linear systems [J].Siam Journal on Matrix Analysis and Applications, 2006,13(8):796-825 .
[5] AN H B,BAI Z Z.A globally convergent Newton-GMRES me-thod [J].Mathematica Numerica Sinica,2005,27(2):151-174.
[6] NIU Q,LU L Z,WANG R R.A Modified Gmres Method for Solving Large Nonsymmetric Linear Systems[J].Numerical Mathematics A Journal of Chinese Universities,2005,27(S1):193-199.
[7] 朱伯芳.有限单元法原理与应用(第三版)[M].北京:中国水利水电出版社,2009:192-224.
[8] BAI M R.Application of BEM(boundary element method)-based acoustic holography to radiation analysis of sound source with arbitrarily shaped geometries [J].Journal of the Acoustical Society of America,1992,92(1):533-549.
[9] BAO X M,HE Z Y .Investigation on acoustic holography reconstruction of target scattering field [J].Acta Acustica,2000,25(3):254-264.(in Chinese) 暴雪梅,何祚镛.目标散射场全息重建方法研究[J].声学学报,2000,25(3):254-264.
[10] NELL C W,Gilroy L E.An improved BASIS model for the BeTSSi submarine[R].DRDC Atlantic TR,2003.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!