Computer Science ›› 2014, Vol. 41 ›› Issue (9): 75-79. doi: 10.11896/j.issn.1002-137X.2014.09.013


Module Based Big Data Analysis Platform

ZHAO Wei, LIU Jie and YE Dan

  • Online:2018-11-14 Published:2018-11-14

Abstract: As data volumes grow, data analysis tools that run only on stand-alone computers are no longer sufficient. To address this problem, we designed and implemented Haflow, a module-based big data analysis platform. Haflow defines an analysis business model and an extensible module interface that supports the integration of heterogeneous tools. Users submit their analysis flows; the system interprets them and commits the resulting jobs to Hadoop. Haflow is an extensible, distributed, service-oriented big data platform that supports heterogeneous tools. The goal of the platform is twofold. First, it encapsulates the routine work that has nothing to do with the business logic itself, which speeds up the development of analysis applications. Second, by submitting jobs to Hadoop, which can execute them concurrently, it reduces the mean running time of analysis applications.
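
The flow described in the abstract (modules behind a common interface, a user-submitted flow, interpretation, then submission for execution) can be illustrated with a minimal sketch. This is not Haflow's actual API: the names Module, FunctionModule, Flow, interpret and submit are hypothetical, the modules are plain Python callables standing in for wrapped heterogeneous tools, and submit runs everything locally instead of committing jobs to Hadoop.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Protocol


class Module(Protocol):
    """Hypothetical module interface: every wrapped tool exposes a name and run()."""
    name: str

    def run(self, inputs: List[str]) -> str: ...


@dataclass
class FunctionModule:
    """Wraps a plain callable as a module (stand-in for an external analysis tool)."""
    name: str
    fn: Callable[[List[str]], str]

    def run(self, inputs: List[str]) -> str:
        return self.fn(inputs)


@dataclass
class Flow:
    """An analysis flow: modules plus, for each module, the modules it depends on."""
    modules: Dict[str, Module] = field(default_factory=dict)
    deps: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, module: Module, after: List[str] = ()) -> "Flow":
        self.modules[module.name] = module
        self.deps[module.name] = list(after)
        return self


def interpret(flow: Flow) -> List[str]:
    """Turn the flow into an execution plan that respects the dependencies."""
    return list(TopologicalSorter(flow.deps).static_order())


def submit(flow: Flow, plan: List[str]) -> Dict[str, str]:
    """Assumption: runs modules locally in plan order; a real platform would
    hand each step to the cluster scheduler instead."""
    results: Dict[str, str] = {}
    for name in plan:
        inputs = [results[dep] for dep in flow.deps[name]]
        results[name] = flow.modules[name].run(inputs)
    return results


# A three-step flow: load raw records, drop empty ones, count what remains.
flow = (
    Flow()
    .add(FunctionModule("load", lambda _: "a,b,,c"))
    .add(FunctionModule("clean",
                        lambda ins: ",".join(x for x in ins[0].split(",") if x)),
         after=["load"])
    .add(FunctionModule("count", lambda ins: str(len(ins[0].split(",")))),
         after=["clean"])
)
plan = interpret(flow)
results = submit(flow, plan)
print(plan, results["count"])
```

Keeping the module interface to a single run method is what lets tools of different origins be mixed in one flow in this sketch; the interpreter only needs the dependency graph, not any knowledge of what each module does.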

Key words: Big data, Data analysis, Data mining, Module, Distributed, Service, Platform

