Computer Science, 2018, Vol. 45, Issue (6): 67-71. doi: 10.11896/j.issn.1002-137X.2018.06.011

• The 14th National Conference on Web Information Systems and Applications •

Monitoring and Dispatching Service for Heterogeneous Big Data Computing Frameworks

HU Ya-peng, DING Wei-long, WANG Gui-ling

  1. Data Engineering Institute, North China University of Technology, Beijing 100144, China;
    Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing 100144, China
  • Received: 2017-03-11 Online: 2018-06-15 Published: 2018-07-24
  • About the authors: HU Ya-peng (born 1991), male, master's student, whose main research interest is distributed systems, E-mail: huyapeng200828@126.com; DING Wei-long (born 1983), male, Ph.D., lecturer, CCF member, whose main research interests are real-time data processing and distributed systems, E-mail: dingweilong@ncut.edu.cn (corresponding author); WANG Gui-ling (born 1978), female, Ph.D., associate research professor, CCF member, whose main research interests are service computing, service-oriented data integration, and large-scale stream data processing and integration
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61702014) and the Beijing Natural Science Foundation (4172018)

Abstract: Different types of big data computing frameworks each come with their own dedicated management methods. In heterogeneous environments, traditional monitoring and scheduling services are limited because they cannot obtain the overall running state of the cluster, and they cannot combine runtime resource states at multiple granularities to schedule different computing jobs. This not only wastes available cluster resources but also increases the waiting time of computing jobs. To address these two problems, this paper presents an integrated monitoring and dynamic scheduling management service for heterogeneous big data computing frameworks. The service automatically adapts to and monitors multiple types of big data computing frameworks and computing jobs, and provides integrated scheduling for jobs of multiple types. A prototype system was implemented for two computing frameworks, Hadoop and Storm, and experiments were conducted. The experimental results show that the service reduces the complexity of manual operation across big data computing frameworks in heterogeneous environments and improves job scheduling efficiency.

Key words: Cluster monitoring, Job scheduling, Job submission, Management service

CLC Number:

  • TP315