Computer Science (计算机科学), 2015, Vol. 42, Issue 6: 28-31, 45. doi: 10.11896/j.issn.1002-137X.2015.06.006

• The 10th Joint Conference on Harmonious Human Machine Environment •

A Scheduling Algorithm Based on Job Type and Deadline in Hadoop

LI Zhao, TENG Fei, LI Tian-rui, YANG Hao

  1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031 (all four authors)
  • Publication date: 2018-11-14 Release date: 2018-11-14
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61202043,7)

Scheduler Algorithm Based on Type Specific and Deadline in Hadoop

LI Zhao, TENG Fei, LI Tian-rui and YANG Hao   

  • Online: 2018-11-14 Published: 2018-11-14

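Abstract (translated from the Chinese): Hadoop is an open-source, reliable distributed computing framework, and MapReduce is a programming model for processing very large data sets. Because Hadoop's built-in schedulers do not handle jobs of different types with deadlines well, a job scheduling algorithm based on job type and deadline is proposed. Jobs are classified as CPU-intensive or I/O-intensive, and priorities are set according to deadlines to schedule them. Experimental results show that the algorithm makes full use of the cluster's CPU and disk I/O while meeting the jobs' deadline requirements. The algorithm performs best when the deadlines within the same time period are close to one another, and its efficiency is lowest when the deadlines of all jobs in one queue are shorter than those of the jobs in the other queue.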

Keywords (translated from the Chinese): scheduling algorithm, deadline, job type, MapReduce, Hadoop

Abstract: Hadoop is open-source software for reliable, scalable, distributed computing. MapReduce is a programming model and an associated implementation for processing large data sets. Because the built-in Hadoop schedulers cannot handle jobs that differ in type and carry deadlines, we propose a scheduling algorithm based on job type and deadline. We classify jobs as CPU-bound or I/O-bound and assign priorities according to their deadlines. Experimental results show that the proposed algorithm not only makes full use of the cluster's CPU and I/O resources but also meets the jobs' deadlines. The algorithm performs best when the deadlines within a given period are nearly the same, and its efficiency is lowest when the deadlines of the jobs in one queue are all shorter than those of the jobs in the other queue.

Key words: Scheduler algorithm, Deadline, Job-type, MapReduce, Hadoop
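
The abstract describes the scheduling idea only at a high level: jobs are split into CPU-bound and I/O-bound, and priority follows the deadline. The following is a minimal Python sketch of that idea under stated assumptions, not the authors' implementation. The names `Job` and `TypeDeadlineScheduler`, the two-queue layout, the earliest-deadline-first ordering inside each queue, and the alternation between the queues are all hypothetical choices made for illustration; the abstract does not specify how the two queues are interleaved, and nothing here uses the Hadoop API.

```python
import heapq
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(order=True)
class Job:
    # Only the deadline takes part in ordering, so a heap of jobs
    # always yields the job with the earliest deadline first.
    deadline: float
    name: str = field(compare=False)
    job_type: str = field(compare=False)  # "cpu" or "io" (assumed labels)


class TypeDeadlineScheduler:
    """Hypothetical sketch: one deadline-ordered queue per job type."""

    def __init__(self) -> None:
        self.queues: Dict[str, List[Job]] = {"cpu": [], "io": []}
        self._turn = "cpu"  # which queue is tried first on the next pick

    def submit(self, job: Job) -> None:
        if job.job_type not in self.queues:
            raise ValueError(f"unknown job type: {job.job_type}")
        heapq.heappush(self.queues[job.job_type], job)

    def next_job(self) -> Optional[Job]:
        # Assumption: alternate between the two types so CPU and disk I/O
        # are both kept busy; fall back to the other queue if one is empty.
        first = self._turn
        second = "io" if first == "cpu" else "cpu"
        for kind in (first, second):
            if self.queues[kind]:
                self._turn = "io" if kind == "cpu" else "cpu"
                return heapq.heappop(self.queues[kind])
        return None


if __name__ == "__main__":
    sched = TypeDeadlineScheduler()
    sched.submit(Job(30.0, "a", "cpu"))
    sched.submit(Job(10.0, "b", "cpu"))
    sched.submit(Job(20.0, "c", "io"))
    sched.submit(Job(5.0, "d", "io"))
    while (job := sched.next_job()) is not None:
        print(job.name, job.job_type, job.deadline)
```

In this sketch the two types take turns, and within each type the job with the earliest deadline runs first. A real scheduler would also need to estimate job completion times and check them against the deadlines, which the sketch leaves out.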

