Computer Science ›› 2017, Vol. 44 ›› Issue (Z6): 567-570. doi: 10.11896/j.issn.1002-137X.2017.6A.127

• Comprehensive, Interdisciplinary and Applied Topics •

Design of Local Scheduling Algorithm for Integrated Preemptive Scheduling Policy in Hadoop Cluster Environment

WANG Yue-feng and WANG Xi-bo

  1. School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870
  • Online: 2017-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the training fund of the Liaoning "Hundred, Thousand, Ten Thousand Talents Project" (2012921041).

Abstract: In a Hadoop cluster environment, the locality scheduling algorithm is an algorithm for improving data locality. The essence of its scheduling strategy is to improve data locality, reduce network transfer overhead, and avoid congestion. However, because Map tasks finish at different times, Reduce tasks are left waiting, which increases the average job completion time and in turn degrades system performance. This paper therefore proposes integrating a preemptive scheduling method while retaining the data locality requirement of the original algorithm. When a Reduce task is waiting, the task is suspended and its resources are released to other Map tasks; once the Map tasks have completed to a certain extent, the Reduce task is rescheduled. Based on this strategy, a locality scheduling algorithm with an integrated preemptive policy is designed. To validate the improved algorithm, the locality scheduling algorithm and the integrated preemptive locality scheduling algorithm are compared experimentally. The experimental results show that, on the same data, the average completion time of the integrated preemptive locality scheduling algorithm is markedly lower.

Key words: Data locality, Preemptive, Average job completion time
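
The suspend-and-resume policy summarized in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the names `Job`, `ReduceTask`, and `schedule_slot` are hypothetical, and the `resume_threshold` value of 0.8 is an assumption, since the abstract only says the Reduce task is rescheduled once the Map tasks have completed "to a certain extent". Data locality itself is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class ReduceTask:
    name: str
    suspended: bool = False


@dataclass
class Job:
    total_maps: int
    finished_maps: int = 0
    reduce_task: ReduceTask = field(default_factory=lambda: ReduceTask("reduce-0"))

    def map_progress(self) -> float:
        # Fraction of this job's Map tasks that have finished.
        if self.total_maps == 0:
            return 1.0
        return self.finished_maps / self.total_maps


def schedule_slot(job: Job, reduce_waiting: bool, resume_threshold: float = 0.8) -> str:
    """Decide what the slot held by the job's Reduce task should run next.

    resume_threshold is a hypothetical parameter (assumed value, not from the paper).
    Returns "map" if the slot is handed to a Map task, "reduce" otherwise.
    """
    task = job.reduce_task
    if task.suspended:
        # Reschedule the Reduce task once enough Map tasks have finished.
        if job.map_progress() >= resume_threshold:
            task.suspended = False
            return "reduce"
        return "map"
    # A waiting Reduce task is suspended and its slot is released to a Map task.
    if reduce_waiting and job.map_progress() < resume_threshold:
        task.suspended = True
        return "map"
    return "reduce"
```

For example, with 2 of 10 Map tasks finished, a waiting Reduce task is suspended and its slot goes to a Map task; once 9 of 10 have finished, the next call resumes the Reduce task.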

