Computer Science ›› 2014, Vol. 41 ›› Issue (Z6): 42-46.
唐一韬,黄晶,肖球
TANG Yi-tao,HUANG Jing and XIAO Qiu
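Abstract: Hadoop has become a foundational platform for cloud computing research, and MapReduce is its computing model for distributed processing of big data. To address problems of data distribution, data locality, and job execution flow for MapReduce on heterogeneous clusters, this paper proposes a DAG-based MapReduce scheduling algorithm. The nodes in the cluster are partitioned by computing capability, each MapReduce job is converted into a DAG model, and the computation of the upward rank value is improved so that it is more accurate on heterogeneous clusters and yields a more reasonable task priority ordering. Taking node computing capability, data locality, and cluster utilization together, the algorithm selects suitable data nodes to assign and execute tasks, which reduces the completion time of the current task. Experiments show that the algorithm distributes data sensibly, effectively improves data locality, reduces communication overhead, and shortens the schedule length of the whole job set, thereby improving cluster utilization.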
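The abstract does not give the improved upward rank formula or the node selection rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the classic HEFT-style upward rank (mean execution cost over nodes plus the most expensive path to an exit task) and a simple earliest-finish-time node choice in which non-local nodes pay a fixed data transfer penalty. All names (`upward_ranks`, `pick_node`, `succ`, `cost`, `comm`, `ready_at`, `local_nodes`, `transfer_penalty`) are hypothetical and chosen for illustration.

```python
from functools import lru_cache


def upward_ranks(succ, cost, comm):
    """Classic HEFT-style upward rank (an assumption, not the paper's improved formula).

    succ: task -> iterable of successor tasks in the job DAG
    cost: task -> {node: execution time of the task on that node}
    comm: (task, successor) -> communication time between the two tasks
    """
    @lru_cache(maxsize=None)
    def rank(task):
        # Average execution time across the heterogeneous nodes.
        avg = sum(cost[task].values()) / len(cost[task])
        # Longest remaining path through any successor; 0 for exit tasks.
        tail = max(
            (comm.get((task, s), 0.0) + rank(s) for s in succ.get(task, ())),
            default=0.0,
        )
        return avg + tail

    return {task: rank(task) for task in cost}


def pick_node(task, cost, ready_at, local_nodes, transfer_penalty):
    """Pick the node with the earliest estimated finish time for the task.

    Nodes that do not hold the task's input data pay transfer_penalty,
    which is a hypothetical stand-in for the cost of losing data locality.
    """
    def finish(node):
        is_local = node in local_nodes.get(task, ())
        penalty = 0.0 if is_local else transfer_penalty
        return ready_at[node] + penalty + cost[task][node]

    # Sorting makes tie-breaking deterministic (alphabetical by node name).
    return min(sorted(cost[task]), key=finish)


# A toy job: two map tasks feeding one reduce task, on one fast and one slow node.
succ = {"map1": ["reduce"], "map2": ["reduce"], "reduce": []}
cost = {
    "map1": {"fast": 2.0, "slow": 4.0},
    "map2": {"fast": 4.0, "slow": 8.0},
    "reduce": {"fast": 1.0, "slow": 3.0},
}
comm = {("map1", "reduce"): 1.0, ("map2", "reduce"): 2.0}

ranks = upward_ranks(succ, cost, comm)
# Tasks are scheduled in decreasing order of upward rank.
order = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy job, `order` places `map2` first, then `map1`, then `reduce`. Calling `pick_node("map1", cost, {"fast": 0.0, "slow": 0.0}, {"map1": {"slow"}}, 3.0)` selects the slow node, because it already holds the input data and the assumed transfer penalty outweighs the faster execution elsewhere. With a smaller penalty the fast node wins, which is the trade-off between computing capability and data locality that the abstract describes.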