计算机科学 ›› 2022, Vol. 49 ›› Issue (3): 11-22.doi: 10.11896/jsjkx.210900117
田冰川, 田臣, 周宇航, 陈贵海, 窦万春
TIAN Bing-chuan, TIAN Chen, ZHOU Yu-hang, CHEN Gui-hai, DOU Wan-chun
摘要: 大数据分析系统的用户希望任务的执行时间尽可能短。然而,在任务执行期间,网络与计算时刻都可能成为阻碍任务执行的资源瓶颈。通过对大数据分析系统的观察与分析,得出如下结论:1)根据当前资源瓶颈的不同,数据并行框架应当在多种工作模式之间切换;2)子任务的调度应当充分考虑将来可能到达的新任务,而不能仅考虑当前已经提交的任务。基于上述观察,设计并实现了全新的任务调度系统Duopoly,其由感知计算资源的网络调度器cans与感知网络资源的子任务调度器nats两部分组成。通过小规模物理集群与大规模仿真实验对Duopoly的效果进行评估,实验结果表明,与现有工作相比,Duopoly可以将平均任务完成时间缩短37.30%~76.16%。
中图分类号:
[1]OUSTERHOUT K,WENDELL P,ZAHARIA M,et al.Spar-row:distributed,low latency scheduling [C]//Proceedings of the 24th Symposium on Operating Systems Principles(SOSP 2013).New York:ACM,2013:69-84. [2]SHINNAR A,CUNNINGHAM D,SARASWAT V,et al.M3r:increased performance for in-memory hadoop jobs[J].Procee-dings of the VLDB Endowment,2012,5(12):1736-1747. [3]ZAHARIA M,CHOWDHURY M,DAS T,et al.Resilient distributed datasets:A fault-tolerant abstraction for in-memory cluster computing [C]//Proceedings of the 9th USENIX Confe-rence on Networked Systems Design and Implementation(NSDI 2012).Berkeley:USENIX Association,2012. [4]TRIVEDI A,STUEDI P,PFEFFERLE J,et al.On the [ir] relevance of network performance for data processing [C]//Proceedings of the 8th USENIX Conference on Hot Topics in Cloud Computing(HotCloud 2016).Berkeley:USENIX Association,2016:126-131. [5]OUSTERHOUT K,CANEL C,RATNASAMY S,et al.Mono-tasks:Architecting for performance clarity in data analytics frameworks [C]//Proceedings of the 26th Symposium on Ope-rating Systems Principles(SOSP 2017).New York:ACM,2017:184-200. [6]OUSTERHOUT K,RASTI R,RATNASAMY S,et al.Making sense of performance in data analytics frameworks [C]//Proceedings of the 12th USENIX Conference on Networked Systems Design and Implementation(NSDI 2015).Berkeley:USENIX Association,2015:293-307. [7]CHOWDHURY M,ZAHARIA M,MA J,et al.Managing data transfers in computer clusters with orchestra [C]//Proceedings of the ACM SIGCOMM 2011 Conference(SIGCOMM 2011).New York:ACM,2011:98-109. [8]DOGAR F R,KARAGIANNIS T,BALLANI H,et al.Decentralized task-aware scheduling for data center networks[J].ACM SIGCOMM Computer Communication Review,2014,44(4):431-442. [9]CHOWDHURY M,ZHONG Y,STOICA I.Efficient coflowscheduling with varys[J].ACM SIGCOMM Computer Communication Review,2014,44(4):443-454. [10]CHOWDHURY M,STOICA I.Efficient coflow schedulingwithout prior knowledge[J].ACM SIGCOMM Computer Communication Review,2015,45(4):393-406. [11]ZAHARIA M,BORTHAKUR D,SARMA J S,et al.Delayscheduling:a simple technique for achieving locality and fairness in cluster scheduling [C]//Proceedings of the 5th European Conference on Computer Systems(EuroSys 2010).New York:ACM,2010:265-278. [12]ISARD M,PRABHAKARAN V,CURREY J,et al.Quincy:fair scheduling for distributed computing clusters [C]//Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles(SOSP 2009).New York:ACM,2009:261-276. [13]ANANTHANARAYANAN G,KANDULA S,GREENBERGA G,et al.Reining in the outliers in map-reduce clusters using mantri [C]//Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation(OSDI 2010).Berkeley:USENIX Association,2010:265-278. [14]AHMAD F,CHAKRADHAR S T,RAGHUNATHAN A,et al.Shuffle-aware scheduling in multi-tenant map-reduce clusters [C]//Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference(USENIX ATC 2014).Berkeley:USENIX Association,2014:1-12. [15]JALAPARTI V,BODIK P,MENACHE I,et al.Network-aware scheduling for data-parallel jobs:Plan when you can[J].ACM SIGCOMM Computer Communication Review,2015,45(4):407-420. [16]CHEN Y,ALSPAUGH S,KATZ R.Interactive analytical processing in big data systems:A cross-industry study of mapreduce workloads[J].Proceedings of the VLDB Endowment,2012,5(12):1802-1813. [17]TAN J,CHIN A,HU Z Z.Dynmr:Dynamic mapreduce with reduce task interleaving and maptask backfilling [C]//Procee-dings of the Ninth European Conference on Computer Systems(EuroSys 2014).New York:ACM,2014:1-14. [18]KAVULYA S,TAN J,GANDHI R,et al.An analysis of traces from a production mapreduce cluster [C]//Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster,Cloud and Grid Computing(CCGRID 2010).Washington:IEEE,2010:94-103. [19]CHOWDHURY M,KANDULA S,STOICA I.Leveraging endpoint flexibility in data-intensive clusters[J].ACM SIGCOMM Computer Communication Review,2013,43(4):231-242. [20]CHEN Y,GANAPATHI A,GRIFFITH R,et al.The case forevaluating mapreduce performance using workload suites [C]//Proceedings of the 2011 IEEE 19th Annual International Symposium on Modelling,Analysis,and Simulation of Computer and Telecommunication Systems(MASCOTS 2011).Washington:IEEE,2011:390-399. [21]COSTA P,DONNELLY A,ROWSTRON A,et al.Camdoop:Exploiting in-network aggregation for big data applications [C]//Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation(NSDI 2012).Berkeley:USENIX Association,2012. [22]BAI W,CHEN L,CHEN K,et al.Information-agnostic flowscheduling for commodity data centers[C]//Proceedings of the 12th USENIX Conference on Networked Systems Design and Implementation(NSDI 2015).Berkeley:USENIX Association,2015:455-468. [23]PENG Y,CHEN K,WANG G,et al.Hadoopwatch:A first step towards comprehensive traffic forecasting in cloud computing [C]//The 33rd Annual IEEE International Conference on Computer Communications(INFOCOM 2014).Piscataway:IEEE,2014:19-27. [24]WANG H,CHEN L,CHEN K,et al.Flowprophet:Generic andaccurate traffic prediction for data-parallel cluster computing [C]//International Conference on Distributed Computing Systems(ICDCS 2015).Piscataway:IEEE,2015:349-358. [25]MUNIR A,HE T,RAGHAVENDRA R,et al.Network scheduling aware task placement in datacenters [C]//Proceedings of the 12th International on Conference on Emerging Networking Experiments and Technologies(CoNEXT 2016).New York:ACM,2016:221-235. [26]CHANG H,KODIALAM M,KOMPELLA R R,et al.Scheduling in mapreduce-like systems for fast completion time [C]//The 30th IEEE International Conference on Computer Communications(INFOCOM 2011).Piscataway:IEEE,2011:3074-3082. [27]CHEN F,KODIALAM M,LAKSHMAN T.Joint scheduling of processing and shuffle phases in mapreduce systems [C]//The 31th IEEE International Conference on Computer Communications(INFOCOM 2012).Piscataway:IEEE,2012:1143-1151. |
[1] | 沈彪, 沈立炜, 李弋. 空间众包任务的路径动态调度方法 Dynamic Task Scheduling Method for Space Crowdsourcing 计算机科学, 2022, 49(2): 231-240. https://doi.org/10.11896/jsjkx.210400249 |
[2] | 谭双杰, 林宝军, 刘迎春, 赵帅. 基于机器学习的分布式星载RTs系统负载调度算法 Load Scheduling Algorithm for Distributed On-board RTs System Based on Machine Learning 计算机科学, 2022, 49(2): 336-341. https://doi.org/10.11896/jsjkx.201200126 |
[3] | 王政, 姜春茂. 一种基于三支决策的云任务调度优化算法 Cloud Task Scheduling Algorithm Based on Three-way Decisions 计算机科学, 2021, 48(6A): 420-426. https://doi.org/10.11896/jsjkx.201000023 |
[4] | 蔡凌峰, 魏祥麟, 邢长友, 邹霞, 张国敏. 故障场景下的边缘计算DAG任务重调度方法 Failure-resilient DAG Task Rescheduling in Edge Computing 计算机科学, 2021, 48(10): 334-342. https://doi.org/10.11896/jsjkx.210300304 |
[5] | 张龙信, 周立前, 文鸿, 肖满生, 邓晓军. 基于异构云计算的成本约束下的工作流能量高效调度算法 Energy Efficient Scheduling Algorithm of Workflows with Cost Constraint in Heterogeneous Cloud Computing Systems 计算机科学, 2020, 47(8): 112-118. https://doi.org/10.11896/jsjkx.200300038 |
[6] | 孙敏, 陈中雄, 叶侨楠. 云环境下基于HEDSM的工作流调度策略 Workflow Scheduling Strategy Based on HEDSM Under Cloud Environment 计算机科学, 2020, 47(6): 252-259. https://doi.org/10.11896/jsjkx.190400047 |
[7] | 胡俊钦, 张佳俊, 黄引豪, 陈星, 林兵. 边缘环境下DNN应用的计算迁移调度技术 Computation Offloading Scheduling Technology for DNN Applications in Edge Environment 计算机科学, 2020, 47(10): 247-255. https://doi.org/10.11896/jsjkx.190900106 |
[8] | 张洲, 黄国锐, 金培权. 基于Storm的任务调度:现状与研究展望 Task Scheduling on Storm:Current Situations and Research Prospects 计算机科学, 2019, 46(9): 28-35. https://doi.org/10.11896/j.issn.1002-137X.2019.09.004 |
[9] | 曾金晶, 张建山, 林兵, 张文德. 基于无线城域网的微云负载均衡算法 Cloudlet Workload Balancing Algorithm in Wireless Metropolitan Area Networks 计算机科学, 2019, 46(8): 163-170. https://doi.org/10.11896/j.issn.1002-137X.2019.08.027 |
[10] | 张建山, 林兵, 卢宇, 许芙蓉. 基于无线城域网的微云部署及用户任务调度 Cloudlet Placement and User Task Scheduling Based on Wireless Metropolitan Area Networks 计算机科学, 2019, 46(6): 128-134. https://doi.org/10.11896/j.issn.1002-137X.2019.06.019 |
[11] | 叶符明, 李雯婷, 王颖. MC2ETS:移动云计算中一种能效任务调度算法 MC2ETS:An Energy-efficient Tasks Scheduling Algorithm in Mobile Cloud Computing 计算机科学, 2019, 46(6): 135-142. https://doi.org/10.11896/j.issn.1002-137X.2019.06.020 |
[12] | 马小晋,饶国宾,许华虎. 云计算中任务调度研究的调查 Research on Task Scheduling in Cloud Computing 计算机科学, 2019, 46(3): 1-8. https://doi.org/10.11896/j.issn.1002-137X.2019.03.001 |
[13] | 王卓昊, 杨冬菊, 徐晨阳. 基于ISE算法的分布式ETL任务调度策略研究 Research on Distributed ETL Tasks Scheduling Strategy Based on ISE Algorithm 计算机科学, 2019, 46(12): 1-7. https://doi.org/10.11896/jsjkx.190100023 |
[14] | 徐俊, 项倩红, 肖刚. 基于改进混合蛙跳算法的云工作流负载均衡调度优化 Load Balancing Scheduling Optimization of Cloud Workflow Using Improved Shuffled Frog Leaping Algorithm 计算机科学, 2019, 46(11): 315-322. https://doi.org/10.11896/jsjkx.181001866 |
[15] | 吴修国, 刘翠. 云存储系统中最小开销的数据副本布局转换策略 Data Replicas Distribution Transition Strategy in Cloud Storage System 计算机科学, 2019, 46(10): 202-208. https://doi.org/10.11896/jsjkx.180901623 |
|