Computer Science ›› 2015, Vol. 42 ›› Issue (11): 178-183. doi: 10.11896/j.issn.1002-137X.2015.11.037


Prediction Model of Energy Consumption for MapReduce Based on Job Running History Logs

LIAO Bin, ZHANG Tao, YU Jiong and SUN Hua   

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: High energy consumption in big data processing is a pressing problem, especially given the explosive growth of data. An energy consumption model is the basis for research on improving the energy efficiency of MapReduce, yet computing a MapReduce job's energy consumption with traditional models is challenging. After studying the cluster structure, the decomposition of jobs into tasks, and the mechanism that maps tasks to slots, we propose a prediction model of energy consumption for MapReduce based on job running history logs. By analyzing the historical run information of different jobs, we obtain the computing power and energy consumption characteristics of DataNodes running different tasks, and then predict the energy consumption of a MapReduce job before it executes. The experimental results demonstrate the feasibility of the energy prediction model, and its prediction accuracy can be improved by adjusting the correction factor.

Key words: Green computing, MapReduce, Energy consumption modeling, Prediction model

