Computer Science ›› 2019, Vol. 46 ›› Issue (11A): 354-358.

• Network & Communication •

Cost-driven Workflow Data Placement Method in Hybrid Cloud Environment

HUANG Yin-hao1,2, MA Yun3, LIN Bing2,4, YU Zhi-yong1,2, CHEN Xing1,2   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
  2. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou 350116, China
  3. School of Software, Tsinghua University, Beijing 100084, China
  4. College of Physics and Energy, Fujian Normal University, Fuzhou 350117, China
  • Online: 2019-11-10 • Published: 2019-11-20

Abstract: Executing scientific workflows in a hybrid cloud generates a large amount of data transmission across data centers, which results in substantial propagation delay and cost. To place scientific workflow data reasonably in a hybrid cloud environment, this paper takes into account the respective advantages of public and private clouds and optimizes the cost of data placement. A data placement strategy based on genetic algorithm particle swarm optimization (GAPSO) is proposed. It considers the differing characteristics of public and private cloud data centers, such as capacity and storage cost, as well as the influence of the propagation delay time constraint on transmission costs, and it combines the advantages of the genetic algorithm and the particle swarm optimization algorithm to generate a data placement strategy for scientific workflows. Experimental results show that the GAPSO-based data placement strategy can effectively reduce the data placement cost of scientific workflows in a hybrid cloud.

Key words: Cost-driven, Data placement, Hybrid cloud, Propagation delay time constraint

CLC Number: TP338
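
The abstract does not give the cost model or the update rules, so the following is only a minimal sketch of how a GAPSO-style search over data placements might look, not the authors' method. It assumes a common discrete formulation in which a particle is a list assigning each dataset to a data center, the inertia term is replaced by a mutation, and the cognitive and social terms are replaced by crossover with the personal best and global best. The cost function, the rule that each task runs at whichever center makes gathering its inputs cheapest, and every name and parameter (`placement_cost`, `gapso`, `storage_price`, `transfer_price`, and so on) are hypothetical. The propagation delay time constraint mentioned in the abstract is not modeled here.

```python
import random


def placement_cost(placement, sizes, tasks, capacity, storage_price, transfer_price):
    """Cost of one placement; float('inf') if any capacity limit is exceeded.

    placement[d] is the index of the data center that stores dataset d.
    All cost terms are illustrative assumptions, not the paper's model.
    """
    used = [0.0] * len(capacity)
    for d, c in enumerate(placement):
        used[c] += sizes[d]
    if any(u > cap for u, cap in zip(used, capacity)):
        return float("inf")
    # Storage cost (assumed zero for private centers via storage_price).
    cost = sum(sizes[d] * storage_price[c] for d, c in enumerate(placement))
    for inputs in tasks:
        # Assumption: each task runs where gathering its inputs is cheapest.
        cost += min(
            sum(sizes[d] * transfer_price[placement[d]][c] for d in inputs)
            for c in range(len(capacity))
        )
    return cost


def crossover(particle, guide, rng):
    """Copy a random contiguous segment of `guide` into a copy of `particle`."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]


def mutate(particle, n_centers, rng):
    """Reassign one randomly chosen dataset to a random data center."""
    child = particle[:]
    child[rng.randrange(len(child))] = rng.randrange(n_centers)
    return child


def gapso(sizes, tasks, capacity, storage_price, transfer_price,
          swarm=20, iters=100, seed=0):
    """Heuristic search; returns (best placement found, its cost)."""
    rng = random.Random(seed)
    n, m = len(sizes), len(capacity)

    def cost(p):
        return placement_cost(p, sizes, tasks, capacity, storage_price, transfer_price)

    particles = [[rng.randrange(m) for _ in range(n)] for _ in range(swarm)]
    pbest = [p[:] for p in particles]
    gbest = min(pbest, key=cost)[:]
    for _ in range(iters):
        for k in range(swarm):
            p = mutate(particles[k], m, rng)  # stands in for the inertia term
            p = crossover(p, pbest[k], rng)   # stands in for the cognitive term
            p = crossover(p, gbest, rng)      # stands in for the social term
            particles[k] = p
            if cost(p) < cost(pbest[k]):
                pbest[k] = p[:]
                if cost(p) < cost(gbest):
                    gbest = p[:]
    return gbest, cost(gbest)
```

In this sketch a private data center would be given a finite capacity and zero storage price, while a public data center would be given `float("inf")` capacity and a positive storage price. Because the search is a heuristic, the returned placement is not guaranteed to be optimal.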