Computer Science ›› 2019, Vol. 46 ›› Issue (11A): 354-358, 386.

• Network & Communication •

Cost-driven Workflow Data Placement Method in Hybrid Cloud Environment

HUANG Yin-hao1,2, MA Yun3, LIN Bing2,4, YU Zhi-yong1,2, CHEN Xing1,2   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
  2. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou 350116, China
  3. School of Software, Tsinghua University, Beijing 100084, China
  4. College of Physics and Energy, Fujian Normal University, Fuzhou 350117, China
  • Online: 2019-11-10  Published: 2019-11-20

Abstract: Executing a scientific workflow in a hybrid cloud generates a large volume of data transmission across data centers, which results in substantial propagation delay time and cost. To place scientific workflow data reasonably in a hybrid cloud environment, this work takes into account the respective advantages of the public cloud and the private cloud and optimizes the cost of data placement. A data placement strategy based on genetic algorithm particle swarm optimization (GAPSO) is proposed. It considers the different characteristics of public and private cloud data centers, such as capacity and storage cost, as well as the influence of the propagation delay time constraint on transmission cost, and it combines the advantages of the genetic algorithm and the particle swarm optimization algorithm to generate a data placement strategy for scientific workflows. The experimental results show that the GAPSO-based data placement strategy can effectively reduce the data placement cost of scientific workflows in a hybrid cloud.
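
The general idea of combining genetic operators with a particle swarm search can be illustrated with a minimal sketch. This is not the paper's GAPSO algorithm, whose details the abstract does not give. It assumes a discrete encoding (one data center index per dataset), a hypothetical cost model (storage cost plus a flat cross-center transfer price, with capacity as a hard constraint), and it omits the propagation delay time constraint. All function names, parameters, and numbers below are assumptions made for illustration.

```python
import random


def placement_cost(placement, sizes, storage_price, capacity, transfers, transfer_price):
    """Cost of one placement (list: dataset index -> data center index).

    Hypothetical cost model: storage cost plus cross-center transfer cost.
    Returns infinity when any data center's capacity is exceeded.
    """
    used = [0.0] * len(capacity)
    for d, dc in enumerate(placement):
        used[dc] += sizes[d]
    if any(u > c for u, c in zip(used, capacity)):
        return float("inf")
    cost = sum(sizes[d] * storage_price[dc] for d, dc in enumerate(placement))
    for a, b, amount in transfers:
        if placement[a] != placement[b]:  # data moves only between different centers
            cost += amount * transfer_price
    return cost


def crossover(particle, guide, rng):
    """GA-style two-point crossover: copy a random segment of `guide` into `particle`."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]


def mutate(particle, n_centers, rng):
    """GA-style mutation: reassign one randomly chosen dataset."""
    child = list(particle)
    child[rng.randrange(len(child))] = rng.randrange(n_centers)
    return child


def gapso(cost_fn, n_datasets, n_centers, swarm=20, iters=100, seed=0):
    """PSO-like loop whose position update uses GA operators (illustrative only)."""
    rng = random.Random(seed)
    particles = [[rng.randrange(n_centers) for _ in range(n_datasets)]
                 for _ in range(swarm)]
    pbest = list(particles)
    gbest = min(pbest, key=cost_fn)
    for _ in range(iters):
        for k in range(swarm):
            p = mutate(particles[k], n_centers, rng)  # plays the role of inertia
            p = crossover(p, pbest[k], rng)           # pull toward personal best
            p = crossover(p, gbest, rng)              # pull toward global best
            particles[k] = p
            if cost_fn(p) < cost_fn(pbest[k]):
                pbest[k] = p
        gbest = min(pbest + [gbest], key=cost_fn)
    return gbest, cost_fn(gbest)


if __name__ == "__main__":
    # Toy instance: center 0 is a small free private cloud, center 1 a paid public cloud.
    sizes = [4.0, 2.0, 3.0]
    cost_fn = lambda p: placement_cost(
        p, sizes, storage_price=[0.0, 0.1], capacity=[5.0, float("inf")],
        transfers=[(0, 1, 2.0), (1, 2, 1.0)], transfer_price=0.5)
    print(gapso(cost_fn, n_datasets=3, n_centers=2))
```

Replacing the continuous velocity update of standard PSO with crossover and mutation keeps every particle a valid discrete placement, which is one common motivation for this kind of hybrid.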

Key words: Hybrid cloud, Data placement, Propagation delay time constraint, Cost-driven

CLC Number: TP338