Computer Science, 2019, Vol. 46, Issue 11A: 354-358, 386.

• Network and Communication •

Cost-driven Workflow Data Placement Method in Hybrid Cloud Environment

HUANG Yin-hao1,2, MA Yun3, LIN Bing2,4, YU Zhi-yong1,2, CHEN Xing1,2

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
  2. Fujian Provincial Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou 350116, China
  3. School of Software, Tsinghua University, Beijing 100084, China
  4. College of Physics and Energy, Fujian Normal University, Fuzhou 350117, China
  • Online: 2019-11-10 Published: 2019-11-20
  • Corresponding author: YU Zhi-yong (born 1982), male, Ph.D., associate professor. Main research interest: mobile social networks. E-mail: yuzhiyong@fzu.edu
  • About the authors: HUANG Yin-hao (born 1996), male, master's student, main research interest: cloud computing; MA Yun (born 1989), male, Ph.D., main research interests: mobile computing and service computing; LIN Bing (born 1986), male, Ph.D., lecturer, main research interest: cloud computing technology; CHEN Xing (born 1985), male, Ph.D., associate professor, main research interest: software self-adaptation.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61772136), the Natural Science Foundation of Fujian Province (General Program) (2019J01061386), and the Young and Middle-aged Teachers Education and Research Project of the Fujian Provincial Department of Education (JT180098).

Abstract: Executing scientific workflows in a hybrid cloud generates a large volume of data transfers across data centers, which causes substantial transmission delay and cost. To place scientific workflow data reasonably in a hybrid cloud environment, exploit the respective advantages of public and private clouds, and optimize the data placement cost, a data placement strategy based on a hybrid genetic algorithm and particle swarm optimization (GAPSO) is proposed. The method considers how the different characteristics of public and private cloud data centers (such as storage capacity and storage cost), together with the data transmission delay constraint, affect the transmission cost, and it combines the advantages of the genetic algorithm and the particle swarm optimization algorithm to generate a data placement strategy for scientific workflows. Experimental results show that the GAPSO-based data placement strategy can effectively reduce the data placement cost of scientific workflows running in a hybrid cloud.

Key words: Hybrid cloud, Data placement, Propagation delay time constraint, Cost-driven
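
The abstract names the ingredients of GAPSO (a particle swarm whose updates borrow genetic operators) but gives no formulas, so the following is only an illustrative sketch of how such a hybrid could be applied to data placement, not the authors' algorithm. Everything concrete here is an assumption: the cost model (transfer cost only, with each task run at its cheapest site), the hard capacity check, the operator choices (mutation as the inertia term, crossover with the personal and global best as the cognitive and social terms), and all names and parameter values.

```python
import random


def placement_cost(placement, sizes, capacity, tasks, unit_cost):
    """Hypothetical cost model: total transfer cost, or inf if a capacity is exceeded.

    placement[d] is the data center holding dataset d.
    unit_cost[src][dst] is the assumed cost of moving one unit from src to dst.
    """
    used = [0.0] * len(capacity)
    for d, center in enumerate(placement):
        used[center] += sizes[d]
    if any(u > cap for u, cap in zip(used, capacity)):
        return float("inf")
    total = 0.0
    for inputs in tasks:
        # Assumption: each task runs at the site that minimizes its input transfers.
        total += min(
            sum(sizes[d] * unit_cost[placement[d]][site] for d in inputs)
            for site in range(len(capacity))
        )
    return total


def mutate(particle, n_centers, rng):
    """GA mutation standing in for the PSO inertia term: move one random dataset."""
    child = list(particle)
    child[rng.randrange(len(child))] = rng.randrange(n_centers)
    return child


def crossover(particle, guide, rng):
    """Two-point crossover: copy a random segment of the guide into the particle."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]


def gapso(sizes, capacity, tasks, unit_cost, swarm=20, iters=100, seed=0):
    """Return (best placement found, its cost) for the toy model above."""
    rng = random.Random(seed)
    n, m = len(sizes), len(capacity)

    def cost(p):
        return placement_cost(p, sizes, capacity, tasks, unit_cost)

    particles = [[rng.randrange(m) for _ in range(n)] for _ in range(swarm)]
    pbest = [list(p) for p in particles]
    gbest = min(pbest, key=cost)
    for _ in range(iters):
        for k, p in enumerate(particles):
            p = mutate(p, m, rng)            # inertia
            p = crossover(p, pbest[k], rng)  # cognitive: learn from personal best
            p = crossover(p, gbest, rng)     # social: learn from global best
            particles[k] = p
            if cost(p) < cost(pbest[k]):
                pbest[k] = p
        gbest = min(pbest + [gbest], key=cost)
    return gbest, cost(gbest)


# Toy instance: center 0 is a small private cloud, center 1 a large public cloud.
sizes = [1.0, 1.0, 1.0, 1.0]
capacity = [2.0, 10.0]
tasks = [[0, 1], [2, 3]]
unit_cost = [[0.0, 1.0], [1.0, 0.0]]
best, best_cost = gapso(sizes, capacity, tasks, unit_cost)
```

Encoding a particle as one data-center index per dataset keeps every candidate a valid assignment, so only the capacity check can reject it. A fuller model would also need the storage cost and delay constraint that the abstract mentions.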

CLC Number:

  • TP338