计算机科学 ›› 2019, Vol. 46 ›› Issue (10): 202-208.doi: 10.11896/jsjkx.180901623
吴修国, 刘翠
WU Xiu-guo, LIU Cui
摘要: 副本技术是提高云存储系统中数据可靠性访问和系统容错性的常用策略。依据用户需求以及环境变化,及时对数据副本布局进行动态调整,是目前副本管理研究的重要内容之一。然而,现有研究大都以副本布局转换是自动完成的为前提,仅关注于数据副本数目与位置等副本布局方案设计,较少涉及副本布局转换的任务调度问题。事实上,副本布局转换是有关多数据中心数据副本迁移与删除操作的复杂任务调度问题,不同的任务调度策略占用的空间、时间不同,由此导致成本、效率等存在较大差异。基于此,首先给出云存储系统中面向多数据中心的数据副本布局转换任务调度模型,以及该问题的可行性分析。然后,从降低成本的角度给出最小开销的数据副本布局转换任务调度问题的定义,并基于0-1背包问题证明其是NP完全的。在此基础上,给出随机(Random)、最小传输开销优先(MTCF)、最大机会成本优先(MOCF)以及同数据最小传输成本优先(MTCFSD)等副本布局转换任务调度策略。最后,以CloudSim为仿真平台进行了模拟实验,结果表明,最小开销的数据副本布局转换策略与同类算法相比,在传输次数上减少了约60%,相对开销降低了约50%,证明了转换策略的可靠性与有效性,从而进一步提升了云存储系统的性能。
中图分类号:
[1]SUN X,LI Q Z,ZHAO P,et al.An optimized replica distribution method for peer-to-peer network[J].Chinese Journal of Computers,2014,37(6):1424-1434.(in Chinese) 孙新,李庆洲,赵璞,等.对等网络中一种优化的副本分布方法[J].计算机学报,2014,37(6):1424-1434. [2]POLATO I,RÉ R,GOLDMAN A,et al.A comprehensive view of Hadoop research-a systematic literature review[J].Journal of Network & Computer Applications,2014,46:1-25. [3]HAMROUNI T,SLIMANI S,CHARRADA F B.A survey of dynamic replication and replica selection strategies based on data mining techniques in data grids[J].Engineering Applications of Artificial Intelligence,2016,48(C):140-158. [4]WU X.Data sets replicas placements strategy from cost-effective view in the cloud[J].Scientific Programming,2016(11):1-13. [5]SAADAT N,RAHMANI A M.PDDRA:A new pre-fetching based dynamic data replication algorithm in data grids[J].Future Generation Computer Systems,2012,28(4):666-681. [6]YANG X L,QIAN C,ZHU F X.Evaluation method of big data service based on cloud computing[J].Computer Science,2018,45(5):295-299.(in Chinese) 阳小兰,钱程,朱福喜.基于云计算的大数据服务资源评价方法[J].计算机科学,2018,45(5):295-299. [7]LIU W,PENG S,DU W,et al.Security-aware intermediate data placement strategy in scientific cloud workflows[J].Knowledge &Information Systems,2014,41(2):423-447. [8]LI W,YANG Y,YUAN D.Ensuring Cloud data reliability with minimum replication by proactive replica checking[J].IEEE Transactions on Computers,2016,65(5):1494-1506. [9]GRACE R K,MANIMEGALAI R.Dynamic replica placement and selection strategies in data grids-A comprehensive survey[J].Journal of Parallel & Distributed Computing,2014,74(2):2099-2108. [10]WU X G.Minimum cost data replicas distribution based on dynamic programming in the cloud storage system[J].Computer Engineering,2017,43(7):29-37.(in Chinese) 吴修国.云存储系统中基于动态规划的最小开销数据副本布局研究[J].计算机工程,2017,43(7):29-37. [11]LIU G,LI J,XU J.An Improved Min-Min Algorithm in Cloud Computing[C]//Proceedings of the 2012 International Conference of Modern Computer Science and Applications.Springer Berlin Heidelberg,2013:47-52. [12]MA J,LI W,FU T,et al.A Novel Dynamic Task Scheduling Algorithm Based on Improved Genetic Algorithm in Cloud Computing[M]//Wireless Communications,Networking and Applications.Springer India,2016:184-186. [13]TANG Y T,HUANG J,XIAO Q.Task scheduling algorithm for MapReduce based on DAG[J].Computer Science,2014,41(S1):42-46.(in Chinese) 唐一韬,黄晶,肖球.一种基于DAG的MapReduce任务调度算法[J].计算机科学,2014,41(S1):42-46. [14]KHULLER S,KIM Y A,MALEKIAN A.Improved Approximation Algorithms for Data Migration[J].Algorithmica,2012,63(1/2):347-362. [15]LU P,ZHANG L,LIU X,et al.Highly efficient data migration and backup for big data applications in elastic optical inter-data-center networks[J].Network IEEE,2015,29(5):36-42. [16]ANDRONIKOU V,MAMOURAS K,TSERPES K,et al.Dy-namic Qos-aware data replication in grid environments based on data “importance”[J].Future Generation Computer Systems,2012,28(3):544-553. [17]ATENIESE G,BURNS R,CURTMOLA R,et al.Provable data possession at untrusted stores[C]//Proceedings of ACM Conference on Computer and Communications Security.New York:ACM Press,2007:598-609. [18]LOUKOPOULOS T,TZIRITAS N,LAMPSAS P,et al.Implementing replica placements:feasibility and cost minimization[C]//Proceedings of Parallel and Distributed Processing Symposium.New York:IEEE Press,2007:1-10. [19]CALHEIROS R N,RANJAN R,BELOGLAZOV A,et al. Cloudsim:a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms[J].Software Practice & Experience,2011,41(1):23-50. |
[1] | 邵子灏, 杨世宇, 马国杰. 室内信息服务的基础——低成本定位技术研究综述 Foundation of Indoor Information Services:A Survey of Low-cost Localization Techniques 计算机科学, 2022, 49(9): 228-235. https://doi.org/10.11896/jsjkx.210900260 |
[2] | 严磊, 张功萱, 王添, 寇小勇, 王国洪. 混合云下具有交付期约束的众包任务调度算法 Scheduling Algorithm for Bag-of-Tasks with Due Date Constraints on Hybrid Clouds 计算机科学, 2022, 49(5): 244-249. https://doi.org/10.11896/jsjkx.210300120 |
[3] | 左园林, 龚月姣, 陈伟能. 成本受限条件下的社交网络影响最大化方法 Budget-aware Influence Maximization in Social Networks 计算机科学, 2022, 49(4): 100-109. https://doi.org/10.11896/jsjkx.210300228 |
[4] | 田冰川, 田臣, 周宇航, 陈贵海, 窦万春. 减少Hadoop集群中网络队头阻塞的调度算法 Reducing Head-of-Line Blocking on Network in Hadoop Clusters 计算机科学, 2022, 49(3): 11-22. https://doi.org/10.11896/jsjkx.210900117 |
[5] | 谭双杰, 林宝军, 刘迎春, 赵帅. 基于机器学习的分布式星载RTs系统负载调度算法 Load Scheduling Algorithm for Distributed On-board RTs System Based on Machine Learning 计算机科学, 2022, 49(2): 336-341. https://doi.org/10.11896/jsjkx.201200126 |
[6] | 沈彪, 沈立炜, 李弋. 空间众包任务的路径动态调度方法 Dynamic Task Scheduling Method for Space Crowdsourcing 计算机科学, 2022, 49(2): 231-240. https://doi.org/10.11896/jsjkx.210400249 |
[7] | 薛艳芬, 高继梅, 范贵生, 虞慧群, 许亚杰. 边缘计算中基于能耗感知的容错协同任务执行算法 Energy-aware Fault-tolerant Collaborative Task Execution Algorithm in Edge Computing 计算机科学, 2021, 48(6A): 374-382. https://doi.org/10.11896/jsjkx.200900027 |
[8] | 王政, 姜春茂. 一种基于三支决策的云任务调度优化算法 Cloud Task Scheduling Algorithm Based on Three-way Decisions 计算机科学, 2021, 48(6A): 420-426. https://doi.org/10.11896/jsjkx.201000023 |
[9] | 蔡凌峰, 魏祥麟, 邢长友, 邹霞, 张国敏. 故障场景下的边缘计算DAG任务重调度方法 Failure-resilient DAG Task Rescheduling in Edge Computing 计算机科学, 2021, 48(10): 334-342. https://doi.org/10.11896/jsjkx.210300304 |
[10] | 张龙信, 周立前, 文鸿, 肖满生, 邓晓军. 基于异构云计算的成本约束下的工作流能量高效调度算法 Energy Efficient Scheduling Algorithm of Workflows with Cost Constraint in Heterogeneous Cloud Computing Systems 计算机科学, 2020, 47(8): 112-118. https://doi.org/10.11896/jsjkx.200300038 |
[11] | 刘静, 方贤文. 基于成本对齐的业务流程变化挖掘方法 Mining Method of Business Process Change Based on Cost Alignment 计算机科学, 2020, 47(7): 78-83. https://doi.org/10.11896/jsjkx.190600140 |
[12] | 高子妍, 王勇. 面向云服务的分布式消息系统负载均衡策略 Load Balancing Strategy of Distributed Messaging System for Cloud Services 计算机科学, 2020, 47(6A): 318-324. https://doi.org/10.11896/JsJkx.191100012 |
[13] | 孙敏, 陈中雄, 叶侨楠. 云环境下基于HEDSM的工作流调度策略 Workflow Scheduling Strategy Based on HEDSM Under Cloud Environment 计算机科学, 2020, 47(6): 252-259. https://doi.org/10.11896/jsjkx.190400047 |
[14] | 洪海诚,陈丹伟. 基于RBEC的副本动态存储方法 Replica Dynamic Storage Based on RBEC 计算机科学, 2020, 47(2): 313-319. https://doi.org/10.11896/jsjkx.181102161 |
[15] | 胡俊钦, 张佳俊, 黄引豪, 陈星, 林兵. 边缘环境下DNN应用的计算迁移调度技术 Computation Offloading Scheduling Technology for DNN Applications in Edge Environment 计算机科学, 2020, 47(10): 247-255. https://doi.org/10.11896/jsjkx.190900106 |
|