Computer Science, 2021, Vol. 48, Issue (11): 199-207. doi: 10.11896/jsjkx.200900009

• Database & Big Data & Data Science •

Data Placement Strategy of Scientific Workflow Based on Fuzzy Theory in Hybrid Cloud

LIU Zhang-hui1,2, ZHAO Xu1,2, LIN Bing2,3, CHEN Xing1,2   

  1. College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
  2. Fujian Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou 350116, China
  3. College of Physics and Energy, Fujian Normal University, Fuzhou 350117, China
  • Received: 2020-09-01  Revised: 2020-12-06  Online: 2021-11-15  Published: 2021-11-10
  • About author: LIU Zhang-hui, born in 1971, master, associate professor, postgraduate supervisor, is a member of China Computer Federation. His main research interests include big data technology and intelligent computing.
    LIN Bing, born in 1986, Ph.D., lecturer, postgraduate supervisor, is a member of China Computer Federation. His main research interests include cloud computing, and intelligent computing and its applications.
  • Supported by:
    National Key R&D Program of China (2018YFB1004800) and Guiding Project of Fujian Province (2018H0017).

Abstract: A sound data placement strategy is essential to the efficient execution of scientific workflows in a hybrid cloud environment. Traditional data placement strategies mainly assume a deterministic environment, but in a real network the data transmission time is uncertain because of differing loads, bandwidth fluctuation and network congestion between data centers, as well as the characteristics of the computers involved. To address this problem, a fuzzy adaptive discrete particle swarm optimization algorithm based on fuzzy theory and genetic algorithm operators (FGA-DPSO) is proposed. It minimizes the fuzzy transmission time of data and places scientific workflow data reasonably while meeting the privacy requirements of the data sets and the capacity limits of the data centers. Experimental results show that the algorithm can effectively reduce the fuzzy data transmission time of scientific workflows in a hybrid cloud environment.
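
The objective and constraints named in the abstract can be illustrated with a minimal sketch. It is not the FGA-DPSO algorithm itself, only an evaluation of one candidate placement. Everything concrete here is an assumption made for illustration and does not come from the paper: the triangular fuzzy number representation (`TFN`), the `spread` factors, the `score` ranking rule, and all data set names, sizes and bandwidths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (low, mid, high); an assumed representation."""
    low: float
    mid: float
    high: float

    def __add__(self, other: "TFN") -> "TFN":
        return TFN(self.low + other.low, self.mid + other.mid, self.high + other.high)

    def score(self) -> float:
        # Assumed ranking rule: weighted mean that favors the most likely value.
        return (self.low + 2 * self.mid + self.high) / 4


def fuzzy_transfer_time(size_mb, bandwidth_mb_per_s, spread=(0.5, 1.0, 2.0)):
    """Fuzzify size/bandwidth with hypothetical optimistic/pessimistic factors."""
    crisp = size_mb / bandwidth_mb_per_s
    return TFN(crisp * spread[0], crisp * spread[1], crisp * spread[2])


def is_feasible(placement, sizes, capacity, pinned):
    """Check privacy (pinned data sets stay put) and data center capacity."""
    if any(placement[d] != dc for d, dc in pinned.items()):
        return False
    used = {}
    for d, dc in placement.items():
        used[dc] = used.get(dc, 0) + sizes[d]
    return all(used[dc] <= capacity[dc] for dc in used)


def total_fuzzy_time(placement, sizes, bandwidth, tasks):
    """Sum fuzzy transfer times for every data set a task must fetch remotely."""
    total = TFN(0.0, 0.0, 0.0)
    for task_dc, needed in tasks:
        for d in needed:
            src = placement[d]
            if src != task_dc:
                total = total + fuzzy_transfer_time(sizes[d], bandwidth[(src, task_dc)])
    return total


# Hypothetical instance: one private and one public data center.
sizes = {"d1": 100, "d2": 40}                      # MB
capacity = {"private": 150, "public": 500}         # MB
bandwidth = {("private", "public"): 10, ("public", "private"): 10}  # MB/s
pinned = {"d1": "private"}                         # privacy: d1 must stay private
tasks = [("public", ["d1", "d2"]), ("private", ["d1"])]
placement = {"d1": "private", "d2": "public"}

feasible = is_feasible(placement, sizes, capacity, pinned)
total = total_fuzzy_time(placement, sizes, bandwidth, tasks)
print(feasible, total, total.score())
```

A search procedure such as a discrete particle swarm would call `is_feasible` and `total_fuzzy_time` on each candidate placement and keep the feasible one with the lowest score.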

Key words: Data placement, Fuzzy theory, Hybrid cloud, Scientific workflow, Time optimization

CLC Number: TP338