Computer Science ›› 2025, Vol. 52 ›› Issue (2): 310-322. doi: 10.11896/jsjkx.240500111

• Computer Network •

Task Scheduling Strategy Based on Improved A2C Algorithm for Cloud Data Center

XU Donghong1, LI Bin1, QI Yong2   

  1 College of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
    2 College of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China
  • Received:2024-05-26 Revised:2024-08-06 Online:2025-02-15 Published:2025-02-17
  • About author:XU Donghong, born in 1978, Ph.D, associate professor. His main research interests include cloud computing, deep reinforcement learning and big models.
    LI Bin,born in 2000,postgraduate.His main research interests include cloud computing,deep reinforcement learning and task scheduling.
  • Supported by:
    National Natural Science Foundation of China(52374242) and OPPO Research Fund(H7D220060).

Abstract: Existing task scheduling algorithms based on deep reinforcement learning (DRL) in cloud data centers suffer from three problems: high training cost caused by low utilization of effective experience, learning oscillation caused by a variable, high-dimensional state space, and slow convergence caused by a fixed policy-update step size. To address these problems, this paper constructs a parallel task scheduling framework for the cloud data center scenario and studies the cloud task scheduling problem with delay, energy consumption and load balancing as its objectives. Building on the DRL algorithm A2C, it proposes a task scheduling algorithm for cloud data centers based on adaptive state optimization and dynamic step size, AODS-A2C. First, admission control and a priority strategy are used to filter and sort the queued tasks, improving the utilization of effective experience. Second, the dynamic high-dimensional state is quickly optimized in an adaptive way so as to maintain a relatively stable state space and avoid oscillation during training. Finally, JS (Jensen-Shannon) divergence is used to measure the difference between the probability distributions of the old and new policies, and the learning step sizes of the Actor and Critic networks are dynamically adjusted according to this difference, so that they quickly adapt to the current learning state and the convergence speed of the algorithm improves. Simulation results show that the proposed AODS-A2C algorithm converges quickly and is highly robust. Compared with the baseline algorithms, it reduces delay by 1.2% to 34.4% and energy consumption by 1.6% to 57.2%, while achieving good load balancing.
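The abstract does not give the exact update rule, so the following is only a minimal Python sketch of the JS-divergence-based dynamic step-size idea, assuming discrete action distributions over candidate hosts; the names js_divergence and adaptive_step and the linear mapping from divergence to learning rate are illustrative assumptions, not the precise mechanism of AODS-A2C.

    import numpy as np

    def js_divergence(p, q, eps=1e-12):
        # Jensen-Shannon divergence between two discrete distributions p and q.
        p = np.asarray(p, dtype=np.float64) + eps
        q = np.asarray(q, dtype=np.float64) + eps
        p /= p.sum()
        q /= q.sum()
        m = 0.5 * (p + q)
        kl = lambda a, b: np.sum(a * np.log(a / b))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def adaptive_step(base_lr, pi_old, pi_new, lr_min=1e-5, lr_max=1e-2):
        # Shrink the step when the new policy drifts far from the old one,
        # and allow a larger step when the two distributions stay close.
        d = js_divergence(pi_old, pi_new)        # bounded in [0, ln 2]
        scale = 1.0 - d / np.log(2.0)            # 1 when identical, 0 at maximum divergence
        return float(np.clip(base_lr * scale, lr_min, lr_max))

    # Hypothetical use inside an A2C update loop:
    # actor_lr  = adaptive_step(actor_lr,  pi_old, pi_new)
    # critic_lr = adaptive_step(critic_lr, pi_old, pi_new)

In such a loop, a large divergence between the old and new policy distributions shrinks the Actor and Critic learning rates, while near-identical distributions permit larger steps, matching the stated intent of adjusting the step size to the current learning state.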

Key words: Cloud computing, Task scheduling, Deep reinforcement learning, State optimization, JS divergence

CLC Number: TP393