Computer Science ›› 2025, Vol. 52 ›› Issue (2): 310-322. doi: 10.11896/jsjkx.240500111

• Computer Network •

  • Corresponding author: LI Bin(ts22170014a31@cumt.edu.cn)
  • First author's email: xudh123@cumt.edu.cn

Task Scheduling Strategy Based on Improved A2C Algorithm for Cloud Data Center

XU Donghong1, LI Bin1, QI Yong2   

  1 College of Computer Science and Technology,China University of Mining and Technology,Xuzhou,Jiangsu 221116,China
    2 College of Computer Science and Technology,Xi'an Jiaotong University,Xi'an 710049,China
  • Received:2024-05-26 Revised:2024-08-06 Online:2025-02-15 Published:2025-02-17
  • About author:XU Donghong,born in 1978,Ph.D,associate professor.His main research interests include cloud computing,deep reinforcement learning and large models.
    LI Bin,born in 2000,postgraduate.His main research interests include cloud computing,deep reinforcement learning and task scheduling.
  • Supported by:
    National Natural Science Foundation of China(52374242) and OPPO Research Fund(H7D220060).


Abstract: Existing task scheduling algorithms based on deep reinforcement learning(DRL) in cloud data centers suffer from high training cost caused by low utilization of effective experience, learning oscillation caused by a variable and high-dimensional state space, and slow convergence caused by a fixed policy-update step size. To solve these problems, this paper constructs a parallel task scheduling framework for the cloud data center scenario and studies the cloud task scheduling problem with delay, energy consumption and load balancing as objectives. Based on the DRL algorithm A2C(Advantage Actor Critic), a task scheduling algorithm for cloud data centers based on adaptive state optimization and dynamic step size(AODS-A2C) is proposed. Firstly, admission control and a priority strategy are used to filter and sort queued tasks, improving the utilization of effective experience. Secondly, the dynamic high-dimensional state is quickly optimized in an adaptive way to maintain a relatively stable state space and avoid oscillation during training. Finally, JS(Jensen-Shannon) divergence is used to measure the difference between the probability distributions of the old and new policies, and the learning step sizes of the Actor and Critic networks are dynamically adjusted according to this difference, so that the current learning state is quickly brought to its best value and the convergence speed of the algorithm is improved. Simulation results show that the proposed AODS-A2C algorithm converges quickly and is highly robust; compared with other algorithms, it reduces delay by 1.2% to 34.4% and energy consumption by 1.6% to 57.2%, while achieving good load balancing.

Key words: Cloud computing, Task scheduling, Deep reinforcement learning, State optimization, JS divergence

CLC Number: TP393