Computer Science (计算机科学), 2025, Vol. 52, Issue (2): 310-322. doi: 10.11896/jsjkx.240500111
XU Donghong1, LI Bin1, QI Yong2 (徐东红, 李彬, 齐勇)
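Abstract: Existing task scheduling algorithms for cloud data centers based on deep reinforcement learning (DRL) have three problems: low utilization of effective experience, which raises training cost; a state space whose dimensionality is both variable and high, which causes learning oscillation; and a fixed policy update step size, which slows convergence. To address these problems, a parallel task scheduling framework is built for the cloud data center scenario, and the cloud task scheduling problem is studied with delay, energy consumption, and load balancing as objectives. Building on the DRL algorithm A2C (Advantage Actor Critic), a cloud data center task scheduling algorithm based on adaptive state optimization and dynamic step size (Adaptive state Optimization and Dynamic Step size A2C, AODS-A2C) is proposed. First, admission control and a priority strategy are used to filter and sort the queued tasks, which improves the utilization of effective experience. Second, the dynamic high-dimensional state is quickly filtered in an adaptive manner, which keeps the state space relatively stable and avoids oscillation during training. Finally, the Jensen-Shannon (JS) divergence is used to measure the difference between the probability distributions of the new and old policies, and the learning step sizes of the Actor and Critic networks are dynamically adjusted to match this difference, so that the current learning state is quickly brought to its best value and convergence speeds up. Simulation results show that the proposed AODS-A2C algorithm converges quickly and is highly robust. Compared with the other algorithms in the comparison, it reduces delay by 1.2% to 34.4% and energy consumption by 1.6% to 57.2%, and it achieves good load balancing.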
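The third step of the abstract can be illustrated with a minimal sketch. The JS divergence below follows its standard definition for discrete distributions. The step-size rule `adapt_step`, its thresholds, and its bounds are hypothetical placeholders, since the abstract does not state the actual update rule used in AODS-A2C.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of the two policies
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def adapt_step(lr, jsd, target=0.01, factor=1.5, lr_min=1e-5, lr_max=1e-2):
    """Hypothetical rule (not from the paper): shrink the step when the policy
    moved too far from the old one, grow it when the policy barely moved."""
    if jsd > 2 * target:
        lr /= factor
    elif jsd < target / 2:
        lr *= factor
    return min(max(lr, lr_min), lr_max)  # keep the step inside assumed bounds


# Action probabilities of the old and new policy in one state (made-up numbers).
old_policy = [0.50, 0.30, 0.20]
new_policy = [0.20, 0.30, 0.50]
jsd = js_divergence(old_policy, new_policy)
actor_lr = adapt_step(1e-3, jsd)
```

Here the same divergence value could drive both the Actor and Critic step sizes, each with its own bounds. Using the JS divergence rather than the KL divergence keeps the measure symmetric and finite even when one policy assigns zero probability to an action.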