Computer Science ›› 2023, Vol. 50 ›› Issue (4): 233-240. doi: 10.11896/jsjkx.220300215

• Computer Network •

Self-balanced Scheduling Strategy for Container Cluster Based on Improved DQN Algorithm

XIE Yongsheng1, HUANG Xiangheng1, CHEN Ningjiang1,2,3

  1 School of Computer and Electronic Information, Guangxi University, Nanning 530004, China
  2 Guangxi Intelligent Digital Services Research Center of Engineering Technology, Nanning 530004, China
  3 Key Laboratory of Parallel, Distributed and Intelligent Computing (Guangxi University), Education Department of Guangxi Zhuang Autonomous Region, Nanning 530004, China
  • Received: 2022-03-22 Revised: 2022-07-19 Online: 2023-04-15 Published: 2023-04-06
  • Corresponding author: CHEN Ningjiang (chnj@gxu.edu.cn)
  • Author e-mail: xys_gxu@163.com
  • About author: XIE Yongsheng, born in 1995, postgraduate. His main research interests include cloud computing and intelligent software engineering.
    CHEN Ningjiang, born in 1975, Ph.D., professor, is a distinguished member of China Computer Federation. His main research interests include intelligent software engineering, big data, and cloud computing.
  • Supported by:
    National Key R&D Program of China (2018YFB1404404) and National Natural Science Foundation of China (62162003, 61762008).

Abstract: The resource scheduling strategy of a container cloud system plays an important role in resource utilization and cluster performance. Existing container cluster scheduling does not fully account for resource occupancy within and between nodes, so it is prone to container resource bottlenecks, which lead to low resource utilization and poor service reliability. To balance the workload of the container cluster and reduce container resource bottlenecks, this paper proposes CS-DQN (Container Scheduling Optimization Strategy Based on DQN), a container cluster scheduling optimization algorithm based on the Deep Q-learning Network (DQN). First, a load-balancing-oriented optimization model of container cluster resource utilization is proposed. Then, using deep reinforcement learning, a DQN-based container cluster scheduling algorithm is designed, and the corresponding state space, action space and reward function are defined. By introducing the improved DQN algorithm, a dynamic container scheduling strategy that meets the optimization goal is generated through self-learning. Prototype experimental results show that the scheduling strategy increases the number of containers that can be deployed during scheduling, achieves better load balancing under different workloads, improves resource utilization, and better guarantees service reliability.

Key words: Container cloud, Deep Q-learning Network, Cluster, Scheduling strategy
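
The abstract names a state space, an action space and a reward function but does not give their definitions. As a rough illustration only, the sketch below assumes the state is a list of per-node utilization values, an action is the index of the node that receives the next container, and the reward penalizes imbalance across nodes. The names `place`, `balance_reward` and `select_node`, the capacity of 1.0 and the use of standard deviation as the imbalance measure are all hypothetical and are not taken from the paper.

```python
import random
from statistics import pstdev


def place(utilization, node, demand):
    """Return a new utilization list after adding `demand` to `node`."""
    updated = list(utilization)
    updated[node] += demand
    return updated


def balance_reward(utilization, node, demand, capacity=1.0):
    """Hypothetical reward: penalize imbalance and overloaded placements."""
    updated = place(utilization, node, demand)
    if updated[node] > capacity:
        return -1.0  # assumed fixed penalty for exceeding node capacity
    return -pstdev(updated)  # smaller spread across nodes gives a higher reward


def select_node(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over per-node Q-values (standard DQN exploration)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: pick a random node
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit


# Example: three nodes, one container that needs 0.3 of a node's capacity.
utilization = [0.2, 0.8, 0.5]
rewards = [balance_reward(utilization, n, 0.3) for n in range(len(utilization))]
best_node = max(range(len(rewards)), key=lambda i: rewards[i])
```

In this toy setting the least-loaded node receives the highest reward, while the node that would exceed its capacity is penalized. In a DQN, a neural network would estimate the Q-values passed to `select_node`; that part is omitted here.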

CLC Number:

  • TP391