Computer Science ›› 2023, Vol. 50 ›› Issue (6): 36-44. doi: 10.11896/jsjkx.220300192

• High Performance Computing •

Virtual Machine Consolidation Algorithm Based on Decision Tree and Improved Q-learning by Uniform Distribution

SHI Liang1,2, WEN Liangming1,2, LEI Sheng1,2, LI Jianhui1

  1. 1 Computer Network Information Center, Chinese Academy of Sciences, Beijing 100090, China
    2 University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-03-21 Revised: 2022-09-22 Online: 2023-06-15 Published: 2023-06-06
  • Corresponding author: LI Jianhui (lijh@cnic.cn)
  • Author contact: shiliang6402@foxmail.com
  • About author: SHI Liang, born in 1996, postgraduate. His main research interests include cloud resource scheduling and reinforcement learning. LI Jianhui, born in 1973, Ph.D., professor, Ph.D. supervisor. His main research interests include cloud computing, distributed systems, and artificial intelligence for IT operations.
  • Supported by:
    National Key R&D Program of China (2021YFE0111500) and International Mega-science Programs of the Chinese Academy of Sciences (241711KYSB20200023).

Abstract: As cloud data centers grow in scale, the problems caused by sub-optimal virtual machine consolidation algorithms, such as high energy consumption, low resource utilization, and degraded quality of service, are becoming increasingly prominent. To address them, this paper proposes DTQL-UD, a virtual machine consolidation algorithm based on a decision tree and Q-learning improved by a uniform distribution. DTQL-UD uses a decision tree for state representation and, when evaluating the next state-action value, selects the next action from a uniform distribution. It maps the state of the cloud data center directly to virtual machine migration decisions and continually optimizes those decisions through real-time feedback. In addition, to narrow the gap between the simulator and real-world scenarios in reinforcement learning, the simulator is trained with a supervised learning model on a large amount of real cloud data center load trace data, which improves its fidelity. Simulation results show that, compared with existing heuristic methods, DTQL-UD improves energy consumption, resource utilization, quality of service, number of virtual machine migrations, and number of remaining active hosts by 14%, 12%, 21%, 40%, and 10%, respectively. Moreover, owing to the stronger feature extraction capability of decision trees on tabular data, DTQL-UD learns a better consolidation strategy than other existing deep reinforcement learning (DRL) methods. In the experiments, as the size of the cloud data center increases, DTQL-UD progressively reduces the training time of traditional reinforcement learning models by 60% to 92%.

Key words: Cloud resource scheduling, Virtual machine consolidation algorithm, Reinforcement learning, Decision tree
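
The abstract states that DTQL-UD selects the next action from a uniform distribution when evaluating the next state-action value. The following is a minimal tabular sketch of that single idea, not the authors' implementation. The function name `q_update_uniform`, the state and action labels, and the values of `alpha` and `gamma` are hypothetical, and the decision-tree state representation described in the abstract is omitted entirely.

```python
import random
from collections import defaultdict


def q_update_uniform(q, state, action, reward, next_state, actions,
                     alpha=0.1, gamma=0.9, rng=random):
    """One tabular Q-learning step with a uniformly sampled next action.

    Standard Q-learning bootstraps from max over a' of Q(s', a').
    This sketch instead draws a' uniformly from the action set and
    bootstraps from Q(s', a'). All names here are illustrative.
    """
    next_action = rng.choice(actions)  # uniform draw, not argmax
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem: keep a VM in place or migrate it.
q = defaultdict(float)          # unseen (state, action) pairs default to 0.0
actions = ["keep", "migrate"]
new_value = q_update_uniform(
    q, "s0", "migrate", reward=1.0, next_state="s1",
    actions=actions, rng=random.Random(0),
)
```

Because every entry of the table starts at zero, this first update does not depend on which next action is drawn. The draw only starts to matter once the table holds unequal values for the next state.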

CLC Number: TP393