Computer Science (计算机科学), 2019, Vol. 46, Issue (9): 85-92. doi: 10.11896/j.issn.1002-137X.2019.09.011

• 35th National Database Conference of China (第35届中国数据库学术会议) •

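Performance Prediction and Configuration Optimization of Virtual Machines Based on Random Forest

ZHANG Bin-bin, WANG Juan, YUE Kun, WU Hao, HAO Jia

  1. (School of Information Science and Engineering, Yunnan University, Kunming 650500, China)
  • Received: 2018-07-21 Publication date: 2019-09-15 Release date: 2019-09-02
  • Corresponding author: YUE Kun (born 1979), male, Ph.D., professor, Ph.D. supervisor, CCF senior member. His main research interests are massive data analysis and services, and big data knowledge engineering. E-mail: kyue@ynu.edu.cn
  • About the authors: ZHANG Bin-bin (born 1982), female, Ph.D., lecturer, CCF member; her main research interests are virtualization and cloud computing. WANG Juan (born 1992), female, master's student; her main research interests are data analysis and virtualization. WU Hao (born 1979), male, Ph.D., associate professor; his main research interests are Web information processing and service computing. HAO Jia (born 1993), female, Ph.D. student; her main research interests are data analysis and virtualization.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61402398, U1802271, 61562090) and the Yunnan University Young Talent Cultivation Program (WX173602).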

Abstract: In current IaaS cloud computing services, users can rent one or more virtual machines with different resource configurations. However, it is difficult for users to accurately estimate the performance of a virtual machine from its resource configuration, so it is hard for them to select an appropriately configured virtual machine for the performance requirements of the application to be deployed. As a result, the resources of cloud hosts are not fully utilized. This paper therefore proposes predicting the performance of a virtual machine with a given configuration using a random forest regression model. On this basis, a genetic algorithm is used to search for a near-optimal virtual machine configuration that meets a given performance requirement: the random forest performance model provides a predicted performance value for each individual in the population, the difference between the predicted value and the target performance serves as the fitness function, and the individuals closest to the performance requirement are selected for crossover. The experimental results show that the random forest regression model accurately predicts the performance of a virtual machine with a given configuration, that the measured performance of a virtual machine configured according to the genetic algorithm's result is very close to the performance requirement, and that the algorithm converges in a short time.

Key words: Cloud computing, Configuration optimization, Genetic algorithm, Performance prediction, Random forest, Virtual machine

CLC Number: 

  • TP302
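
The search loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-parameter configuration encoding (vCPUs, memory), the value ranges, the mutation rate, the population size, and the function names are all assumptions made for illustration. In particular, `stand_in_predictor` is a made-up formula standing in for a trained random forest regression model; any callable that maps a configuration to a predicted performance value could be passed in its place.

```python
import random
from typing import Callable, Tuple

# Hypothetical encoding: (number of vCPUs, memory in GB). The abstract does
# not list the configuration parameters the paper actually uses.
Config = Tuple[int, int]
VCPU_CHOICES = list(range(1, 17))
MEM_CHOICES = list(range(1, 65))


def stand_in_predictor(cfg: Config) -> float:
    """Placeholder for a trained random forest regressor (made-up formula)."""
    vcpus, mem_gb = cfg
    return 100.0 * vcpus + 10.0 * mem_gb


def fitness(cfg: Config, target: float,
            predict: Callable[[Config], float]) -> float:
    """Gap between predicted and required performance; smaller is better."""
    return abs(predict(cfg) - target)


def crossover(a: Config, b: Config, rng: random.Random) -> Config:
    """Build a child by taking each parameter from one of the two parents."""
    return (rng.choice((a[0], b[0])), rng.choice((a[1], b[1])))


def mutate(cfg: Config, rng: random.Random, rate: float = 0.2) -> Config:
    """Resample each parameter with probability `rate` (assumed value)."""
    vcpus, mem_gb = cfg
    if rng.random() < rate:
        vcpus = rng.choice(VCPU_CHOICES)
    if rng.random() < rate:
        mem_gb = rng.choice(MEM_CHOICES)
    return (vcpus, mem_gb)


def search_config(target: float,
                  predict: Callable[[Config], float] = stand_in_predictor,
                  pop_size: int = 30, generations: int = 40,
                  seed: int = 0) -> Config:
    """Return the configuration whose predicted performance is closest to target."""
    rng = random.Random(seed)
    population = [(rng.choice(VCPU_CHOICES), rng.choice(MEM_CHOICES))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank individuals by how close their prediction is to the target.
        population.sort(key=lambda c: fitness(c, target, predict))
        # The closest half survives unchanged and supplies the parents.
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=lambda c: fitness(c, target, predict))
```

Calling `search_config(1000.0)` returns a `(vcpus, mem_gb)` pair whose predicted value under the stand-in formula is close to 1000. Because the better half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next; whether the paper uses this form of selection is not stated in the abstract.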