Survey on Bayesian Optimization Methods for Hyper-parameter Tuning

doi:10.11896/jsjkx.210300208

Abstract

For most machine learning models, hyper-parameter selection plays an important role in obtaining a high-quality model. In current practice, most hyper-parameters are set manually, so the selection or estimation of hyper-parameters is a key issue in machine learning. The mapping from a hyper-parameter set to the model's generalization performance can be regarded as a complex black-box function, to which general-purpose optimization methods are difficult to apply. Bayesian optimization is an effective global optimization algorithm suited to problems whose objective function has no closed-form expression, is non-convex, or is computationally expensive to evaluate; it can reach a satisfactory solution with only a few function evaluations. This paper summarizes the basics of Bayesian-optimization-based hyper-parameter estimation methods and reviews the research hot spots and latest developments of recent years, including work on surrogate models, acquisition functions, and algorithm implementation. It also summarizes the open problems in existing research. The survey is intended to help beginners quickly understand Bayesian optimization algorithms and typical algorithmic ideas, and to provide guidance for future research.
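To make the loop described in the abstract concrete, the following is a minimal sketch of Bayesian optimization over a single hyper-parameter, not the method of any particular paper surveyed here. It assumes a zero-mean Gaussian-process surrogate with an RBF kernel of fixed length scale, the expected-improvement acquisition function, and a simple grid search to maximize the acquisition. The function `objective` is a hypothetical stand-in for an expensive validation loss, and every name in the block is illustrative.

```python
import math

def rbf(a, b, length=0.2):
    """RBF kernel with a fixed, assumed length scale."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gp_posterior(xs, ys, x, jitter=1e-6):
    """Posterior mean and standard deviation of the surrogate at x."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    alpha = solve(K, ys)   # K^{-1} y
    v = solve(K, k)        # K^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (minimization)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def objective(x):
    """Hypothetical validation loss for a hyper-parameter x in [0, 1]."""
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)

def bayes_opt(f, n_init=3, n_iter=12):
    """Run the loop: fit surrogate, maximize acquisition, evaluate f."""
    xs = [i / (n_init - 1) for i in range(n_init)]  # evenly spaced start
    ys = [f(x) for x in xs]
    grid = [i / 200 for i in range(201)]            # candidate settings
    for _ in range(n_iter):
        best = min(ys)
        candidates = [g for g in grid if g not in xs]
        x_next = max(
            candidates,
            key=lambda g: expected_improvement(*gp_posterior(xs, ys, g), best),
        )
        xs.append(x_next)
        ys.append(f(x_next))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i], len(ys)

if __name__ == "__main__":
    x_best, y_best, n_evals = bayes_opt(objective)
    print(f"best x = {x_best:.3f}, loss = {y_best:.4f}, evaluations = {n_evals}")
```

Each iteration spends one evaluation of the expensive function where the surrogate predicts either a low loss or high uncertainty, which is why the total budget here is only 15 evaluations. In practice the kernel hyper-parameters would be fitted to the data and the acquisition maximized with a proper optimizer rather than a fixed grid.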

Key words: Bayesian optimization, Black box optimization, Hyper-parameters, Machine learning, Probabilistic surrogate model

CLC Number: TP181

LI Ya-ru, ZHANG Yu-lai, WANG Jia-chen. Survey on Bayesian Optimization Methods for Hyper-parameter Tuning[J]. Computer Science, 2022, 49(6A): 86-92.

