Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges
- URL: http://arxiv.org/abs/2107.05847v2
- Date: Wed, 14 Jul 2021 22:34:27 GMT
- Title: Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges
- Authors: Bernd Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob
Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, Marc Becker, Anne-Laure
Boulesteix, Difan Deng, Marius Lindauer
- Abstract summary: This paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing.
It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
- Score: 5.139260825952818
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most machine learning algorithms are configured by one or several
hyperparameters that must be carefully chosen and often considerably impact
performance. To avoid a time-consuming and unreproducible manual
trial-and-error process to find well-performing hyperparameter configurations,
various automatic hyperparameter optimization (HPO) methods, e.g., based on
resampling error estimation for supervised machine learning, can be employed.
After introducing HPO from a general perspective, this paper reviews important
HPO methods such as grid or random search, evolutionary algorithms, Bayesian
optimization, Hyperband and racing. It gives practical recommendations
regarding important choices to be made when conducting HPO, including the HPO
algorithms themselves, performance evaluation, how to combine HPO with ML
pipelines, runtime improvements, and parallelization.
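As a minimal illustration of the setting described above, the sketch below runs a random search with cross-validated (resampling-based) error estimation using scikit-learn. The estimator, search space, budget, and scoring metric are illustrative assumptions, not choices prescribed by the paper.

```python
# Minimal sketch: random search over SVM hyperparameters with 5-fold
# cross-validation as the resampling-based error estimate.
# (Assumptions: scikit-learn and scipy are available; the estimator,
# search space, and budget are purely illustrative.)
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Log-uniform priors are a common choice for scale-type hyperparameters
# such as the SVM's regularization strength C and kernel width gamma.
param_distributions = {
    "C": loguniform(1e-3, 1e3),
    "gamma": loguniform(1e-4, 1e1),
}

search = RandomizedSearchCV(
    SVC(),
    param_distributions,
    n_iter=25,          # evaluation budget: number of sampled configurations
    cv=5,               # 5-fold cross-validation per configuration
    scoring="accuracy",
    random_state=0,
    n_jobs=-1,          # configurations can be evaluated in parallel
)
search.fit(X, y)

print("best configuration:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
```

The same pattern (define a search space, pick a resampling scheme, set an evaluation budget) carries over to the other methods reviewed in the paper, such as Bayesian optimization or Hyperband, via dedicated HPO libraries.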
Related papers
- PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We demonstrate its robustness across a range of DL benchmarks, its gains under informative expert input, and its robustness against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z)
- Deep Ranking Ensembles for Hyperparameter Optimization [9.453554184019108]
We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling.
In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
arXiv Detail & Related papers (2023-03-27T13:52:40Z)
- Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset [0.15420205433587747]
We present a two-step HPO method as a strategic solution for curbing computational demands and wait times, and describe its recent application to the development of neural network emulators for aerosol activation (a minimal illustrative sketch of the two-step idea appears after this list).
arXiv Detail & Related papers (2023-02-08T02:38:26Z)
- Multi-objective hyperparameter optimization with performance uncertainty [62.997667081978825]
This paper presents results on multi-objective hyperparameter optimization under uncertainty in the evaluation of Machine Learning algorithms.
We combine the sampling strategy of Tree-structured Parzen Estimators (TPE) with the metamodel obtained after training a Gaussian Process Regression (GPR) with heterogeneous noise.
Experimental results on three analytical test functions and three ML problems show an improvement over multi-objective TPE and GPR.
arXiv Detail & Related papers (2022-09-09T14:58:43Z)
- Enhancing Explainability of Hyperparameter Optimization via Bayesian Algorithm Execution [13.037647287689438]
We study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots.
We propose a modified HPO method which efficiently searches for optimum global predictive performance.
Our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.
arXiv Detail & Related papers (2022-06-11T07:12:04Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, and that this imitation can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Automated Benchmark-Driven Design and Explanation of Hyperparameter Optimizers [3.729201909920989]
We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO).
First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common HPO algorithms, and then present a framework covering this space.
We challenge whether the found design choices are necessary or could be replaced by more naive and simpler ones by performing an ablation analysis.
arXiv Detail & Related papers (2021-11-29T18:02:56Z)
- A survey on multi-objective hyperparameter optimization algorithms for Machine Learning [62.997667081978825]
This article presents a systematic survey of the literature published between 2014 and 2020 on multi-objective HPO algorithms.
We distinguish between metaheuristic-based algorithms, metamodel-based algorithms, and approaches using a mixture of both.
We also discuss the quality metrics used to compare multi-objective HPO procedures and present future research directions.
arXiv Detail & Related papers (2021-11-23T10:22:30Z)
- Cost-Efficient Online Hyperparameter Optimization [94.60924644778558]
We propose an online HPO algorithm that reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
arXiv Detail & Related papers (2021-01-17T04:55:30Z)
- Practical and sample efficient zero-shot HPO [8.41866793161234]
We provide an overview of available approaches and introduce two novel techniques to handle the problem.
The first is based on a surrogate model and adaptively chooses which (dataset, configuration) pairs to query.
The second, for settings where finding, tuning and testing a surrogate model is problematic, is a multi-fidelity technique combining HyperBand with submodular optimization.
arXiv Detail & Related papers (2020-07-27T08:56:55Z)
- HyperSTAR: Task-Aware Hyperparameters for Deep Networks [52.50861379908611]
HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations than existing methods to achieve the best performance.
arXiv Detail & Related papers (2020-05-21T08:56:50Z)
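The two-step HPO method listed above (coarse search on a fraction of the training data, followed by refinement on the full data) can be sketched as follows. This is a hypothetical illustration only: the estimator, search spaces, data fraction, and budgets are assumptions and do not reproduce the cited paper's setup.

```python
# Hypothetical sketch of a two-step HPO strategy:
#   step 1: broad, cheap random search on ~10% of the training data;
#   step 2: narrow search on the full data around the best coarse configuration.
# All names, spaces, and budgets are illustrative assumptions.
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=5000, n_features=30, random_state=0)

# Step 1: coarse search on a small, stratified subsample.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=0.1, stratify=y, random_state=0
)
coarse = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"learning_rate": loguniform(1e-3, 1.0), "max_depth": [2, 3, 4, 5, 6]},
    n_iter=30, cv=3, random_state=0, n_jobs=-1,
)
coarse.fit(X_small, y_small)
best_lr = coarse.best_params_["learning_rate"]
best_depth = coarse.best_params_["max_depth"]

# Step 2: fine search on the full data, restricted to a neighbourhood
# of the best configuration found in step 1.
fine = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"learning_rate": loguniform(best_lr / 3, best_lr * 3),
     "max_depth": [best_depth]},
    n_iter=10, cv=3, random_state=0, n_jobs=-1,
)
fine.fit(X, y)

print("refined configuration:", fine.best_params_)
print("cross-validated accuracy:", fine.best_score_)
```

The design intent is that most candidate configurations are eliminated cheaply in step 1, so the expensive full-data evaluations in step 2 are spent only on a small, promising region of the search space.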