Supervising the Multi-Fidelity Race of Hyperparameter Configurations
- URL: http://arxiv.org/abs/2202.09774v2
- Date: Thu, 1 Jun 2023 08:55:35 GMT
- Title: Supervising the Multi-Fidelity Race of Hyperparameter Configurations
- Authors: Martin Wistuba, Arlind Kadra, Josif Grabocka
- Abstract summary: We introduce DyHPO, a Bayesian Optimization method that learns to decide which hyperparameter configuration to train further in a race among all feasible configurations.
We demonstrate the significant superiority of DyHPO against state-of-the-art hyperparameter optimization methods through large-scale experiments.
- Score: 22.408069485293666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-fidelity (gray-box) hyperparameter optimization techniques (HPO) have
recently emerged as a promising direction for tuning Deep Learning methods.
However, existing methods suffer from a sub-optimal allocation of the HPO
budget to the hyperparameter configurations. In this work, we introduce DyHPO,
a Bayesian Optimization method that learns to decide which hyperparameter
configuration to train further in a dynamic race among all feasible
configurations. We propose a new deep kernel for Gaussian Processes that embeds
the learning curve dynamics, and an acquisition function that incorporates
multi-budget information. We demonstrate the significant superiority of DyHPO
against state-of-the-art hyperparameter optimization methods through
large-scale experiments comprising 50 datasets (Tabular, Image, NLP) and
diverse architectures (MLP, CNN/NAS, RNN).
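The "dynamic race" idea in the abstract can be illustrated with a toy sketch: at every step, one more epoch of budget goes to whichever configuration currently looks most promising. This is not DyHPO itself. The deep-kernel Gaussian Process and the multi-budget acquisition function are replaced by a simple stand-in score (last observed value plus a bonus that shrinks with the epochs already spent), and the names `toy_learning_curve`, `race`, and `bonus` are hypothetical.

```python
import math

def toy_learning_curve(config, epoch):
    """Hypothetical validation score after `epoch` epochs for learning rate `config`."""
    # Assumption for illustration: the achievable score peaks near lr = 1e-2.
    ceiling = 1.0 - abs(math.log10(config) + 2.0) / 4.0
    return ceiling * (1.0 - math.exp(-0.5 * epoch))

def race(configs, total_budget, bonus=1.0):
    """Spend `total_budget` epochs one at a time on the most promising configuration."""
    epochs = {c: 0 for c in configs}    # epochs spent per configuration
    scores = {c: 0.0 for c in configs}  # last observed score per configuration
    for _ in range(total_budget):
        # Stand-in for an acquisition function (NOT the one from the paper):
        # last observed score plus an exploration bonus for rarely trained configs.
        chosen = max(configs, key=lambda c: scores[c] + bonus / (1 + epochs[c]))
        epochs[chosen] += 1
        scores[chosen] = toy_learning_curve(chosen, epochs[chosen])
    best = max(configs, key=lambda c: scores[c])
    return best, epochs

# Usage: five candidate learning rates share a budget of 20 epochs.
best, epochs = race([1e-4, 1e-3, 1e-2, 1e-1, 1.0], total_budget=20)
print(best, epochs)
```

Unlike a fixed successive-halving schedule, nothing here is ever discarded: a configuration that was paused early can still be resumed later if its score plus bonus becomes the largest.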
Related papers
- Parameter Optimization with Conscious Allocation (POCA) [4.478575931884855]
Hyperband-based approaches to machine learning are among the most effective.
We present the new Parameter Optimization with Conscious Allocation (POCA), a hyperband-based algorithm that adaptively allocates the inputted budget to the hyperparameter configurations it generates.
POCA finds strong configurations faster in both settings.
arXiv Detail & Related papers (2023-12-29T00:13:55Z)
- PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks and show its gains under informative expert input and against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z)
- Deep Ranking Ensembles for Hyperparameter Optimization [9.453554184019108]
We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling.
In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
arXiv Detail & Related papers (2023-03-27T13:52:40Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3$\times$-30$\times$.
arXiv Detail & Related papers (2022-03-15T19:25:01Z)
- Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem.
Then, we use the average zeroth-order hyper-gradients to update hyperparameters.
arXiv Detail & Related papers (2021-02-17T21:03:05Z)
- Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate Scale Quantum devices.
We propose a strategy for such ansatze used in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT).
Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
arXiv Detail & Related papers (2020-10-01T18:14:11Z)
- Multi-level Training and Bayesian Optimization for Economical Hyperparameter Optimization [12.92634461859467]
In this paper, we develop an effective approach to reducing the total amount of required training time for Hyperparameter Optimization.
We propose a truncated additive Gaussian process model to calibrate approximate performance measurements generated by light training.
Based on the model, a sequential model-based algorithm is developed to generate the performance profile of the configuration space as well as find optimal ones.
arXiv Detail & Related papers (2020-07-20T09:03:02Z)
- Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z)