Deep Ranking Ensembles for Hyperparameter Optimization
- URL: http://arxiv.org/abs/2303.15212v2
- Date: Sun, 21 May 2023 13:31:47 GMT
- Title: Deep Ranking Ensembles for Hyperparameter Optimization
- Authors: Abdus Salam Khazi, Sebastian Pineda Arango, Josif Grabocka
- Abstract summary: We present a novel method that meta-learns neural network surrogates optimized for ranking the configurations' performances while modeling their uncertainty via ensembling.
In a large-scale experimental protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks, we demonstrate that our method achieves new state-of-the-art results in HPO.
- Score: 9.453554184019108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatically optimizing the hyperparameters of Machine Learning algorithms
is one of the primary open questions in AI. Existing work in Hyperparameter
Optimization (HPO) trains surrogate models for approximating the response
surface of hyperparameters as a regression task. In contrast, we hypothesize
that the optimal strategy for training surrogates is to preserve the ranks of
the performances of hyperparameter configurations as a Learning to Rank
problem. As a result, we present a novel method that meta-learns neural network
surrogates optimized for ranking the configurations' performances while
modeling their uncertainty via ensembling. In a large-scale experimental
protocol comprising 12 baselines, 16 HPO search spaces and 86 datasets/tasks,
we demonstrate that our method achieves new state-of-the-art results in HPO.
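The two ingredients the abstract names, a pairwise ranking loss on configuration performances and ensemble-based uncertainty, can be illustrated with a minimal sketch in plain Python. The function names and the logistic pairwise loss below are illustrative assumptions, not the paper's exact formulation:

```python
import math

def pairwise_ranking_loss(scores, targets):
    """Logistic pairwise loss: penalize pairs of configurations whose
    predicted scores disagree with the ordering of observed performances."""
    loss, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if targets[i] > targets[j]:  # config i outperforms config j
                loss += math.log(1.0 + math.exp(scores[j] - scores[i]))
                pairs += 1
    return loss / max(pairs, 1)

def ensemble_predict(members, x):
    """Mean/std over ensemble members: the mean serves as the rank score,
    the spread as the uncertainty estimate."""
    preds = [m(x) for m in members]
    mean = sum(preds) / len(preds)
    var = sum((p - mean) ** 2 for p in preds) / len(preds)
    return mean, math.sqrt(var)
```

A surrogate trained to minimize this loss only has to order configurations correctly, not match their performance values, which is exactly the regression-to-ranking shift the abstract argues for.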
Related papers
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient
Hyper-parameter Tuning [72.54359545547904]
We propose a gradient-based subset selection framework for hyper-parameter tuning.
We show that using gradient-based data subsets for hyper-parameter tuning achieves significantly faster turnaround times and speedups of 3×-30×.
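A gradient-matching selector of the kind this summary describes can be sketched as a greedy residual fit: repeatedly pick the per-example gradient that best aligns with what is still missing from the full-data gradient. The function name and the inner-product gain rule are illustrative assumptions, not the paper's actual algorithm:

```python
def greedy_subset(grads, full_grad, k):
    """Greedily pick k per-example gradients whose running sum best
    approximates the full-data gradient (a gradient-matching criterion)."""
    chosen, residual = [], list(full_grad)
    for _ in range(k):
        def gain(idx):
            # Inner product of a candidate gradient with the residual.
            return sum(r * g for r, g in zip(residual, grads[idx]))
        idx = max((i for i in range(len(grads)) if i not in chosen), key=gain)
        chosen.append(idx)
        residual = [r - g for r, g in zip(residual, grads[idx])]
    return chosen
```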
arXiv Detail & Related papers (2022-03-15T19:25:01Z) - Supervising the Multi-Fidelity Race of Hyperparameter Configurations [22.408069485293666]
We introduce DyHPO, a Bayesian Optimization method that learns to decide which hyperparameter configuration to train further in a race among all feasible configurations.
We demonstrate the significant superiority of DyHPO against state-of-the-art hyperparameter optimization methods through large-scale experiments.
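The racing idea, dynamically deciding which partially trained configuration deserves more budget, can be illustrated with a deliberately simple rule. DyHPO itself learns this decision with a deep surrogate over learning curves, so the linear extrapolation below is only a toy stand-in:

```python
def pick_next(curves, step=1):
    """Toy racing rule: linearly extrapolate each partial learning curve
    one step ahead and train the most promising configuration further."""
    best, best_val = None, float("-inf")
    for cfg, ys in curves.items():
        slope = ys[-1] - ys[-2] if len(ys) >= 2 else 0.0
        pred = ys[-1] + slope * step
        if pred > best_val:
            best, best_val = cfg, pred
    return best
```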
arXiv Detail & Related papers (2022-02-20T10:28:02Z) - Hyper-parameter optimization based on soft actor critic and hierarchical
mixture regularization [5.063728016437489]
We model the hyper-parameter optimization process as a Markov decision process and tackle it with reinforcement learning.
We propose a novel hyper-parameter optimization method based on soft actor critic and hierarchical mixture regularization.
arXiv Detail & Related papers (2021-12-08T02:34:43Z) - Towards Robust and Automatic Hyper-Parameter Tunning [39.04604349338802]
We introduce a new class of HPO method and explore how the low-rank factorization of intermediate layers of a convolutional network can be used to define an analytical response surface.
We quantify how this surface behaves as a surrogate to model performance and can be solved using a trust-region search algorithm, which we call autoHyper.
arXiv Detail & Related papers (2021-11-28T05:27:34Z) - Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm [97.66038345864095]
We propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG).
Specifically, we first formulate hyperparameter optimization as an A-based constrained optimization problem.
Then, we use the average zeroth-order hyper-gradients to update the hyperparameters.
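A zeroth-order hyper-gradient of the kind this summary mentions can be estimated without any analytic gradient, by averaging two-point finite-difference probes along random directions. This is a generic sketch of that estimator; the function name is hypothetical and HOZOG's actual averaging scheme may differ:

```python
import random

def zeroth_order_grad(f, lam, mu=1e-2, samples=8, rng=None):
    """Average zeroth-order gradient estimate of f at hyperparameters lam,
    using Gaussian random directions and a two-point difference."""
    rng = rng or random.Random(0)
    dim = len(lam)
    grad = [0.0] * dim
    for _ in range(samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        fp = f([l + mu * ui for l, ui in zip(lam, u)])
        fm = f([l - mu * ui for l, ui in zip(lam, u)])
        coeff = (fp - fm) / (2.0 * mu)  # directional derivative estimate
        for k in range(dim):
            grad[k] += coeff * u[k] / samples
    return grad
```

Each probe only requires two black-box evaluations of the validation objective, which is why such estimators suit hyperparameter settings where the training pipeline is not differentiable end to end.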
arXiv Detail & Related papers (2021-02-17T21:03:05Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z) - Few-Shot Bayesian Optimization with Deep Kernel Surrogates [7.208515071018781]
We formulate a few-shot learning problem in which a shared deep surrogate model is trained to quickly adapt to the response function of a new task.
We propose the use of a deep kernel network for a Gaussian process surrogate that is meta-learned in an end-to-end fashion.
As a result, the novel few-shot optimization of our deep kernel surrogate leads to new state-of-the-art results in HPO.
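A deep kernel surrogate of this kind combines a learned feature extractor with an ordinary Gaussian-process posterior on those features. The sketch below uses a fixed toy `feature_map` as a stand-in for the meta-learned network, and a small dense solver instead of a GP library; all names are illustrative:

```python
import math

def feature_map(x):
    """Toy stand-in for the meta-learned deep feature extractor:
    a fixed nonlinear embedding of a hyperparameter vector."""
    return [math.tanh(x[0]), math.tanh(sum(x))]

def rbf(a, b, ls=1.0):
    """RBF kernel on feature vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * d2 / ls ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(X, y, x_new, noise=1e-6):
    """GP posterior mean/variance with an RBF kernel on deep features."""
    Z = [feature_map(x) for x in X]
    z = feature_map(x_new)
    K = [[rbf(zi, zj) + (noise if i == j else 0.0)
          for j, zj in enumerate(Z)] for i, zi in enumerate(Z)]
    k = [rbf(zi, z) for zi in Z]
    alpha = solve(K, y)
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = rbf(z, z) - sum(ki * vi for ki, vi in zip(k, v))
    return mean, var
```

In the paper's setting the feature extractor is trained end to end across tasks, so only its weights change at meta-test time while the GP machinery stays the same.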
arXiv Detail & Related papers (2021-01-19T15:00:39Z) - An Asymptotically Optimal Multi-Armed Bandit Algorithm and
Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) for the evaluation of hyper-parameter search.
We also develop a novel hyper-parameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments of SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - HyperSTAR: Task-Aware Hyperparameters for Deep Networks [52.50861379908611]
HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations to achieve the best performance compared to existing methods.
arXiv Detail & Related papers (2020-05-21T08:56:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.