Cost-Efficient Online Hyperparameter Optimization
- URL: http://arxiv.org/abs/2101.06590v1
- Date: Sun, 17 Jan 2021 04:55:30 GMT
- Title: Cost-Efficient Online Hyperparameter Optimization
- Authors: Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, Raquel
Urtasun
- Abstract summary: We propose an online HPO algorithm that reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
- Score: 94.60924644778558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on hyperparameter optimization (HPO) has shown the possibility
of training certain hyperparameters together with regular parameters. However,
these online HPO algorithms still require running evaluation on a set of
validation examples at each training step, steeply increasing the training
cost. To decide when to query the validation loss, we model online HPO as a
time-varying Bayesian optimization problem, on top of which we propose a novel
"costly feedback" setting to capture the notion of query cost.
Under this setting, standard algorithms are cost-inefficient as they evaluate
on the validation set at every round. In contrast, the cost-efficient GP-UCB
algorithm proposed in this paper queries the unknown function only when the
model is less confident about current decisions. We evaluate our proposed
algorithm by tuning hyperparameters online for VGG and ResNet on CIFAR-10 and
ImageNet100. Our proposed online HPO algorithm reaches human expert-level
performance within a single run of the experiment, while incurring only modest
computational overhead compared to regular training.
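As a concrete, if simplified, illustration of this query-skipping idea, the sketch below runs a GP-UCB loop over a one-dimensional learning-rate space and pays for a validation evaluation only when the posterior standard deviation at the selected point exceeds a threshold. The toy validation loss, the threshold tau, the exploration weight beta, and the stationary (rather than time-varying) setup are assumptions of this sketch, not the paper's exact algorithm.
```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def validation_loss(lr):
    # stand-in for an expensive validation pass; toy objective assumed here
    return (np.log10(lr) + 2.0) ** 2 + 0.01 * np.random.randn()

grid = np.logspace(-4, 0, 200).reshape(-1, 1)      # candidate learning rates
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3)
X, y, n_queries = [], [], 0
tau, beta = 0.15, 2.0                              # query threshold, exploration weight

for t in range(50):
    if X:
        gp.fit(np.log10(np.array(X)), y)
        mu, sigma = gp.predict(np.log10(grid), return_std=True)
    else:
        mu, sigma = np.zeros(len(grid)), np.ones(len(grid))
    idx = int(np.argmin(mu - beta * sigma))        # GP-UCB rule (lower bound, minimizing)
    if sigma[idx] > tau:                           # query only when the model is uncertain
        X.append(grid[idx])
        y.append(validation_loss(grid[idx, 0]))
        n_queries += 1
    # otherwise: keep the current choice without paying for a validation pass

best = X[int(np.argmin(y))][0]
print(f"queried validation on {n_queries}/50 rounds; best lr ~ {best:.4f}")
```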
Related papers
- PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We demonstrate its robustness across a range of DL benchmarks, showing gains under informative expert input and resilience against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z)
- Online Learning and Optimization for Queues with Unknown Demand Curve
and Service Distribution [26.720986177499338]
We investigate an optimization problem in a queueing system where the service provider selects the optimal service fee p and service capacity mu.
We develop an online learning framework that automatically incorporates the parameter estimation errors in the solution prescription process.
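For intuition only, here is a toy learn-then-optimize loop in this spirit: it assumes a linear demand curve lambda(p) = a - b*p with Poisson arrivals and a simple proportional staffing rule, estimates the curve from observed arrival counts, and re-optimizes the fee and capacity each period. The model, estimator, and staffing rule are all illustrative assumptions, not the paper's formulation.
```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true, c = 10.0, 1.5, 1.0        # true demand curve lambda(p) = a - b*p; capacity cost

def observe_arrivals(p, horizon=200):
    lam = max(a_true - b_true * p, 0.0)   # unknown to the decision maker
    return rng.poisson(lam, size=horizon)

prices, rates = [], []
p, mu = 3.0, 6.0                          # initial fee and capacity
for period in range(20):
    counts = observe_arrivals(p)
    prices.append(p)
    rates.append(counts.mean())
    if len(prices) >= 2:                  # fit the demand curve from noisy observations
        A = np.vstack([np.ones(len(prices)), -np.array(prices)]).T
        (a_hat, b_hat), *_ = np.linalg.lstsq(A, np.array(rates), rcond=None)
        grid = np.linspace(0.5, 6.0, 112)
        lam_hat = np.maximum(a_hat - b_hat * grid, 1e-6)
        profit = grid * lam_hat - c * (1.2 * lam_hat)   # staff 20% above estimated load
        i = int(np.argmax(profit))
        p, mu = float(grid[i]), float(1.2 * lam_hat[i])
    p += rng.normal(0.0, 0.2)             # small price perturbation for identifiability

print(f"estimated demand lambda(p) ~ {a_hat:.2f} - {b_hat:.2f}*p; final (p, mu) = ({p:.2f}, {mu:.2f})")
```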
arXiv Detail & Related papers (2023-03-06T08:47:40Z)
- A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization [57.450449884166346]
We propose an adaptive HPO method to account for the privacy cost of HPO.
We obtain state-of-the-art performance on 22 benchmark tasks spanning computer vision and natural language processing, in both pretraining and finetuning regimes.
arXiv Detail & Related papers (2022-12-08T18:56:37Z)
- Enhancing Explainability of Hyperparameter Optimization via Bayesian
Algorithm Execution [13.037647287689438]
We study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots.
We propose a modified HPO method which efficiently searches for optimum global predictive performance.
Our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.
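A minimal sketch of the interpretability ingredient, a partial dependence curve for one hyperparameter computed from a surrogate fitted to (configuration, loss) pairs collected during HPO. The random-forest surrogate and the toy objective are assumptions of this sketch; the paper's method additionally biases HPO sampling so that such explanations become more reliable.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 2))   # e.g. scaled (learning rate, weight decay)
y = (X[:, 0] - 0.3) ** 2 + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

grid = np.linspace(0.0, 1.0, 25)
pdp = []
for v in grid:                              # marginalize over the other hyperparameter
    Xv = X.copy()
    Xv[:, 0] = v
    pdp.append(surrogate.predict(Xv).mean())

print("value of hyperparameter 0 with lowest partial dependence:",
      float(grid[int(np.argmin(pdp))]))
```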
arXiv Detail & Related papers (2022-06-11T07:12:04Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, and that its performance can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Optimal Parameter-free Online Learning with Switching Cost [47.415099037249085]
Parameter-freeness in online learning refers to the adaptivity of an algorithm with respect to the optimal decision in hindsight.
In this paper, we design such algorithms in the presence of switching cost - the latter penalizes the optimistic updates required by parameter-freeness.
We propose a simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate.
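To make the two competing terms concrete, the sketch below accumulates the hitting loss and the switching cost under plain projected online gradient descent with a fixed step schedule; it illustrates the objective only and does not implement the paper's parameter-free algorithm.
```python
import numpy as np

rng = np.random.default_rng(2)
T, d, lam, radius = 1000, 5, 1.0, 1.0
x = np.zeros(d)
hit_loss, switch_cost = 0.0, 0.0

for t in range(1, T + 1):
    g = rng.normal(size=d)                 # linear loss vector g_t revealed each round
    hit_loss += g @ x                      # hitting cost <g_t, x_t>
    x_new = x - g / np.sqrt(t)             # plain OGD step, eta_t = 1/sqrt(t)
    norm = np.linalg.norm(x_new)
    if norm > radius:                      # project back onto the unit ball
        x_new *= radius / norm
    switch_cost += lam * np.linalg.norm(x_new - x)   # movement penalty
    x = x_new

print(f"hitting loss {hit_loss:.1f} + switching cost {switch_cost:.1f} "
      f"= total {hit_loss + switch_cost:.1f} over {T} rounds")
```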
arXiv Detail & Related papers (2022-05-13T18:44:27Z)
- Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges [5.139260825952818]
This paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing.
It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
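As a reference point for the survey's catalogue, here is the simplest method it covers, random search, sketched over an assumed search space with a stand-in validation objective.
```python
import math
import random

random.seed(0)
space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),
    "batch_size": lambda: random.choice([32, 64, 128, 256]),
    "weight_decay": lambda: 10 ** random.uniform(-6, -2),
}

def validation_error(cfg):
    # stand-in for training a model and measuring validation error
    return (math.log10(cfg["lr"]) + 2.5) ** 2 + 100 * cfg["weight_decay"]

best_cfg, best_err = None, float("inf")
for _ in range(50):                        # fixed evaluation budget
    cfg = {name: sample() for name, sample in space.items()}
    err = validation_error(cfg)
    if err < best_err:
        best_cfg, best_err = cfg, err

print(best_cfg, round(best_err, 4))
```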
arXiv Detail & Related papers (2021-07-13T04:55:47Z)
- Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
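For flavor, the sketch below shows a simpler relative of such "tune while you train" schemes, hypergradient descent on the learning rate; the paper's RTRL-based procedure is distinct and more general, so treat this only as an illustration of online hyperparameter updates.
```python
import numpy as np

rng = np.random.default_rng(3)
w_true = rng.normal(size=10)

def grad(w):
    return w - w_true                      # gradient of 0.5 * ||w - w_true||^2

w = np.zeros(10)
lr, hyper_lr = 0.01, 1e-3
g_prev = np.zeros(10)
for step in range(500):
    g = grad(w)
    lr = max(lr + hyper_lr * (g @ g_prev), 1e-5)   # raise lr when successive gradients align
    w -= lr * g
    g_prev = g

print(f"adapted lr = {lr:.4f}, final loss = {0.5 * np.sum((w - w_true) ** 2):.6f}")
```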
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
- Frugal Optimization for Cost-related Hyperparameters [43.599155206275306]
We develop a new cost-frugal HPO solution for machine learning algorithms.
We prove a convergence rate of $O(\sqrt{d}/\sqrt{K})$ and an $O(d\epsilon^{-2})$-approximation guarantee on the total cost.
We provide strong empirical results in comparison with state-of-the-art HPO methods on large AutoML benchmarks.
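A sketch in the cost-frugal spirit using randomized direct search: each iteration probes one random direction and moves only on improvement, keeping per-step evaluation cost controlled. The step-size rule and update below are simplified assumptions rather than the paper's exact method behind the stated guarantees.
```python
import numpy as np

rng = np.random.default_rng(4)

def loss(x):                        # stand-in for a cost-related objective
    return np.sum((x - 0.7) ** 2)

d = 8
x, step = np.zeros(d), 0.5
fx = loss(x)
for k in range(200):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)          # random unit direction
    for cand in (x + step * u, x - step * u):
        fc = loss(cand)
        if fc < fx:                 # move only on improvement
            x, fx = cand, fc
            break
    else:
        step *= 0.9                 # no improvement either way: shrink the step

print(f"final loss {fx:.4f} with step {step:.3f}")
```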
arXiv Detail & Related papers (2020-05-04T15:40:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.