Cost-Efficient Online Hyperparameter Optimization
- URL: http://arxiv.org/abs/2101.06590v1
- Date: Sun, 17 Jan 2021 04:55:30 GMT
- Title: Cost-Efficient Online Hyperparameter Optimization
- Authors: Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, Raquel
Urtasun
- Abstract summary: We propose an online HPO algorithm that reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
- Score: 94.60924644778558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on hyperparameter optimization (HPO) has shown the possibility
of training certain hyperparameters together with regular parameters. However,
these online HPO algorithms still require running evaluation on a set of
validation examples at each training step, steeply increasing the training
cost. To decide when to query the validation loss, we model online HPO as a
time-varying Bayesian optimization problem, on top of which we propose a novel
"costly feedback" setting to capture the notion of query cost.
Under this setting, standard algorithms are cost-inefficient as they evaluate
on the validation set at every round. In contrast, the cost-efficient GP-UCB
algorithm proposed in this paper queries the unknown function only when the
model is less confident about current decisions. We evaluate our proposed
algorithm by tuning hyperparameters online for VGG and ResNet on CIFAR-10 and
ImageNet100. Our proposed online HPO algorithm reaches human expert-level
performance within a single run of the experiment, while incurring only modest
computational overhead compared to regular training.
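As a concrete, if simplified, illustration of this query-skipping idea, the sketch below runs a GP-UCB loop over a one-dimensional learning-rate space and pays for a validation evaluation only when the posterior standard deviation at the selected point exceeds a threshold. The toy validation loss, the threshold tau, the exploration weight beta, and the stationary (rather than time-varying) setup are assumptions of this sketch, not the paper's exact algorithm.
```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def validation_loss(lr):
    # stand-in for an expensive validation pass; toy objective assumed here
    return (np.log10(lr) + 2.0) ** 2 + 0.01 * np.random.randn()

grid = np.logspace(-4, 0, 200).reshape(-1, 1)      # candidate learning rates
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-3)
X, y, n_queries = [], [], 0
tau, beta = 0.15, 2.0                              # query threshold, exploration weight

for t in range(50):
    if X:
        gp.fit(np.log10(np.array(X)), y)
        mu, sigma = gp.predict(np.log10(grid), return_std=True)
    else:
        mu, sigma = np.zeros(len(grid)), np.ones(len(grid))
    idx = int(np.argmin(mu - beta * sigma))        # GP-UCB rule (lower bound, minimizing)
    if sigma[idx] > tau:                           # query only when the model is uncertain
        X.append(grid[idx])
        y.append(validation_loss(grid[idx, 0]))
        n_queries += 1
    # otherwise: keep the current choice without paying for a validation pass

best = X[int(np.argmin(y))][0]
print(f"queried validation on {n_queries}/50 rounds; best lr ~ {best:.4f}")
```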
Related papers
- PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We demonstrate its robustness across a range of DL benchmarks, showing gains under informative expert input and resilience against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z)
- Online Learning and Optimization for Queues with Unknown Demand Curve
and Service Distribution [26.720986177499338]
We investigate an optimization problem in a queueing system where the service provider selects the optimal service fee p and service capacity mu.
We develop an online learning framework that automatically incorporates the parameter estimation errors in the solution prescription process.
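For intuition only, here is a toy learn-then-optimize loop in this spirit: it assumes a linear demand curve lambda(p) = a - b*p with Poisson arrivals and a simple proportional staffing rule, estimates the curve from observed arrival counts, and re-optimizes the fee and capacity each period. The model, estimator, and staffing rule are all illustrative assumptions, not the paper's formulation.
```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true, c = 10.0, 1.5, 1.0        # true demand curve lambda(p) = a - b*p; capacity cost

def observe_arrivals(p, horizon=200):
    lam = max(a_true - b_true * p, 0.0)   # unknown to the decision maker
    return rng.poisson(lam, size=horizon)

prices, rates = [], []
p, mu = 3.0, 6.0                          # initial fee and capacity
for period in range(20):
    counts = observe_arrivals(p)
    prices.append(p)
    rates.append(counts.mean())
    if len(prices) >= 2:                  # fit the demand curve from noisy observations
        A = np.vstack([np.ones(len(prices)), -np.array(prices)]).T
        (a_hat, b_hat), *_ = np.linalg.lstsq(A, np.array(rates), rcond=None)
        grid = np.linspace(0.5, 6.0, 112)
        lam_hat = np.maximum(a_hat - b_hat * grid, 1e-6)
        profit = grid * lam_hat - c * (1.2 * lam_hat)   # staff 20% above estimated load
        i = int(np.argmax(profit))
        p, mu = float(grid[i]), float(1.2 * lam_hat[i])
    p += rng.normal(0.0, 0.2)             # small price perturbation for identifiability

print(f"estimated demand lambda(p) ~ {a_hat:.2f} - {b_hat:.2f}*p; final (p, mu) = ({p:.2f}, {mu:.2f})")
```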
arXiv Detail & Related papers (2023-03-06T08:47:40Z)
- A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization [57.450449884166346]
We propose an adaptive HPO method to account for the privacy cost of HPO.
We obtain state-of-the-art performance on 22 benchmark tasks spanning computer vision and natural language processing, in both pretraining and finetuning regimes.
arXiv Detail & Related papers (2022-12-08T18:56:37Z)
- Enhancing Explainability of Hyperparameter Optimization via Bayesian
Algorithm Execution [13.037647287689438]
We study the combination of HPO with interpretable machine learning (IML) methods such as partial dependence plots.
We propose a modified HPO method which efficiently searches for optimum global predictive performance.
Our method returns more reliable explanations of the underlying black-box without a loss of optimization performance.
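A minimal sketch of the interpretability ingredient, a partial dependence curve for one hyperparameter computed from a surrogate fitted to (configuration, loss) pairs collected during HPO. The random-forest surrogate and the toy objective are assumptions of this sketch; the paper's method additionally biases HPO sampling so that such explanations become more reliable.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 2))   # e.g. scaled (learning rate, weight decay)
y = (X[:, 0] - 0.3) ** 2 + 0.5 * X[:, 1] + 0.05 * rng.normal(size=200)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

grid = np.linspace(0.0, 1.0, 25)
pdp = []
for v in grid:                              # marginalize over the other hyperparameter
    Xv = X.copy()
    Xv[:, 0] = v
    pdp.append(surrogate.predict(Xv).mean())

print("value of hyperparameter 0 with lowest partial dependence:",
      float(grid[int(np.argmin(pdp))]))
```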
arXiv Detail & Related papers (2022-06-11T07:12:04Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, and that its performance can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Optimal Parameter-free Online Learning with Switching Cost [47.415099037249085]
Parameter-freeness in online learning refers to the adaptivity of an algorithm with respect to the optimal decision in hindsight.
In this paper, we design such algorithms in the presence of switching cost - the latter penalizes the optimistic updates required by parameter-freeness.
We propose a simple yet powerful algorithm for Online Linear Optimization (OLO) with switching cost, which improves the existing suboptimal regret bound [ZCP22a] to the optimal rate.
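To make the two competing terms concrete, the sketch below accumulates the hitting loss and the switching cost under plain projected online gradient descent with a fixed step schedule; it illustrates the objective only and does not implement the paper's parameter-free algorithm.
```python
import numpy as np

rng = np.random.default_rng(2)
T, d, lam, radius = 1000, 5, 1.0, 1.0
x = np.zeros(d)
hit_loss, switch_cost = 0.0, 0.0

for t in range(1, T + 1):
    g = rng.normal(size=d)                 # linear loss vector g_t revealed each round
    hit_loss += g @ x                      # hitting cost <g_t, x_t>
    x_new = x - g / np.sqrt(t)             # plain OGD step, eta_t = 1/sqrt(t)
    norm = np.linalg.norm(x_new)
    if norm > radius:                      # project back onto the unit ball
        x_new *= radius / norm
    switch_cost += lam * np.linalg.norm(x_new - x)   # movement penalty
    x = x_new

print(f"hitting loss {hit_loss:.1f} + switching cost {switch_cost:.1f} "
      f"= total {hit_loss + switch_cost:.1f} over {T} rounds")
```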
arXiv Detail & Related papers (2022-05-13T18:44:27Z)
- Hyperparameter Optimization: Foundations, Algorithms, Best Practices and
Open Challenges [5.139260825952818]
This paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing.
It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
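As a reference point for the survey's catalogue, here is the simplest method it covers, random search, sketched over an assumed search space with a stand-in validation objective.
```python
import math
import random

random.seed(0)
space = {
    "lr": lambda: 10 ** random.uniform(-4, -1),
    "batch_size": lambda: random.choice([32, 64, 128, 256]),
    "weight_decay": lambda: 10 ** random.uniform(-6, -2),
}

def validation_error(cfg):
    # stand-in for training a model and measuring validation error
    return (math.log10(cfg["lr"]) + 2.5) ** 2 + 100 * cfg["weight_decay"]

best_cfg, best_err = None, float("inf")
for _ in range(50):                        # fixed evaluation budget
    cfg = {name: sample() for name, sample in space.items()}
    err = validation_error(cfg)
    if err < best_err:
        best_cfg, best_err = cfg, err

print(best_cfg, round(best_err, 4))
```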
arXiv Detail & Related papers (2021-07-13T04:55:47Z)
- Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
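For flavor, the sketch below shows a simpler relative of such "tune while you train" schemes, hypergradient descent on the learning rate; the paper's RTRL-based procedure is distinct and more general, so treat this only as an illustration of online hyperparameter updates.
```python
import numpy as np

rng = np.random.default_rng(3)
w_true = rng.normal(size=10)

def grad(w):
    return w - w_true                      # gradient of 0.5 * ||w - w_true||^2

w = np.zeros(10)
lr, hyper_lr = 0.01, 1e-3
g_prev = np.zeros(10)
for step in range(500):
    g = grad(w)
    lr = max(lr + hyper_lr * (g @ g_prev), 1e-5)   # raise lr when successive gradients align
    w -= lr * g
    g_prev = g

print(f"adapted lr = {lr:.4f}, final loss = {0.5 * np.sum((w - w_true) ** 2):.6f}")
```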
arXiv Detail & Related papers (2021-02-15T19:36:18Z)
- Frugal Optimization for Cost-related Hyperparameters [43.599155206275306]
We develop a new cost-frugal HPO solution for machine learning algorithms.
We prove a convergence rate of $O(\sqrt{d}/\sqrt{K})$ and an $O(d\epsilon^{-2})$-approximation guarantee on the total cost.
We provide strong empirical results in comparison with state-of-the-art HPO methods on large AutoML benchmarks.
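A sketch in the cost-frugal spirit using randomized direct search: each iteration probes one random direction and moves only on improvement, keeping per-step evaluation cost controlled. The step-size rule and update below are simplified assumptions rather than the paper's exact method behind the stated guarantees.
```python
import numpy as np

rng = np.random.default_rng(4)

def loss(x):                        # stand-in for a cost-related objective
    return np.sum((x - 0.7) ** 2)

d = 8
x, step = np.zeros(d), 0.5
fx = loss(x)
for k in range(200):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)          # random unit direction
    for cand in (x + step * u, x - step * u):
        fc = loss(cand)
        if fc < fx:                 # move only on improvement
            x, fx = cand, fc
            break
    else:
        step *= 0.9                 # no improvement either way: shrink the step

print(f"final loss {fx:.4f} with step {step:.3f}")
```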
arXiv Detail & Related papers (2020-05-04T15:40:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.