A Lipschitz Bandits Approach for Continuous Hyperparameter Optimization
- URL: http://arxiv.org/abs/2302.01539v3
- Date: Thu, 8 Jun 2023 15:05:18 GMT
- Title: A Lipschitz Bandits Approach for Continuous Hyperparameter Optimization
- Authors: Yasong Feng, Weijian Luo, Yimin Huang, Tianyu Wang
- Abstract summary: BLiE is a Lipschitz-bandit-based HPO algorithm that only assumes Lipschitz continuity of the objective function.
Empirically, we demonstrate that BLiE outperforms the state-of-the-art HPO algorithms on benchmark tasks.
- Score: 6.572589601317779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most critical problems in machine learning is HyperParameter
Optimization (HPO), since choice of hyperparameters has a significant impact on
final model performance. Although there are many HPO algorithms, they either
have no theoretical guarantees or require strong assumptions. To this end, we
introduce BLiE -- a Lipschitz-bandit-based algorithm for HPO that only assumes
Lipschitz continuity of the objective function. BLiE exploits the landscape of
the objective function to adaptively search over the hyperparameter space.
Theoretically, we show that $(i)$ BLiE finds an $\epsilon$-optimal
hyperparameter with $\mathcal{O} \left( \epsilon^{-(d_z + \beta)}\right)$ total
budget, where $d_z$ and $\beta$ are problem-intrinsic quantities; $(ii)$ BLiE is highly
parallelizable. Empirically, we demonstrate that BLiE outperforms the
state-of-the-art HPO algorithms on benchmark tasks. We also apply BLiE to
search for the noise schedule of diffusion models. Compared with the default
schedule, the schedule found by BLiE greatly improves sampling speed.
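
The abstract describes BLiE only at a high level: Lipschitz-aware, adaptive partitioning of the hyperparameter space, with each round issuing a batch of evaluations that can run in parallel. The sketch below is a rough illustration of that kind of batched, elimination-style search, not the authors' actual BLiE procedure; the toy objective, the cell-splitting rule, and the assumed Lipschitz constant are placeholders for the example.

```python
import numpy as np

def batched_lipschitz_search(objective, dim=2, n_rounds=6, evals_per_cell=3, lipschitz=1.0):
    """Toy batched elimination over an adaptively refined partition of [0, 1]^dim.

    Each round evaluates every surviving cell at its center (these evaluations are
    independent, hence parallelizable), discards cells whose value cannot be within
    the Lipschitz slack of the best value seen so far, and splits the survivors
    into 2**dim half-sized children.
    """
    cells = [(np.zeros(dim), 1.0)]   # (lower corner, side length); start with the unit cube
    best_x, best_y = None, np.inf    # we minimize, e.g. a validation loss
    for _ in range(n_rounds):
        survivors = []
        for lower, size in cells:
            center = lower + size / 2.0
            # Averaging repeated evaluations matters when the objective is noisy
            # (e.g. a stochastic training run); for a deterministic one, a single call suffices.
            y = float(np.mean([objective(center) for _ in range(evals_per_cell)]))
            if y < best_y:
                best_x, best_y = center, y
            survivors.append((lower, size, y))
        cells = []
        for lower, size, y in survivors:
            slack = lipschitz * size * np.sqrt(dim)   # max change of a Lipschitz function inside the cell
            if y - slack <= best_y:                   # the cell may still contain a near-optimal point
                half = size / 2.0
                for corner in np.ndindex(*(2,) * dim):
                    cells.append((lower + half * np.array(corner), half))
    return best_x, best_y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Noisy toy "validation loss" over two hyperparameters in [0, 1]^2.
    toy_loss = lambda x: (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2 + 0.01 * rng.normal()
    print(batched_lipschitz_search(toy_loss))
```

Because every evaluation within a round is independent, the per-round batch is exactly the part of such a scheme that parallelizes well.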
Related papers
- Cost-Sensitive Freeze-thaw Bayesian Optimization for Efficient Hyperparameter Tuning [51.6191275658441]
We introduce a "utility" function into the freeze-thaw framework, describing the trade-off between cost and performance.
We validate our algorithm on established multi-fidelity HPO benchmarks and show that it outperforms all previous freeze-thaw BO and transfer-BO baselines.
arXiv Detail & Related papers (2025-10-24T12:03:57Z) - Near-Optimal Parameter Tuning of Level-1 QAOA for Ising Models [3.390330377512402]
We show how to reduce the two-dimensional $(\gamma, \beta)$ search to a one-dimensional search over $\gamma$, with $\beta^*$ computed analytically.
This approach is validated using Recursive QAOA (RQAOA), where it consistently outperforms both coarsely optimised RQAOA and semidefinite programs.
arXiv Detail & Related papers (2025-01-27T19:00:00Z) - PriorBand: Practical Hyperparameter Optimization in the Age of Deep
Learning [49.92394599459274]
We propose PriorBand, an HPO algorithm tailored to Deep Learning (DL) pipelines.
We show its robustness across a range of DL benchmarks, its gains under informative expert input, and its robustness against poor expert beliefs.
arXiv Detail & Related papers (2023-06-21T16:26:14Z) - Iterative Deepening Hyperband [8.257520009686239]
We show that incremental variants of Hyperband satisfy theoretical guarantees qualitatively similar to those for the original Hyperband with the "right" budget.
We demonstrate their practical utility in experiments with benchmark data sets.
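
For context, Hyperband's core subroutine is successive halving: evaluate many configurations on a small budget, keep the best fraction, and repeat with a larger budget. The incremental variants studied in this entry change how budgets are deepened; the sketch below is only the textbook successive-halving loop, with placeholder `sample_config` and `train_and_score` functions.

```python
import random

def successive_halving(sample_config, train_and_score, n_configs=27, min_budget=1, eta=3):
    """Minimal successive halving: keep the best 1/eta of the configurations
    each round and multiply the per-configuration budget by eta.

    `sample_config()` draws a random hyperparameter setting and
    `train_and_score(config, budget)` returns a score to maximize
    (e.g. validation accuracy after `budget` epochs); both are placeholders.
    """
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        scored = [(train_and_score(c, budget), c) for c in configs]
        scored.sort(key=lambda t: t[0], reverse=True)
        configs = [c for _, c in scored[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy usage with a synthetic objective.
best = successive_halving(
    sample_config=lambda: {"lr": 10 ** random.uniform(-4, -1)},
    train_and_score=lambda cfg, b: -abs(cfg["lr"] - 0.01) + 0.001 * b,
)
print(best)
```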
arXiv Detail & Related papers (2023-02-01T15:33:51Z) - A Comparative study of Hyper-Parameter Optimization Tools [2.6097538974670935]
We compare the performance of four Python libraries, namely Optuna, Hyperopt, Optunity, and Sequential Model-based Algorithm Configuration (SMAC).
We found that Optuna performs better on the CASH problem and the NeurIPS black-box optimization challenge.
arXiv Detail & Related papers (2022-01-17T14:49:36Z) - Reducing the Variance of Gaussian Process Hyperparameter Optimization
with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional, previously unexplored benefit:
it can simultaneously reduce variance at essentially negligible cost.
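
This entry concerns the statistical benefit of preconditioning inside GP hyperparameter training; the paper's estimator is not reproduced here. As a minimal sketch of the underlying mechanism only (preconditioned conjugate gradients on a kernel system), with a toy kernel and a simple Jacobi preconditioner as stand-ins:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

# Toy GP-style system: RBF kernel on random 1-D inputs plus heteroscedastic noise,
# so the diagonal (and hence a Jacobi preconditioner) carries real information.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=300)
noise = 10.0 ** rng.uniform(-3, 1, size=300)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + np.diag(noise)
y = rng.normal(size=300)

# Jacobi (diagonal) preconditioner: approximate K^{-1} by 1 / diag(K).
# GP packages typically use stronger preconditioners (e.g. partial pivoted Cholesky),
# but the call pattern into the iterative solver is the same.
inv_diag = 1.0 / np.diag(K)
M = LinearOperator(K.shape, matvec=lambda v: inv_diag * v)

counts = {"plain": 0, "preconditioned": 0}
def counter(key):
    def cb(_):
        counts[key] += 1
    return cb

x_plain, _ = cg(K, y, callback=counter("plain"))
x_prec, _ = cg(K, y, M=M, callback=counter("preconditioned"))
print(counts)  # with an effective preconditioner the second count is lower
```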
arXiv Detail & Related papers (2021-07-01T06:43:11Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - How Bayesian Should Bayesian Optimisation Be? [0.024790788944106048]
We investigate whether a fully-Bayesian treatment of the Gaussian process hyperparameters in BO (FBBO) leads to improved optimisation performance.
We compare FBBO using three approximate inference schemes to the maximum likelihood approach, using the Expected Improvement (EI) and Upper Confidence Bound (UCB) acquisition functions.
We find that FBBO using EI with an ARD kernel leads to the best performance in the noise-free setting, with much less difference between combinations of BO components when the noise is increased.
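
This entry compares acquisition functions under different treatments of the GP hyperparameters; the fully Bayesian machinery itself is not sketched here. For reference, the two acquisition functions it names, in their textbook form for minimizing a validation loss (the candidate posterior values below are made up):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y, xi=0.01):
    """Textbook Expected Improvement for minimization.

    mu, sigma : GP posterior mean and standard deviation at candidate points
    best_y    : best (lowest) objective value observed so far
    xi        : small exploration offset
    """
    sigma = np.maximum(sigma, 1e-12)   # guard against zero posterior variance
    gain = best_y - mu - xi
    z = gain / sigma
    return gain * norm.cdf(z) + sigma * norm.pdf(z)

def lower_confidence_bound(mu, sigma, beta=2.0):
    """UCB-style acquisition for minimization: smaller is more promising."""
    return mu - beta * sigma

# Toy usage: posterior over five candidate configurations; pick the next one to evaluate.
mu = np.array([0.30, 0.25, 0.40, 0.28, 0.35])
sigma = np.array([0.02, 0.10, 0.05, 0.01, 0.20])
print(np.argmax(expected_improvement(mu, sigma, best_y=0.27)))   # EI: maximize
print(np.argmin(lower_confidence_bound(mu, sigma)))              # LCB: minimize
```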
arXiv Detail & Related papers (2021-05-03T14:28:11Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-Assisted Single-Timescale Stochastic Approximation (MSTSA) algorithm -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in its iterations caused by inaccurate solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Cost-Efficient Online Hyperparameter Optimization [94.60924644778558]
We propose an online HPO algorithm that reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
arXiv Detail & Related papers (2021-01-17T04:55:30Z) - Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian
Optimization and Tuning Rules [0.6875312133832078]
We build a new algorithm for evaluating and analyzing the results of the network on the training and validation sets.
We use a set of tuning rules to add new hyper-parameters and/or to reduce the hyper-parameter search space to select a better combination.
arXiv Detail & Related papers (2020-06-03T08:53:48Z) - HyperSTAR: Task-Aware Hyperparameters for Deep Networks [52.50861379908611]
HyperSTAR is a task-aware method to warm-start HPO for deep neural networks.
It learns a dataset (task) representation along with the performance predictor directly from raw images.
It evaluates 50% fewer configurations to achieve the best performance compared to existing methods.
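
As a very rough illustration of the warm-starting idea in this entry (a performance predictor over pairs of task representation and hyperparameter configuration, used to rank configurations for a new task), here is a toy sketch; the random embeddings, placeholder accuracies, and ridge predictor are stand-ins, not HyperSTAR's actual deep pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_configs, d_task, d_cfg = 20, 50, 8, 4

task_emb = rng.normal(size=(n_tasks, d_task))    # one embedding per previously seen task
configs = rng.uniform(size=(n_configs, d_cfg))   # shared pool of candidate hyperparameter settings

# Meta-training data: (task embedding ++ config) -> observed accuracy (placeholder values).
X = np.array([np.concatenate([task_emb[t], configs[c]])
              for t in range(n_tasks) for c in range(n_configs)])
acc = rng.uniform(size=len(X))

# Closed-form ridge regression as the performance predictor.
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ acc)

# Warm start on a new task: rank candidate configurations by predicted accuracy
# and send only the top few to full evaluation.
new_task_emb = rng.normal(size=d_task)
pred = np.array([np.concatenate([new_task_emb, c]) for c in configs]) @ w
print(np.argsort(pred)[::-1][:5])
```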
arXiv Detail & Related papers (2020-05-21T08:56:50Z) - Frugal Optimization for Cost-related Hyperparameters [43.599155206275306]
We develop a new cost-frugal HPO solution for machine learning algorithms.
We prove a convergence rate of $O(\frac{\sqrt{d}}{\sqrt{K}})$ and an $O(d\epsilon^{-2})$-approximation guarantee on the total cost.
We provide strong empirical results in comparison with state-of-the-art HPO methods on large AutoML benchmarks.
arXiv Detail & Related papers (2020-05-04T15:40:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.