MONGOOSE: Path-wise Smooth Bayesian Optimisation via Meta-learning
- URL: http://arxiv.org/abs/2302.11533v2
- Date: Wed, 3 Jul 2024 00:28:00 GMT
- Title: MONGOOSE: Path-wise Smooth Bayesian Optimisation via Meta-learning
- Authors: Adam X. Yang, Laurence Aitchison, Henry B. Moss
- Abstract summary: A primary contributor to the cost of evaluating black-box objective functions is often the effort required to prepare the system for measurement.
We consider a common scenario where preparation costs grow as the distance between successive evaluations increases.
Our algorithm, MONGOOSE, uses a meta-learnt parametric policy to generate smooth optimisation trajectories.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Bayesian optimisation, we often seek to minimise the black-box objective functions that arise in real-world physical systems. A primary contributor to the cost of evaluating such black-box objective functions is often the effort required to prepare the system for measurement. We consider a common scenario where preparation costs grow as the distance between successive evaluations increases. In this setting, smooth optimisation trajectories are preferred and the jumpy paths produced by the standard myopic (i.e. one-step-optimal) Bayesian optimisation methods are sub-optimal. Our algorithm, MONGOOSE, uses a meta-learnt parametric policy to generate smooth optimisation trajectories, achieving performance gains over existing methods when optimising functions with large movement costs.
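To make the movement-cost setting above concrete, here is a minimal, hedged Python sketch. It is not the MONGOOSE meta-learnt policy: as a stand-in, it penalises a standard expected-improvement acquisition by a preparation cost that grows linearly with the distance from the previous query, which already biases the search towards smoother evaluation paths. The candidate set, placeholder posterior values and the penalty weight `lam` are illustrative assumptions.
```python
# Illustrative sketch of the movement-cost setting from the abstract (NOT the
# MONGOOSE algorithm): each new query x_t incurs a preparation cost that grows
# with the distance ||x_t - x_{t-1}|| from the previous evaluation.
import numpy as np
from scipy.stats import norm


def expected_improvement(mu, sigma, best_y):
    """Standard expected improvement for minimisation, given a posterior mean/std."""
    sigma = np.maximum(sigma, 1e-12)
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)


def movement_penalised_choice(candidates, mu, sigma, best_y, x_prev, lam=1.0):
    """Pick the candidate maximising EI minus a distance-proportional preparation cost."""
    ei = expected_improvement(mu, sigma, best_y)
    move_cost = lam * np.linalg.norm(candidates - x_prev, axis=1)
    return candidates[np.argmax(ei - move_cost)]


# Toy usage with a made-up posterior over random candidates in [0, 1]^2.
rng = np.random.default_rng(0)
candidates = rng.uniform(0.0, 1.0, size=(256, 2))
mu = rng.normal(size=256)            # placeholder posterior mean
sigma = rng.uniform(0.1, 1.0, 256)   # placeholder posterior standard deviation
x_next = movement_penalised_choice(candidates, mu, sigma, best_y=0.0,
                                   x_prev=np.array([0.5, 0.5]), lam=2.0)
print("next query:", x_next)
```
Larger values of `lam` trade expected improvement for shorter moves; MONGOOSE itself instead amortises this trade-off into a meta-learnt parametric policy that generates smooth optimisation trajectories.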
Related papers
- Understanding Optimization in Deep Learning with Central Flows [53.66160508990508]
We show that an optimizer's (e.g. RMSProp's) implicit behavior can be explicitly captured by a "central flow": a differential equation.
We show that these flows can empirically predict long-term optimization trajectories of generic neural networks.
arXiv Detail & Related papers (2024-10-31T17:58:13Z) - Cost-aware Bayesian Optimization via the Pandora's Box Gittins Index [57.045952766988925]
We develop a previously-unexplored connection between cost-aware Bayesian optimization and the Pandora's Box problem, a decision problem from economics.
Our work constitutes a first step towards integrating techniques from Gittins index theory into Bayesian optimization.
arXiv Detail & Related papers (2024-06-28T17:20:13Z) - Deterministic Langevin Unconstrained Optimization with Normalizing Flows [3.988614978933934]
We introduce a global, gradient-free surrogate optimization strategy for black-box functions inspired by the Fokker-Planck and Langevin equations.
We demonstrate superior or competitive progress toward objective optima on standard synthetic test functions.
arXiv Detail & Related papers (2023-10-01T17:46:20Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Batch Bayesian Optimization via Particle Gradient Flows [0.5735035463793008]
We show how to find global optima of objective functions which are only available as a black box or are expensive to evaluate.
We construct a new acquisition function based on multipoint expected improvement which is convex over the space of probability measures.
arXiv Detail & Related papers (2022-09-10T18:10:15Z) - Optimistic Optimization of Gaussian Process Samples [30.226274682578172]
A competing, computationally more efficient, global optimization framework is optimistic optimization, which exploits prior knowledge about the geometry of the search space in the form of a dissimilarity function.
We argue that there is a new research domain between geometric and probabilistic search, i.e. methods that run drastically faster than traditional Bayesian optimization, while retaining some of the crucial functionality of Bayesian optimization.
arXiv Detail & Related papers (2022-09-02T09:06:24Z) - Sparse Bayesian Optimization [16.867375370457438]
We present several regularization-based approaches that allow us to discover sparse and more interpretable configurations.
We propose a novel differentiable relaxation based on homotopy continuation that makes it possible to target sparsity.
We show that we are able to efficiently optimize for sparsity.
arXiv Detail & Related papers (2022-03-03T18:25:33Z) - Are we Forgetting about Compositional Optimisers in Bayesian
Optimisation? [66.39551991177542]
This paper presents a sample-efficient methodology for global optimisation.
Within this framework, a crucial performance-determining subroutine is maximising the acquisition function.
We highlight the empirical advantages of the compositional approach to acquisition function maximisation across 3958 individual experiments.
arXiv Detail & Related papers (2020-12-15T12:18:38Z) - Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering [53.523517926927894]
We explore the use of exact per-sample Hessian-vector products and gradients to construct self-tuning quadratics.
We prove that our model-based procedure converges in the noisy gradient setting.
This is an interesting step toward constructing self-tuning quadratics.
arXiv Detail & Related papers (2020-11-09T22:07:30Z) - BOSH: Bayesian Optimization by Sampling Hierarchically [10.10241176664951]
We propose a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations.
We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning and hyperparameter tuning tasks.
arXiv Detail & Related papers (2020-07-02T07:35:49Z) - Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum of the function.
One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability regions of the function optimum and shrinks around the low-probability regions (see the sketch after this list).
arXiv Detail & Related papers (2020-03-27T06:18:49Z)
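The space-warping entry above admits a simple illustration: optimise in warped coordinates and map them back to the original space through the inverse CDF of a prior over the optimum's location, so that regions the prior considers likely occupy a larger share of the warped space. The sketch below is a hedged, one-dimensional toy with an assumed Beta prior, not the paper's exact construction.
```python
# Hedged 1-D sketch of space warping via a prior over the optimum's location.
# High-prior regions are expanded in the warped coordinate u; low-prior regions shrink.
import numpy as np
from scipy.stats import beta

prior = beta(a=5.0, b=2.0)  # assumed prior: optimum believed to lie near x ~ 0.8


def warp_to_original(u):
    """Map warped coordinates u in (0, 1) back to the original search space."""
    return prior.ppf(u)


# A uniform grid in warped space lands densely where the prior places most mass.
u_grid = np.linspace(0.01, 0.99, 9)
print(np.round(warp_to_original(u_grid), 3))
```
Maximising an acquisition function in the warped coordinate then effectively allocates more of the search effort near the prior's mode.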