Scalable Bayesian Optimization Using Vecchia Approximations of Gaussian Processes
- URL: http://arxiv.org/abs/2203.01459v1
- Date: Wed, 2 Mar 2022 23:55:14 GMT
- Title: Scalable Bayesian Optimization Using Vecchia Approximations of Gaussian Processes
- Authors: Felix Jimenez, Matthias Katzfuss
- Abstract summary: We adapt the Vecchia approximation, a popular GP approximation from spatial statistics, to enable scalable high-dimensional Bayesian optimization.
We focus on the use of our warped Vecchia GP in trust-region Bayesian optimization via Thompson sampling.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization is a technique for optimizing black-box target
functions. At the core of Bayesian optimization is a surrogate model that
predicts the output of the target function at previously unseen inputs to
facilitate the selection of promising input values. Gaussian processes (GPs)
are commonly used as surrogate models but are known to scale poorly with the
number of observations. We adapt the Vecchia approximation, a popular GP
approximation from spatial statistics, to enable scalable high-dimensional
Bayesian optimization. We develop several improvements and extensions,
including training warped GPs using mini-batch gradient descent, approximate
neighbor search, and selecting multiple input values in parallel. We focus on
the use of our warped Vecchia GP in trust-region Bayesian optimization via
Thompson sampling. On several test functions and on two reinforcement-learning
problems, our methods compared favorably to the state of the art.
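The Vecchia idea can be sketched numerically: after ordering the inputs, the joint density is approximated by a product of univariate conditionals, each conditioned on only a small set of nearest previously ordered neighbors, reducing the exact GP's O(n^3) likelihood cost to roughly O(n m^3) for neighbor-set size m. The kernel choice, ordering, and function names below are illustrative, not the authors' implementation (which additionally uses input warping, mini-batch training, and approximate neighbor search).

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def vecchia_loglik(X, y, m=10, noise=1e-4):
    """Vecchia-approximate GP log-likelihood: a sum of univariate
    conditionals, each conditioned on at most m previously ordered
    nearest neighbors (exact neighbor search here for simplicity)."""
    n = len(y)
    ll = 0.0
    for i in range(n):
        if i == 0:
            mean = 0.0
            var = rbf_kernel(X[:1], X[:1])[0, 0] + noise
        else:
            prev = X[:i]
            d = ((prev - X[i]) ** 2).sum(-1)
            nbr = np.argsort(d)[:m]  # nearest previously ordered points
            K = rbf_kernel(prev[nbr], prev[nbr]) + noise * np.eye(len(nbr))
            k = rbf_kernel(X[i:i + 1], prev[nbr])[0]
            w = np.linalg.solve(K, k)
            mean = w @ y[nbr]
            var = rbf_kernel(X[i:i + 1], X[i:i + 1])[0, 0] + noise - w @ k
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return ll

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.sin(3 * X[:, 0]) + 0.01 * rng.standard_normal(200)
print(vecchia_loglik(X, y, m=15))
```

Each of the n conditionals involves only an m-by-m solve, which is what makes the approximation attractive for the large evaluation budgets that arise in scalable Bayesian optimization.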
Related papers
- Optimizing Posterior Samples for Bayesian Optimization via Rootfinding [2.94944680995069]
We introduce an efficient global optimization strategy for posterior samples based on global rootfinding.
We demonstrate remarkable improvement in both inner- and outer-loop optimization.
We also propose a sample-average formulation of GP-TS, which has a parameter to explicitly control exploitation.
arXiv Detail & Related papers (2024-10-29T17:57:16Z)
- Gaussian Process Thompson Sampling via Rootfinding [2.94944680995069]
Thompson sampling (TS) is a simple, effective policy in Bayesian decision making.
In continuous optimization, the posterior of the objective function is often a Gaussian process (GP), whose sample paths have numerous local optima.
We introduce an efficient global optimization strategy for GP-TS that carefully selects starting points for gradient-based multi-starts.
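As a toy illustration of GP Thompson sampling (a dense grid stands in for the paper's carefully initialized gradient-based multi-starts; all names and the 1-D setup are our assumptions):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel for 1-D input vectors a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def thompson_next_point(X, y, grid, noise=1e-4, rng=None):
    """Draw one sample path from the GP posterior on a grid and
    return its minimizer as the next query point (minimization)."""
    if rng is None:
        rng = np.random.default_rng()
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(grid, X)
    mean = Ks @ np.linalg.solve(K, y)
    cov = rbf(grid, grid) - Ks @ np.linalg.solve(K, Ks.T)
    # jitter for numerical stability of the sampled path
    sample = rng.multivariate_normal(
        mean, cov + 1e-6 * np.eye(len(grid)), check_valid="ignore")
    return grid[np.argmin(sample)]

X = np.array([0.1, 0.5, 0.9])
y = np.array([1.0, -0.5, 0.3])
grid = np.linspace(0.0, 1.0, 200)
print(thompson_next_point(X, y, grid, rng=np.random.default_rng(1)))
```

The paper's contribution addresses exactly the weakness this sketch sidesteps: GP sample paths have many local optima, so naive multi-start or coarse grids can miss the path's global optimum in higher dimensions.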
arXiv Detail & Related papers (2024-10-10T16:06:45Z)
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
- Enhancing Gaussian Process Surrogates for Optimization and Posterior Approximation via Random Exploration [2.984929040246293]
The paper proposes novel noise-free Bayesian optimization strategies that rely on a random exploration step to enhance the accuracy of Gaussian process surrogate models.
The new algorithms retain the ease of implementation of the classical GP-UCB, and the additional exploration step facilitates their convergence.
arXiv Detail & Related papers (2024-01-30T14:16:06Z)
- Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z)
- Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)
- Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum of the function.
One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around the low-probability region.
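A minimal sketch of one way such a warp can be realized (our illustration via the prior's quantile function, not necessarily the paper's exact construction): optimizing in a unit-interval u-space and mapping through the inverse CDF of the prior on the optimum's location makes high-prior-density regions occupy more of the warped space.

```python
import numpy as np
from scipy import stats

# Assumed prior on the optimum's location in a 1-D search space [0, 1].
prior = stats.norm(loc=0.7, scale=0.1)

def warp(u):
    """Map u in (0, 1) to the original space via the prior's quantile
    function, so uniformly spaced u-points concentrate where the prior
    density is high."""
    return prior.ppf(u)

u_grid = np.linspace(0.01, 0.99, 9)
x_grid = warp(u_grid)
# the mapped points cluster near the prior mode at 0.7
print(np.round(x_grid, 3))
```

Under this transformation a search strategy that covers u-space evenly automatically spends more of its budget near the prior's high-probability region, which is the warping effect the paper exploits.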
arXiv Detail & Related papers (2020-03-27T06:18:49Z)
- Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations [27.43948386608]
Variational inference techniques based on inducing variables provide an elegant framework for scalable estimation in Gaussian process (GP) models.
In this work we challenge the common wisdom that optimizing the inducing inputs in variational framework yields optimal performance.
arXiv Detail & Related papers (2020-03-06T08:53:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.