Approximation-Aware Bayesian Optimization
- URL: http://arxiv.org/abs/2406.04308v1
- Date: Thu, 6 Jun 2024 17:55:02 GMT
- Title: Approximation-Aware Bayesian Optimization
- Authors: Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner
- Abstract summary: High-dimensional Bayesian optimization (BO) tasks often require 10,000 function evaluations before obtaining meaningful results.
We modify sparse variational Gaussian processes (SVGPs) to better align with the goals of BO.
Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem.
- Score: 34.56666383247348
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we modify SVGPs to better align with the goals of BO: targeting informed data acquisition rather than global posterior fidelity. Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem, thereby ensuring optimal decisions under a limited computational budget. Our approach can be used with any decision-theoretic acquisition function and is compatible with trust region methods like TuRBO. We derive efficient joint objectives for the expected improvement and knowledge gradient acquisition functions in both the standard and batch BO settings. Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.
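As a point of reference for the acquisition functions named in the abstract, the following is a minimal sketch of the standard closed-form expected improvement (EI) for a single candidate point. It assumes a maximization problem and a Gaussian predictive distribution with a given mean and standard deviation. It is not the paper's joint objective, and the function and argument names are hypothetical, chosen only for illustration.

```python
import math


def expected_improvement(mean, std, best):
    """Closed-form EI for maximization under a Gaussian predictive N(mean, std^2).

    mean, std: predictive mean and standard deviation at the candidate point
               (illustrative inputs; in BO these would come from the GP model).
    best:      best objective value observed so far (the incumbent).
    """
    if std <= 0.0:
        # Degenerate case: no predictive uncertainty, so the improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best) * cdf + std * pdf
```

In this sketch, a candidate whose mean is below the incumbent can still have positive EI when its predictive uncertainty is large, which is why the quality of the approximate posterior affects which points get acquired.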
Related papers
- Decoding-Time Language Model Alignment with Multiple Objectives [88.64776769490732]
Existing methods primarily focus on optimizing LMs for a single reward function, limiting their adaptability to varied objectives.
Here, we propose multi-objective decoding (MOD), a decoding-time algorithm that outputs the next token from a linear combination of predictions.
We show why existing approaches can be sub-optimal even in natural settings and obtain optimality guarantees for our method.
arXiv Detail & Related papers (2024-06-27T02:46:30Z)
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback [103.36048042664768]
Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values.
We propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO).
Our method is versatile and can be readily adapted to various preference optimization frameworks.
arXiv Detail & Related papers (2024-06-04T20:33:22Z)
- PG-LBO: Enhancing High-Dimensional Bayesian Optimization with Pseudo-Label and Gaussian Process Guidance [31.585328335396607]
Current mainstream methods overlook the potential of utilizing a pool of unlabeled data to construct the latent space.
We propose a novel method to effectively utilize unlabeled data with the guidance of labeled data.
Our proposed method outperforms existing VAE-BO algorithms in various optimization scenarios.
arXiv Detail & Related papers (2023-12-28T11:57:58Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
arXiv Detail & Related papers (2022-07-07T04:42:54Z)
- High Dimensional Bayesian Optimization with Kernel Principal Component Analysis [4.33419118449588]
The kernel PCA-assisted BO (KPCA-BO) algorithm embeds a non-linear sub-manifold in the search space and performs BO on this sub-manifold.
We compare the performance of KPCA-BO to the vanilla BO and PCA-BO on the multi-modal problems of the COCO/BBOB benchmark suite.
arXiv Detail & Related papers (2022-04-28T20:09:02Z)
- Automatic prior selection for meta Bayesian optimization with a case study on tuning deep neural network optimizers [47.013395100497775]
We propose a principled approach to solve such expensive hyperparameter tuning problems efficiently.
Key to the performance of BO is specifying and refining a distribution over functions, which is used to reason about the optima of the underlying function being optimized.
We verify our approach in realistic model training setups by training tens of thousands of configurations of near-state-of-the-art models on popular image and text datasets.
arXiv Detail & Related papers (2021-09-16T20:46:26Z)
- Multi-Fidelity Bayesian Optimization via Deep Neural Networks [19.699020509495437]
In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between the cost and accuracy.
We propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities.
We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
arXiv Detail & Related papers (2020-07-06T23:28:40Z)
- Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation for Gaussian processes trained on few data points.
The approach also leads to a significantly smaller and computationally cheaper subsolver for lower bounding.
In total, the proposed method reduces the time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)