Generalizing Bayesian Optimization with Decision-theoretic Entropies
- URL: http://arxiv.org/abs/2210.01383v1
- Date: Tue, 4 Oct 2022 04:43:58 GMT
- Title: Generalizing Bayesian Optimization with Decision-theoretic Entropies
- Authors: Willie Neiswanger, Lantao Yu, Shengjia Zhao, Chenlin Meng, Stefano
Ermon
- Abstract summary: We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
- Score: 102.82152945324381
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimization (BO) is a popular method for efficiently inferring
optima of an expensive black-box function via a sequence of queries. Existing
information-theoretic BO procedures aim to make queries that most reduce the
uncertainty about optima, where the uncertainty is captured by Shannon entropy.
However, an optimal measure of uncertainty would, ideally, factor in how we
intend to use the inferred quantity in some downstream procedure. In this
paper, we instead consider a generalization of Shannon entropy from work in
statistical decision theory (DeGroot 1962, Rao 1984), which contains a broad
class of uncertainty measures parameterized by a problem-specific loss function
corresponding to a downstream task. We first show that special cases of this
entropy lead to popular acquisition functions used in BO procedures such as
knowledge gradient, expected improvement, and entropy search. We then show how
alternative choices for the loss yield a flexible family of acquisition
functions that can be customized for use in novel optimization settings.
Additionally, we develop gradient-based methods to efficiently optimize our
proposed family of acquisition functions, and demonstrate strong empirical
performance on a diverse set of sequential decision making tasks, including
variants of top-$k$ optimization, multi-level set estimation, and sequence
search.
Related papers
- Sample-efficient Bayesian Optimisation Using Known Invariances [56.34916328814857]
We show that vanilla and constrained BO algorithms are inefficient when optimising invariant objectives.
We derive a bound on the maximum information gain of these invariant kernels.
We use our method to design a current drive system for a nuclear fusion reactor, finding a high-performance solution.
arXiv Detail & Related papers (2024-10-22T12:51:46Z) - BO4IO: A Bayesian optimization approach to inverse optimization with uncertainty quantification [5.031974232392534]
This work addresses data-driven inverse optimization (IO)
The goal is to estimate unknown parameters in an optimization model from observed decisions that can be assumed to be optimal or near-optimal.
arXiv Detail & Related papers (2024-05-28T06:52:17Z) - End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - Principled Preferential Bayesian Optimization [22.269732173306192]
We study the problem of preferential Bayesian optimization (BO)
We aim to optimize a black-box function with only preference feedback over a pair of candidate solutions.
An optimistic algorithm with an efficient computational method is then developed to solve the problem.
arXiv Detail & Related papers (2024-02-08T02:57:47Z) - Poisson Process for Bayesian Optimization [126.51200593377739]
We propose a ranking-based surrogate model based on the Poisson process and introduce an efficient BO framework, namely Poisson Process Bayesian Optimization (PoPBO)
Compared to the classic GP-BO method, our PoPBO has lower costs and better robustness to noise, which is verified by abundant experiments.
arXiv Detail & Related papers (2024-02-05T02:54:50Z) - Bayesian Optimization with Conformal Prediction Sets [44.565812181545645]
Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models.
We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity.
In many cases we find that query coverage can be significantly improved without harming sample-efficiency.
arXiv Detail & Related papers (2022-10-22T17:01:05Z) - Trusted-Maximizers Entropy Search for Efficient Bayesian Optimization [39.824086260578646]
This paper presents a novel trusted-maximizers entropy search (TES) acquisition function.
It measures how much an input contributes to the information gain on a query over a finite set of trusted maximizers.
arXiv Detail & Related papers (2021-07-30T07:25:07Z) - Bayesian Algorithm Execution: Estimating Computable Properties of
Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z) - Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In big search spaces the algorithm goes through several low function value regions before reaching the optimum of the function.
One approach to subside this cold start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space in such a way that space gets expanded around the high probability region of function optimum and shrinks around low probability region of optimum.
arXiv Detail & Related papers (2020-03-27T06:18:49Z) - Composition of kernel and acquisition functions for High Dimensional
Bayesian Optimization [0.1749935196721634]
We use the addition-ality of the objective function into mapping both the kernel and the acquisition function of the Bayesian Optimization.
This ap-proach makes more efficient the learning/updating of the probabilistic surrogate model.
Results are presented for real-life application, that is the control of pumps in urban water distribution systems.
arXiv Detail & Related papers (2020-03-09T15:45:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.