Bayesian Optimization of Function Networks
- URL: http://arxiv.org/abs/2112.15311v1
- Date: Fri, 31 Dec 2021 05:35:21 GMT
- Title: Bayesian Optimization of Function Networks
- Authors: Raul Astudillo, Peter I. Frazier
- Abstract summary: We consider Bayesian optimization of the output of a network of functions, where each function takes as input the output of its parent nodes, and where the network takes significant time to evaluate.
Our approach delivers greater query efficiency by leveraging information that standard Bayesian optimization ignores: intermediate output within the network.
We show that our approach dramatically outperforms standard Bayesian optimization methods in several synthetic and real-world problems.
- Score: 20.73717187683924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider Bayesian optimization of the output of a network of functions,
where each function takes as input the output of its parent nodes, and where
the network takes significant time to evaluate. Such problems arise, for
example, in reinforcement learning, engineering design, and manufacturing.
While the standard Bayesian optimization approach observes only the final
output, our approach delivers greater query efficiency by leveraging
information that the former ignores: intermediate output within the network.
This is achieved by modeling the nodes of the network using Gaussian processes
and choosing the points to evaluate using, as our acquisition function, the
expected improvement computed with respect to the implied posterior on the
objective. Although the non-Gaussian nature of this posterior prevents
computing our acquisition function in closed form, we show that it can be
efficiently maximized via sample average approximation. In addition, we prove
that our method is asymptotically consistent, meaning that it finds a globally
optimal solution as the number of evaluations grows to infinity, thus
generalizing previously known convergence results for the expected improvement.
Notably, this holds even though our method might not evaluate the domain
densely, instead leveraging problem structure to leave regions unexplored.
Finally, we show that our approach dramatically outperforms standard Bayesian
optimization methods in several synthetic and real-world problems.
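The abstract gives enough detail to sketch the core computation. The snippet below is a minimal illustration, not the authors' implementation: it models a two-node chain g(h(x)) with independent Gaussian processes, estimates the expected improvement of the network output by Monte Carlo, and reuses fixed base samples so the acquisition can be maximized by sample average approximation. The toy network, scikit-learn surrogates, and sample counts are assumptions made for the example.
```python
# Minimal sketch of expected improvement on a two-node function network g(h(x)):
# each node gets its own GP, and EI of the final output is estimated by Monte Carlo
# with fixed base samples, so the acquisition can be maximized by sample average
# approximation (SAA). Illustrative only; not the authors' implementation.
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Toy network: hidden node h(x) = sin(3x), final node g(h) = -(h - 0.5)^2 (maximized).
h_true = lambda x: np.sin(3 * x)
g_true = lambda h: -(h - 0.5) ** 2

X = rng.uniform(-1, 1, size=(6, 1))       # initial design
H = h_true(X)                             # intermediate outputs are observed ...
Y = g_true(H)                             # ... alongside the final output

gp_h = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6).fit(X, H.ravel())
gp_g = GaussianProcessRegressor(kernel=RBF(0.5), alpha=1e-6).fit(H, Y.ravel())

best = Y.max()
Z1, Z2 = rng.standard_normal(64), rng.standard_normal(64)   # fixed base samples (SAA)

def ei_network(x):
    """Monte Carlo estimate of E[max(g(h(x)) - best, 0)] under the composite posterior."""
    x = np.atleast_2d(x)
    mu_h, sd_h = gp_h.predict(x, return_std=True)
    h_samples = mu_h + sd_h * Z1                              # posterior samples of node h
    mu_g, sd_g = gp_g.predict(h_samples.reshape(-1, 1), return_std=True)
    y_samples = mu_g + sd_g * Z2                              # implied samples of the output
    return np.maximum(y_samples - best, 0.0).mean()

# Maximize the acquisition (minimize its negative) from a few random restarts.
starts = rng.uniform(-1, 1, size=(5, 1))
results = [minimize(lambda x: -ei_network(x), x0=s, bounds=[(-1, 1)]) for s in starts]
x_next = min(results, key=lambda r: r.fun).x
print("next point to evaluate:", x_next)
```
Because the base samples Z1 and Z2 are held fixed across calls, the Monte Carlo acquisition is a deterministic function of x, which is what makes the sample average approximation well behaved under a standard gradient-based optimizer.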
Related papers
- Sample-efficient Bayesian Optimisation Using Known Invariances [56.34916328814857]
We show that vanilla and constrained BO algorithms are inefficient when optimising invariant objectives.
We incorporate the known invariances into the surrogate's kernel and derive a bound on the maximum information gain of these invariant kernels.
We use our method to design a current drive system for a nuclear fusion reactor, finding a high-performance solution.
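The summary does not spell out how the invariances enter the surrogate; a standard construction, sketched below purely as a generic illustration (not the paper's code), averages a base kernel over the finite symmetry group so that the resulting Gaussian process respects the invariance.
```python
# Generic group-averaged (invariant) kernel: k_inv(x, z) = (1/|G|) * sum_{g in G} k(x, g z),
# shown for the sign-flip group G = {I, -I}, i.e. objectives with f(x) = f(-x).
import numpy as np

def rbf(X, Z, lengthscale=0.5):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def invariant_kernel(X, Z, group=(1.0, -1.0)):
    # Averaging over the group keeps the kernel positive semi-definite because the
    # base RBF kernel is itself invariant under simultaneous sign flips of both inputs.
    return sum(rbf(X, g * Z) for g in group) / len(group)

X = np.random.default_rng(0).uniform(-1, 1, size=(5, 2))
K = invariant_kernel(X, X)
print(K.shape)   # (5, 5) Gram matrix usable in any GP-based BO loop
```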
arXiv Detail & Related papers (2024-10-22T12:51:46Z) - A Continuous Relaxation for Discrete Bayesian Optimization [17.312618575552]
We show that inference and optimization can be computationally tractable.
We focus in particular on settings with very few observations and strict evaluation budgets.
We show that the resulting acquisition function can be optimized with either continuous or discrete optimization algorithms.
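As a rough illustration of the relaxation idea (the paper's specific construction may differ), one can fit the surrogate on numerically coded discrete points, optimize a standard acquisition such as expected improvement over the continuous box, and project the maximizer back onto the discrete set:
```python
# Discrete BO via a continuous relaxation: optimize EI over the box, then round.
# Illustrative sketch only; domain, kernel, and rounding rule are assumptions.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)
levels = np.arange(10)                          # discrete domain {0, ..., 9}
f = lambda x: -(x - 6.3) ** 2                   # toy objective to maximize

X = rng.choice(levels, size=4).reshape(-1, 1).astype(float)
Y = f(X).ravel()
gp = GaussianProcessRegressor(alpha=1e-6).fit(X, Y)

def neg_ei(x):
    mu, sd = gp.predict(np.atleast_2d(x), return_std=True)
    z = (mu - Y.max()) / (sd + 1e-12)
    return -((mu - Y.max()) * norm.cdf(z) + sd * norm.pdf(z))[0]

res = minimize(neg_ei, x0=[4.5], bounds=[(0, 9)])      # optimize over the relaxation
x_next = int(np.clip(np.round(res.x[0]), 0, 9))        # project back onto the discrete set
print("next discrete candidate:", x_next)
```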
arXiv Detail & Related papers (2024-04-26T14:47:40Z) - Bayesian Optimization of Function Networks with Partial Evaluations [16.447868102676885]
We propose a novel knowledge gradient acquisition function that chooses which node and corresponding inputs to evaluate in a cost-aware manner.
Our acquisition function is the first to enable cost-aware optimization of a broad class of function networks.
arXiv Detail & Related papers (2023-11-03T17:55:08Z) - Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z) - Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z) - Batch Bayesian Optimization via Particle Gradient Flows [0.5735035463793008]
We show how to find global optima of objective functions which are only available as a black-box or are expensive to evaluate.
We construct a new acquisition function based on the multipoint expected improvement which is convex over the space of probability measures.
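The multipoint expected improvement referred to above can be estimated generically by Monte Carlo over joint posterior samples; the sketch below shows only that building block, not the paper's reformulation over probability measures or its particle gradient flow.
```python
# Monte Carlo estimate of the multipoint (batch) expected improvement
# q-EI(X) = E[(max_i f(x_i) - best)_+] under the joint GP posterior. Generic sketch.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(4)
X_obs = rng.uniform(-1, 1, size=(8, 1))
y_obs = np.sin(3 * X_obs).ravel()
gp = GaussianProcessRegressor(alpha=1e-6).fit(X_obs, y_obs)

def q_ei(batch, n_samples=256):
    samples = gp.sample_y(batch, n_samples=n_samples, random_state=0)   # shape (q, n_samples)
    return np.maximum(samples.max(axis=0) - y_obs.max(), 0.0).mean()

batch = np.array([[0.1], [0.4], [0.9]])     # a candidate batch of q = 3 points
print("q-EI of the batch:", q_ei(batch))
```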
arXiv Detail & Related papers (2022-09-10T18:10:15Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Global Optimization of Objective Functions Represented by ReLU Networks [77.55969359556032]
Neural networks can learn complex, non-convex functions, and it is challenging to guarantee their correct behavior in safety-critical contexts.
Many approaches exist to find failures in networks (e.g., adversarial examples), but these cannot guarantee the absence of failures.
We propose an approach that integrates the optimization process into the verification procedure, achieving better performance than the naive approach.
arXiv Detail & Related papers (2020-10-07T08:19:48Z) - Sequential Subspace Search for Functional Bayesian Optimization Incorporating Experimenter Intuition [63.011641517977644]
Our algorithm generates a sequence of finite-dimensional random subspaces of function space, each spanned by a set of draws from the experimenter's Gaussian process.
Standard Bayesian optimisation is applied on each subspace, and the best solution found is used as a starting point (origin) for the next subspace.
We test our algorithm in simulated and real-world experiments, namely blind function matching, finding the optimal precipitation-strengthening function for an aluminium alloy, and learning rate schedule optimisation for deep networks.
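A rough sketch of one outer iteration under illustrative assumptions (functions discretized on a grid, a toy matching objective, and plain random search standing in for the per-subspace Bayesian optimisation):
```python
# One outer iteration: span a low-dimensional subspace of function space with GP prior
# draws, then search over their coefficients; the best function becomes the next origin.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)
grid = np.linspace(0, 1, 200).reshape(-1, 1)        # functions represented on a grid
target = np.sin(6 * grid).ravel()                   # toy "blind function matching" target
loss = lambda f: np.mean((f - target) ** 2)         # stands in for an expensive objective

gp = GaussianProcessRegressor(kernel=RBF(0.2))
basis = gp.sample_y(grid, n_samples=5, random_state=0)   # 5 prior draws span the subspace
origin = np.zeros_like(target)                           # best function found so far

best_f, best_loss = origin, loss(origin)
for _ in range(200):                                     # crude stand-in for inner BO
    coeffs = rng.normal(scale=0.5, size=5)
    candidate = origin + basis @ coeffs
    if loss(candidate) < best_loss:
        best_f, best_loss = candidate, loss(candidate)
print("best loss in this subspace:", best_loss)
# best_f would serve as the origin of the next random subspace in the outer loop.
```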
arXiv Detail & Related papers (2020-09-08T06:54:11Z) - Composition of kernel and acquisition functions for High Dimensional Bayesian Optimization [0.1749935196721634]
We exploit the additivity of the objective function in designing both the kernel and the acquisition function of the Bayesian Optimization.
This approach makes the learning/updating of the probabilistic surrogate model more efficient.
Results are presented for a real-life application, namely the control of pumps in urban water distribution systems.
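A minimal sketch of how additivity can be reflected in the surrogate (an illustrative assumption, not the paper's exact construction): the kernel is a sum of kernels, each sensitive only to its own group of input dimensions, so the model, and if desired the acquisition, decompose over the groups.
```python
# Additive GP surrogate for a 4-D objective assumed to split as f(x) = f1(x[:2]) + f2(x[2:]).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Each RBF component uses huge length scales on the dimensions it should ignore.
k1 = RBF(length_scale=[0.5, 0.5, 1e5, 1e5])
k2 = RBF(length_scale=[1e5, 1e5, 0.5, 0.5])
additive_kernel = k1 + k2

rng = np.random.default_rng(3)
X = rng.uniform(size=(20, 4))
y = np.sin(3 * X[:, 0]) * X[:, 1] + (X[:, 2] - X[:, 3]) ** 2     # additive toy objective

# optimizer=None keeps the illustrative length scales (and hence the group structure) fixed.
gp = GaussianProcessRegressor(kernel=additive_kernel, alpha=1e-6, optimizer=None).fit(X, y)
mu, sd = gp.predict(rng.uniform(size=(5, 4)), return_std=True)
print(mu.shape, sd.shape)    # posterior that respects the additive decomposition
```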
arXiv Detail & Related papers (2020-03-09T15:45:57Z) - Uncertainty Quantification for Bayesian Optimization [12.433600693422235]
We propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, which proceeds by constructing confidence regions of the maximum point (or value) of the objective function.
Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
arXiv Detail & Related papers (2020-02-04T22:48:07Z)
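The paper develops a frequentist construction; purely as a simple stand-in for the general idea of quantifying uncertainty about the optimum, the sketch below draws Gaussian-process posterior paths on a grid and reports empirical quantiles of their maxima and maximizers.
```python
# Posterior-sampling uncertainty about the maximum point and value of a GP surrogate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(5)
X_obs = rng.uniform(0, 1, size=(10, 1))
y_obs = np.sin(7 * X_obs).ravel()
gp = GaussianProcessRegressor(alpha=1e-6).fit(X_obs, y_obs)

grid = np.linspace(0, 1, 300).reshape(-1, 1)
paths = gp.sample_y(grid, n_samples=500, random_state=0)     # (300, 500) posterior draws
max_vals = paths.max(axis=0)
max_locs = grid[paths.argmax(axis=0), 0]

print("95% interval for the max value:   ", np.quantile(max_vals, [0.025, 0.975]))
print("95% interval for the max location:", np.quantile(max_locs, [0.025, 0.975]))
```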
This list is automatically generated from the titles and abstracts of the papers on this site.