Incorporating Expert Prior in Bayesian Optimisation via Space Warping
- URL: http://arxiv.org/abs/2003.12250v1
- Date: Fri, 27 Mar 2020 06:18:49 GMT
- Title: Incorporating Expert Prior in Bayesian Optimisation via Space Warping
- Authors: Anil Ramachandran, Sunil Gupta, Santu Rana, Cheng Li, Svetha Venkatesh
- Abstract summary: In large search spaces the algorithm passes through several low-function-value regions before reaching the optimum of the function.
One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around low-probability regions.
- Score: 54.412024556499254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimisation is a well-known sample-efficient method for the optimisation of expensive black-box functions. However, when dealing with large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum of the function. Since function evaluations are expensive in terms of both money and time, it is desirable to alleviate this problem. One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation. In its standard form, Bayesian optimisation assumes that every point in the search space is equally likely to be the optimum. Therefore, any prior knowledge about the location of the optimum can improve optimisation performance. In this paper, we represent the prior knowledge about the function optimum through a prior distribution. The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around low-probability regions. We incorporate this prior directly into the function model (a Gaussian process) by redefining the kernel matrix, which allows the method to work with any acquisition function, i.e. it is acquisition-agnostic. We show the superiority of our method over standard Bayesian optimisation through the optimisation of several benchmark functions and the hyperparameter tuning of two algorithms: Support Vector Machine (SVM) and Random Forest.
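
As a rough illustration of the warping idea described above (a minimal sketch under simplifying assumptions, not the authors' exact formulation): an expert prior over the optimum's location can be turned into a monotone warp of the inputs, for instance via its CDF, and the GP kernel is then evaluated on the warped inputs. Because the prior enters only through the kernel matrix, any acquisition function can be used on top unchanged. The 1-D domain, Beta prior, and lengthscale below are illustrative choices.

```python
# A minimal, hypothetical sketch of prior-based space warping for a GP surrogate.
# Assumptions (not from the paper): a 1-D search space [0, 1], a Beta(5, 2) expert
# prior over the optimum's location, and an RBF kernel evaluated on CDF-warped
# inputs so that high prior-probability regions are expanded.
import numpy as np
from scipy.stats import beta

prior = beta(5, 2)  # illustrative expert prior over where the optimum lies

def warp(x):
    """Warp inputs through the prior CDF; high-density regions get stretched."""
    return prior.cdf(x)

def rbf_kernel(a, b, lengthscale=0.1):
    """Standard RBF kernel matrix between 1-D input arrays a and b."""
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def warped_kernel(a, b, lengthscale=0.1):
    """Kernel redefined on warped inputs: k_w(x, x') = k(warp(x), warp(x'))."""
    return rbf_kernel(warp(a), warp(b), lengthscale)

def gp_posterior(X_train, y_train, X_test, noise=1e-6, lengthscale=0.1):
    """GP posterior mean/std using the warped kernel; any acquisition function
    (EI, UCB, Thompson sampling) can be applied to this surrogate unchanged."""
    K = warped_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    K_s = warped_kernel(X_train, X_test, lengthscale)
    K_ss = warped_kernel(X_test, X_test, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mu = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mu, np.sqrt(np.maximum(var, 0.0))
```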
Related papers
- A Continuous Relaxation for Discrete Bayesian Optimization [17.312618575552]
We show that inference and optimization can be computationally tractable.
We focus in particular on optimization settings where very few observations are available and budgets are strict.
We show that the resulting acquisition function can be optimized with both continuous or discrete optimization algorithms.
arXiv Detail & Related papers (2024-04-26T14:47:40Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- Batch Bayesian Optimization via Particle Gradient Flows [0.5735035463793008]
We show how to find global optima of objective functions which are only available as a black-box or are expensive to evaluate.
We construct a new function, based on multipoint expected probability, which is defined over the space of probability measures.
arXiv Detail & Related papers (2022-09-10T18:10:15Z)
- Optimistic Optimization of Gaussian Process Samples [30.226274682578172]
A competing, computationally more efficient, global optimization framework is optimistic optimization, which exploits prior knowledge about the geometry of the search space in the form of a dissimilarity function.
We argue that there is a new research domain between geometric and probabilistic search, i.e. methods that run drastically faster than traditional Bayesian optimization, while retaining some of the crucial functionality of Bayesian optimization.
arXiv Detail & Related papers (2022-09-02T09:06:24Z)
- Bayesian Optimization for Min Max Optimization [77.60508571062958]
We propose algorithms that perform Min Max Optimization in a setting where the function that should be optimized is not known a priori.
We extend the two acquisition functions Expected Improvement and Gaussian Process Upper Confidence Bound.
We show that these acquisition functions allow for better solutions - converging faster to the optimum than the benchmark settings.
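
For reference, a small sketch of the two acquisition functions named above in their standard single-objective forms (the paper's min-max variants extend these). Here `mu` and `sigma` are the GP posterior mean and standard deviation at candidate points, `y_best` is the best value observed so far, and the `xi` and `beta` defaults are illustrative.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best, xi=0.01):
    """Standard EI for maximisation: expected amount by which a candidate
    exceeds the incumbent y_best, with exploration margin xi."""
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def gp_ucb(mu, sigma, beta=2.0):
    """Standard GP Upper Confidence Bound: optimism in the face of uncertainty,
    with beta trading off exploitation (mu) against exploration (sigma)."""
    return mu + np.sqrt(beta) * sigma
```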
arXiv Detail & Related papers (2021-07-29T06:49:34Z)
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
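
A rough sketch of an InfoBAX-style acquisition under simplifying assumptions (the grid, the RBF lengthscale, the fixed set of points a hypothetical downstream algorithm queries, and the Gaussian-entropy approximation are illustrative choices, not the paper's exact estimator): the next query is the point whose observation carries the most mutual information with the algorithm's output, estimated by comparing the marginal predictive entropy with the average entropy after conditioning on sampled execution paths.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

grid = np.linspace(0, 1, 100).reshape(-1, 1)   # candidate query locations (illustrative)
algo_inputs = grid[::10]                       # points a hypothetical algorithm A would query

def gaussian_entropy(std):
    """Differential entropy of a Gaussian with the given standard deviation."""
    return 0.5 * np.log(2 * np.pi * np.e * std ** 2)

def infobax_acquisition(X, y, n_paths=20, noise=1e-3):
    """Estimate MI(y_x; algorithm output | data) for every candidate x on the grid."""
    gp = GaussianProcessRegressor(kernel=RBF(0.1), alpha=noise, optimizer=None).fit(X, y)
    _, std = gp.predict(grid, return_std=True)
    h_marginal = gaussian_entropy(np.sqrt(std ** 2 + noise))        # H(y_x | D)
    # Sample execution paths: the function values the algorithm would observe.
    path_values = gp.sample_y(algo_inputs, n_samples=n_paths, random_state=0)
    h_conditional = np.zeros(len(grid))
    for j in range(n_paths):
        # Condition the GP on the j-th sampled execution path as extra data.
        Xj = np.vstack([X, algo_inputs])
        yj = np.concatenate([y, path_values[:, j]])
        gp_j = GaussianProcessRegressor(kernel=RBF(0.1), alpha=noise, optimizer=None).fit(Xj, yj)
        _, std_j = gp_j.predict(grid, return_std=True)
        h_conditional += gaussian_entropy(np.sqrt(std_j ** 2 + noise)) / n_paths
    return h_marginal - h_conditional   # large values = informative queries
```

The next query would then be the grid point with the largest acquisition value, e.g. `grid[np.argmax(infobax_acquisition(X, y))]`.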
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- Combinatorial Bayesian Optimization with Random Mapping Functions to Convex Polytopes [43.19936635161588]
We present a method for Bayesian optimization that can operate well in a large combinatorial space.
Our algorithm shows satisfactory performance compared to existing methods.
arXiv Detail & Related papers (2020-11-26T02:22:41Z)
- Sequential Subspace Search for Functional Bayesian Optimization Incorporating Experimenter Intuition [63.011641517977644]
Our algorithm generates a sequence of finite-dimensional random subspaces of functional space spanned by a set of draws from the experimenter's Gaussian Process.
Standard Bayesian optimisation is applied on each subspace, and the best solution found is used as a starting point (origin) for the next subspace.
We test our algorithm in simulated and real-world experiments, namely blind function matching, finding the optimal precipitation-strengthening function for an aluminium alloy, and learning rate schedule optimisation for deep networks.
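
A rough sketch of the subspace loop described above, with plain random search standing in for the inner Bayesian-optimisation step and blind function matching (one of the paper's experiments) as a toy objective; the grid resolution, number of GP draws, and lengthscale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 200)                 # discretised input domain of the functions

def gp_draws(n_draws, lengthscale=0.1):
    """Draw functions on the grid from a zero-mean RBF Gaussian process,
    standing in for draws from the experimenter's GP."""
    d = grid[:, None] - grid[None, :]
    K = np.exp(-0.5 * (d / lengthscale) ** 2) + 1e-8 * np.eye(grid.size)
    return rng.multivariate_normal(np.zeros(grid.size), K, size=n_draws)

def objective(f_values):
    """Toy blind-function-matching objective: closeness to a hidden target function."""
    target = np.sin(4 * np.pi * grid)
    return -np.mean((f_values - target) ** 2)

origin = np.zeros(grid.size)
for _ in range(5):                            # sequence of random subspaces
    basis = gp_draws(n_draws=3)               # finite-dimensional subspace spanned by GP draws
    best_val, best_f = objective(origin), origin
    for _ in range(50):                       # random search stands in for the inner BO loop
        coeffs = rng.normal(size=3)
        candidate = origin + coeffs @ basis
        val = objective(candidate)
        if val > best_val:
            best_val, best_f = val, candidate
    origin = best_f                           # best solution becomes the next origin
```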
arXiv Detail & Related papers (2020-09-08T06:54:11Z)
- Composition of kernel and acquisition functions for High Dimensional Bayesian Optimization [0.1749935196721634]
We exploit the additivity of the objective function to map both the kernel and the acquisition function of Bayesian Optimization.
This approach makes the learning/updating of the probabilistic surrogate model more efficient.
Results are presented for a real-life application, namely the control of pumps in urban water distribution systems.
arXiv Detail & Related papers (2020-03-09T15:45:57Z)
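
As a generic illustration of the additive structure exploited in the entry above (a sketch, not the paper's exact decomposition): the kernel is written as a sum of kernels over disjoint groups of input dimensions, so the surrogate, and with it the acquisition function, decomposes into low-dimensional pieces. The grouping below is an illustrative assumption.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """RBF kernel matrix on (possibly low-dimensional) inputs a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def additive_kernel(A, B, groups, lengthscale=1.0):
    """k(x, x') = sum over groups g of k_g(x[g], x'[g]); `groups` is a list of
    disjoint index lists, e.g. [[0, 1], [2], [3, 4]]."""
    return sum(rbf(A[:, g], B[:, g], lengthscale) for g in groups)

# Example: a 5-D problem decomposed into three low-dimensional subproblems,
# so the acquisition function can be optimised group by group.
groups = [[0, 1], [2], [3, 4]]
X = np.random.rand(10, 5)
K = additive_kernel(X, X, groups)   # 10 x 10 additive kernel matrix
```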