Practical First-Order Bayesian Optimization Algorithms
- URL: http://arxiv.org/abs/2306.10815v1
- Date: Mon, 19 Jun 2023 10:05:41 GMT
- Title: Practical First-Order Bayesian Optimization Algorithms
- Authors: Utkarsh Prakash, Aryan Chollera, Kushagra Khatwani, Prabuchandran K. J. and Tejas Bodas
- Abstract summary: We propose a class of practical FOBO algorithms that efficiently utilize the information from the gradient GP to identify potential query points with zero gradients.
We validate the performance of our proposed algorithms on several test functions and show that our algorithms outperform state-of-the-art FOBO algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: First Order Bayesian Optimization (FOBO) is a sample efficient sequential
approach to find the global maxima of an expensive-to-evaluate black-box
objective function by suitably querying for the function and its gradient
evaluations. Such methods assume Gaussian process (GP) models for both the
function and its gradient, and use them to construct an acquisition function
that identifies the next query point. In this paper, we propose a class of
practical FOBO algorithms that efficiently utilize the information from the
gradient GP to identify potential query points with zero gradients. We
construct a multi-level acquisition function: in the first step, we optimize a
lower-level acquisition function with multiple restarts to identify potential
query points with zero gradient value. We then use the upper-level acquisition
function to rank these query points by their function values, so as to
potentially identify the global maxima. As a final step, the potential point
of maxima is chosen as the actual query point. We validate the performance of
our proposed algorithms on several test functions and show that our algorithms
outperform state-of-the-art FOBO algorithms. We also illustrate the application
of our algorithms in finding an optimal set of hyper-parameters in machine
learning and in learning the optimal policy in reinforcement learning tasks.
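To make the three-step procedure above concrete, here is a minimal, self-contained sketch in NumPy/SciPy. It is illustrative only and rests on simplifying assumptions not taken from the paper: a single RBF-kernel GP on function values stands in for the separate function and gradient GPs (its analytic posterior-mean gradient plays the role of the gradient GP), the lower level performs multi-restart minimization of the squared gradient-mean norm, and Expected Improvement is used as a stand-in upper-level acquisition; all class and function names are hypothetical.

```python
# Minimal sketch of a multi-level (lower/upper) acquisition scheme.
# Assumptions (not from the paper): one RBF-kernel GP on function values,
# its analytic posterior-mean gradient as a proxy for the gradient GP,
# and Expected Improvement as the upper-level acquisition.
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.5):
    """RBF kernel matrix between row-stacked points A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

class SimpleGP:
    """GP regression exposing posterior mean, std, and the analytic gradient
    of the posterior mean (used here as a stand-in for the gradient GP)."""
    def __init__(self, X, y, ls=0.5, noise=1e-4):
        self.X, self.ls = X, ls
        K = rbf(X, X, ls) + noise * np.eye(len(X))
        self.chol = cho_factor(K)
        self.alpha = cho_solve(self.chol, y)

    def mean_std(self, x):
        k = rbf(x[None, :], self.X, self.ls)[0]
        mu = k @ self.alpha
        var = max(1.0 - k @ cho_solve(self.chol, k), 1e-12)
        return mu, np.sqrt(var)

    def grad_mean(self, x):
        k = rbf(x[None, :], self.X, self.ls)[0]
        # d/dx k(x, x_i) = -(x - x_i) / ls^2 * k(x, x_i)
        dk = -(x[None, :] - self.X) / self.ls ** 2 * k[:, None]
        return dk.T @ self.alpha

def lower_level(gp, bounds, n_restarts=10):
    """Multi-restart minimization of ||posterior gradient mean||^2 to collect
    candidate points whose modelled gradient is (near) zero."""
    dim = bounds.shape[0]
    cands = []
    for _ in range(n_restarts):
        x0 = rng.uniform(bounds[:, 0], bounds[:, 1], size=dim)
        res = minimize(lambda x: np.sum(gp.grad_mean(x) ** 2), x0,
                       bounds=bounds, method="L-BFGS-B")
        cands.append(res.x)
    return np.array(cands)

def upper_level(gp, cands, best_y):
    """Rank zero-gradient candidates; Expected Improvement is used purely as
    an illustrative upper-level acquisition."""
    scores = []
    for x in cands:
        mu, sd = gp.mean_std(x)
        z = (mu - best_y) / sd
        scores.append((mu - best_y) * norm.cdf(z) + sd * norm.pdf(z))
    return cands[int(np.argmax(scores))]

# Toy objective (to maximize) and a handful of initial observations.
f = lambda x: -np.sum((x - 0.3) ** 2)
bounds = np.array([[0.0, 1.0], [0.0, 1.0]])
X = rng.uniform(0, 1, size=(6, 2))
y = np.array([f(x) for x in X])
gp = SimpleGP(X, y)
x_next = upper_level(gp, lower_level(gp, bounds), y.max())
print("next query point:", x_next)
```

In a full Bayesian optimization loop, the returned point (the highest-ranked zero-gradient candidate) would be queried, and the observation appended to the GP's training data before repeating the procedure.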
Related papers
- Experience in Engineering Complex Systems: Active Preference Learning
with Multiple Outcomes and Certainty Levels [1.5257326975704795]
Black-box optimization refers to problems whose objective function and/or constraint sets are unknown, inaccessible, or non-existent.
The so-called Active Preference Learning algorithm has been developed to exploit this specific information.
Our approach aims to extend the algorithm so that it can exploit further information effectively.
arXiv Detail & Related papers (2023-02-27T15:55:37Z) - Ensemble-based gradient inference for particle methods in optimization
and sampling [2.9005223064604078]
We propose an approach based on function evaluations and Bayesian inference to extract higher-order differential information.
We suggest using this information to improve established ensemble-based numerical methods for optimization and sampling.
arXiv Detail & Related papers (2022-09-23T09:21:35Z) - Optimizing Bayesian acquisition functions in Gaussian Processes [0.0]
This paper analyzes different acquisition functions such as Probability of Maximum Improvement and Expected Improvement.
Along with an analysis of the time taken, the paper also shows the importance of the position of the initial samples chosen.
arXiv Detail & Related papers (2021-11-09T03:25:15Z) - An Accelerated Variance-Reduced Conditional Gradient Sliding Algorithm
for First-order and Zeroth-order Optimization [111.24899593052851]
The conditional gradient algorithm (also known as the Frank-Wolfe algorithm) has recently regained popularity in the machine learning community.
ARCS is the first zeroth-order conditional gradient sliding type algorithm for solving convex problems in zeroth-order optimization.
In first-order optimization, the convergence results of ARCS substantially outperform previous algorithms in terms of the number of gradient oracle queries.
arXiv Detail & Related papers (2021-09-18T07:08:11Z) - Bayesian Algorithm Execution: Estimating Computable Properties of
Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm goes through several low-function-value regions before reaching the optimum of the function.
One approach to shorten this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around the low-probability regions.
arXiv Detail & Related papers (2020-03-27T06:18:49Z) - Learning to be Global Optimizer [28.88646928299302]
We learn an optimization algorithm with escaping capability for some benchmark functions.
We show that the learned algorithm significantly outperforms some well-known classical optimization algorithms.
arXiv Detail & Related papers (2020-03-10T03:46:25Z) - Differentiable Top-k Operator with Optimal Transport [135.36099648554054]
The SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem.
We apply the proposed operator to the k-nearest neighbors and beam search algorithms, and demonstrate improved performance.
arXiv Detail & Related papers (2020-02-16T04:57:52Z)