Practical First-Order Bayesian Optimization Algorithms
- URL: http://arxiv.org/abs/2306.10815v1
- Date: Mon, 19 Jun 2023 10:05:41 GMT
- Title: Practical First-Order Bayesian Optimization Algorithms
- Authors: Utkarsh Prakash, Aryan Chollera, Kushagra Khatwani, Prabuchandran K. J. and Tejas Bodas
- Abstract summary: We propose a class of practical FOBO algorithms that efficiently utilize the information from the gradient GP to identify potential query points with zero gradients.
We validate the performance of our proposed algorithms on several test functions and show that our algorithms outperform state-of-the-art FOBO algorithms.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: First Order Bayesian Optimization (FOBO) is a sample efficient sequential
approach to find the global maxima of an expensive-to-evaluate black-box
objective function by suitably querying for the function and its gradient
evaluations. Such methods assume Gaussian process (GP) models for both the
function and its gradient, and use them to construct an acquisition function
that identifies the next query point. In this paper, we propose a class of
practical FOBO algorithms that efficiently utilize the information from the
gradient GP to identify potential query points with zero gradients. We
construct a multi-level acquisition function: in the first step, we optimize a
lower-level acquisition function with multiple restarts to identify potential
query points with zero gradient value. We then use the upper-level acquisition
function to rank these query points by their function values, so as to
potentially identify the global maxima. As a final step, the potential point
of maxima is chosen as the actual query point. We validate the performance of
our proposed algorithms on several test functions and show that our algorithms
outperform state-of-the-art FOBO algorithms. We also illustrate the application
of our algorithms in finding an optimal set of hyper-parameters in machine
learning and in learning the optimal policy in reinforcement learning tasks.
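To make the three-step procedure above concrete, here is a minimal, self-contained sketch in NumPy/SciPy. It is illustrative only and rests on simplifying assumptions not taken from the paper: a single RBF-kernel GP on function values stands in for the separate function and gradient GPs (its analytic posterior-mean gradient plays the role of the gradient GP), the lower level performs multi-restart minimization of the squared gradient-mean norm, and Expected Improvement is used as a stand-in upper-level acquisition; all class and function names are hypothetical.

```python
# Minimal sketch of a multi-level (lower/upper) acquisition scheme.
# Assumptions (not from the paper): one RBF-kernel GP on function values,
# its analytic posterior-mean gradient as a proxy for the gradient GP,
# and Expected Improvement as the upper-level acquisition.
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.5):
    """RBF kernel matrix between row-stacked points A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

class SimpleGP:
    """GP regression exposing posterior mean, std, and the analytic gradient
    of the posterior mean (used here as a stand-in for the gradient GP)."""
    def __init__(self, X, y, ls=0.5, noise=1e-4):
        self.X, self.ls = X, ls
        K = rbf(X, X, ls) + noise * np.eye(len(X))
        self.chol = cho_factor(K)
        self.alpha = cho_solve(self.chol, y)

    def mean_std(self, x):
        k = rbf(x[None, :], self.X, self.ls)[0]
        mu = k @ self.alpha
        var = max(1.0 - k @ cho_solve(self.chol, k), 1e-12)
        return mu, np.sqrt(var)

    def grad_mean(self, x):
        k = rbf(x[None, :], self.X, self.ls)[0]
        # d/dx k(x, x_i) = -(x - x_i) / ls^2 * k(x, x_i)
        dk = -(x[None, :] - self.X) / self.ls ** 2 * k[:, None]
        return dk.T @ self.alpha

def lower_level(gp, bounds, n_restarts=10):
    """Multi-restart minimization of ||posterior gradient mean||^2 to collect
    candidate points whose modelled gradient is (near) zero."""
    dim = bounds.shape[0]
    cands = []
    for _ in range(n_restarts):
        x0 = rng.uniform(bounds[:, 0], bounds[:, 1], size=dim)
        res = minimize(lambda x: np.sum(gp.grad_mean(x) ** 2), x0,
                       bounds=bounds, method="L-BFGS-B")
        cands.append(res.x)
    return np.array(cands)

def upper_level(gp, cands, best_y):
    """Rank zero-gradient candidates; Expected Improvement is used purely as
    an illustrative upper-level acquisition."""
    scores = []
    for x in cands:
        mu, sd = gp.mean_std(x)
        z = (mu - best_y) / sd
        scores.append((mu - best_y) * norm.cdf(z) + sd * norm.pdf(z))
    return cands[int(np.argmax(scores))]

# Toy objective (to maximize) and a handful of initial observations.
f = lambda x: -np.sum((x - 0.3) ** 2)
bounds = np.array([[0.0, 1.0], [0.0, 1.0]])
X = rng.uniform(0, 1, size=(6, 2))
y = np.array([f(x) for x in X])
gp = SimpleGP(X, y)
x_next = upper_level(gp, lower_level(gp, bounds), y.max())
print("next query point:", x_next)
```

In a full Bayesian optimization loop, the returned point (the highest-ranked zero-gradient candidate) would be queried, and the observation appended to the GP's training data before repeating the procedure.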
Related papers
- Experience in Engineering Complex Systems: Active Preference Learning
with Multiple Outcomes and Certainty Levels [1.5257326975704795]
Black-box optimization refers to problems whose objective function and/or constraint sets are unknown, inaccessible, or non-existent.
The so-called Active Preference Learning algorithm has been developed to exploit this specific information.
Our approach aims to extend the algorithm so that it can exploit further information effectively.
arXiv Detail & Related papers (2023-02-27T15:55:37Z) - Ensemble-based gradient inference for particle methods in optimization
and sampling [2.9005223064604078]
We propose an approach based on function evaluations and Bayesian inference to extract higher-order differential information.
We suggest using this information to improve established ensemble-based numerical methods for optimization and sampling.
arXiv Detail & Related papers (2022-09-23T09:21:35Z) - Optimizing Bayesian acquisition functions in Gaussian Processes [0.0]
This paper analyzes different acquisition functions such as Probability of Maximum Improvement and Expected Improvement.
Along with an analysis of the time taken, the paper also shows the importance of the position of the initial samples chosen.
arXiv Detail & Related papers (2021-11-09T03:25:15Z) - An Accelerated Variance-Reduced Conditional Gradient Sliding Algorithm
for First-order and Zeroth-order Optimization [111.24899593052851]
The conditional gradient algorithm (also known as the Frank-Wolfe algorithm) has recently regained popularity in the machine learning community.
ARCS is the first zeroth-order conditional gradient sliding type algorithm for solving convex problems in zeroth-order optimization.
In first-order optimization, the convergence results of ARCS substantially outperform previous algorithms in terms of the number of gradient oracle queries.
arXiv Detail & Related papers (2021-09-18T07:08:11Z) - Bayesian Algorithm Execution: Estimating Computable Properties of
Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Exploiting Higher Order Smoothness in Derivative-free Optimization and
Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and the problem parameters.
arXiv Detail & Related papers (2020-06-14T10:42:23Z) - Incorporating Expert Prior in Bayesian Optimisation via Space Warping [54.412024556499254]
In large search spaces, the algorithm goes through several low-function-value regions before reaching the optimum of the function.
One approach to shorten this cold-start phase is to use prior knowledge that can accelerate the optimisation.
In this paper, we represent the prior knowledge about the function optimum through a prior distribution.
The prior distribution is then used to warp the search space so that it expands around the high-probability region of the function optimum and shrinks around the low-probability regions.
arXiv Detail & Related papers (2020-03-27T06:18:49Z) - Learning to be Global Optimizer [28.88646928299302]
We learn an optimization algorithm with escaping capability for some benchmark functions.
We show that the learned algorithm significantly outperforms some well-known classical optimization algorithms.
arXiv Detail & Related papers (2020-03-10T03:46:25Z) - Differentiable Top-k Operator with Optimal Transport [135.36099648554054]
The SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem.
We apply the proposed operator to the k-nearest neighbors and beam search algorithms, and demonstrate improved performance.
arXiv Detail & Related papers (2020-02-16T04:57:52Z)