Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domain by
Adaptive Discretization
- URL: http://arxiv.org/abs/2106.08598v1
- Date: Wed, 16 Jun 2021 07:55:45 GMT
- Title: Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domain by
Adaptive Discretization
- Authors: Marco Rando, Luigi Carratino, Silvia Villa and Lorenzo Rosasco
- Abstract summary: An algorithm such as GP-UCB has prohibitive computational complexity. We introduce Ada-BKB, a no-regret Gaussian process optimization algorithm for functions on continuous domains, and corroborate our findings on the real-world problem of hyper-parameter optimization.
- Score: 21.859940486704264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian process optimization is a successful class of algorithms (e.g.
GP-UCB) to optimize a black-box function through sequential evaluations.
However, when the domain of the function is continuous, Gaussian process
optimization has to either rely on a fixed discretization of the space, or
solve a non-convex optimization subproblem at each evaluation. The first
approach can negatively affect performance, while the second one puts a heavy
computational burden on the algorithm. A third option, which has only recently
been studied theoretically, is to adaptively discretize the function domain.
Even though this approach avoids the extra non-convex optimization costs, the
overall computational complexity is still prohibitive. An algorithm such as
GP-UCB has a runtime of $O(T^4)$, where $T$ is the number of iterations. In
this paper, we introduce Ada-BKB (Adaptive Budgeted Kernelized Bandit), a
no-regret Gaussian process optimization algorithm for functions on continuous
domains, which provably runs in $O(T^2 d_\text{eff}^2)$, where $d_\text{eff}$ is
the effective dimension of the explored space, typically much smaller than $T$.
We corroborate our findings with experiments on synthetic
non-convex functions and on the real-world problem of hyper-parameter
optimization.
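To make the cost trade-off concrete, below is a minimal sketch (not the paper's code) of the fixed-discretization GP-UCB baseline that the abstract contrasts with Ada-BKB: each round refits an exact GP at $O(t^3)$ cost, so $T$ rounds total roughly $O(T^4)$, the budget that Ada-BKB's adaptive discretization with budgeted kernel approximations reduces to $O(T^2 d_\text{eff}^2)$. The objective `f`, kernel lengthscale, and UCB confidence schedule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential kernel between the rows of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-3):
    # Exact GP regression posterior at test points Xs. The Cholesky factorization
    # costs O(t^3) at round t, which is the expense budgeted approximations avoid.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    Ks = rbf_kernel(X, Xs)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    mu = Ks.T @ alpha
    var = 1.0 - np.sum(v**2, axis=0)   # RBF prior variance is 1
    return mu, np.maximum(var, 1e-12)

def f(x):
    # Hypothetical black-box objective to maximize (illustrative only).
    return np.sin(3.0 * x) - 0.5 * x**2

grid = np.linspace(-1.0, 2.0, 500)[:, None]   # fixed discretization of the domain
X = np.array([[0.5]])
y = f(X[:, 0])
for t in range(1, 21):
    mu, var = gp_posterior(X, y, grid)
    beta = 2.0 * np.log(len(grid) * t**2)      # one common UCB confidence schedule
    x_next = grid[np.argmax(mu + np.sqrt(beta * var))]
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next[0]))
print("best query:", X[np.argmax(y)], "value:", y.max())
```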
Related papers
- Two-step Lookahead Bayesian Optimization with Inequality Constraints [21.703234193908038]
We propose a two-step lookahead constrained Bayesian optimization acquisition function (2-OPT-C) supporting both sequential and batch settings.
In numerical experiments, 2-OPT-C typically improves query efficiency by 2x or more over previous methods, and in some cases by 10x or more.
arXiv Detail & Related papers (2021-12-06T07:40:54Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
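As a rough illustration of why preconditioning matters here (a sketch under assumptions, not the paper's method): GP hyperparameter optimization repeatedly solves linear systems with the kernel matrix by conjugate gradients, and a preconditioner that captures the kernel's dominant spectrum, such as the pivoted-Cholesky one below, cuts the iteration count. The paper's variance-reduction result concerns the stochastic estimators used alongside such solves and is not reproduced in this sketch.

```python
import numpy as np

def pivoted_cholesky(K, rank):
    # Rank-`rank` pivoted Cholesky factor L with K ~ L @ L.T, a standard
    # preconditioner choice for kernel matrices with fast-decaying spectra.
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()
    L = np.zeros((n, rank))
    for j in range(rank):
        i = int(np.argmax(d))
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(d[i])
        d -= L[:, j] ** 2
    return L

def pcg(A, b, precond, tol=1e-6, max_iter=1000):
    # Preconditioned conjugate gradients for A x = b; precond(r) applies M^{-1}.
    x = np.zeros_like(b)
    r = b - A @ x
    z = precond(r)
    p = z.copy()
    rz = r @ z
    bnorm = np.linalg.norm(b)
    for k in range(max_iter):
        Ap = A @ p
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) < tol * bnorm:
            return x, k + 1
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(0)
pts = rng.uniform(size=(400, 2))
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
sigma2 = 1e-2
K = np.exp(-0.5 * d2 / 0.3**2) + sigma2 * np.eye(400)   # kernel matrix + noise
b = rng.normal(size=400)

# Woodbury identity gives a cheap solve with M = L L^T + sigma2 * I.
L = pivoted_cholesky(K - sigma2 * np.eye(400), rank=30)
A_small = sigma2 * np.eye(30) + L.T @ L
M_inv = lambda r: (r - L @ np.linalg.solve(A_small, L.T @ r)) / sigma2

_, iters_plain = pcg(K, b, lambda r: r)   # no preconditioning
_, iters_pre = pcg(K, b, M_inv)           # pivoted-Cholesky preconditioning
print("CG iterations:", iters_plain, "->", iters_pre)
```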
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
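A minimal, self-contained sketch of the InfoBAX idea under simplifying assumptions: the "algorithm" whose output we care about is just the grid argmax of f, and the conditional entropy in the expected-information-gain score is approximated by grouping posterior samples by their output. The paper's estimator instead conditions the GP on sampled execution paths; the objective, grid, and sample counts below are illustrative.

```python
import numpy as np

def rbf(A, B, ls=0.25):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 80)[:, None]        # candidate queries / sample support
f_true = lambda x: np.sin(6.0 * x) * (1.0 - x)   # hypothetical black box
noise = 1e-4

idx0 = rng.choice(80, size=3, replace=False)
X, y = grid[idx0], f_true(grid[idx0, 0])

for step in range(10):
    # GP posterior over the whole grid.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, grid)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    cov = rbf(grid, grid) - Ks.T @ sol
    # Run the "algorithm" (here simply: return the argmax) on posterior samples.
    Lc = np.linalg.cholesky(cov + 1e-6 * np.eye(80))
    F = mu[:, None] + Lc @ rng.normal(size=(80, 64))   # 64 posterior function draws
    out = np.argmax(F, axis=0)                          # algorithm output per draw
    # EIG(x) ~ H[y_x] - E_out H[y_x | out], Gaussian entropies up to constants.
    total = 0.5 * np.log(np.diag(cov) + noise)
    cond = np.zeros(80)
    for o in np.unique(out):
        mask = out == o
        cond += mask.mean() * 0.5 * np.log(F[:, mask].var(axis=1) + noise)
    x_next = grid[np.argmax(total - cond)]
    X = np.vstack([X, x_next])
    y = np.append(y, f_true(x_next[0]))

# Posterior mean from the last refit identifies the inferred argmax location.
print("inferred argmax location:", grid[np.argmax(mu), 0])
```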
- Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
- A Domain-Shrinking based Bayesian Optimization Algorithm with Order-Optimal Regret Performance [16.0251555430107]
This is the first GP-based algorithm with an order-optimal regret guarantee.
Compared with the prevailing GP-UCB family of algorithms, the proposed algorithm reduces computational complexity by a factor of $O(T^{2d-1})$.
arXiv Detail & Related papers (2020-10-27T02:15:15Z)
- A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z)
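To illustrate the two-timescale idea on a toy quadratic bilevel problem (a sketch under assumptions, not the paper's algorithm or setting): the inner variable tracks the lower-level solution with a large, slowly decaying step size, while the outer variable moves with a smaller step size, so the outer update effectively sees an approximately solved inner problem. The problem instance and the 0.6/0.4 step-size exponents are illustrative choices in the TTSA spirit.

```python
import numpy as np

# Toy bilevel problem (illustrative):
#   inner: y*(x) = argmin_y 0.5 * ||y - A x||^2              =>  y*(x) = A x
#   outer: min_x  f(x, y*(x)) = 0.5*||y*(x)||^2 + 0.5*||x - 1||^2
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) / 3.0
x, y = np.zeros(3), np.zeros(3)

for t in range(1, 20001):
    alpha = 0.5 / t ** 0.6   # slow (outer) step size
    beta = 0.5 / t ** 0.4    # fast (inner) step size: alpha/beta -> 0
    # Fast timescale: one noisy gradient step on the inner objective.
    y -= beta * ((y - A @ x) + 0.01 * rng.normal(size=3))
    # Slow timescale: hypergradient of the outer objective, with the tracked y
    # standing in for the exact inner solution y*(x) (chain rule: dy*/dx = A).
    x -= alpha * ((x - 1.0) + A.T @ y + 0.01 * rng.normal(size=3))

# Closed-form solution of min_x 0.5*||A x||^2 + 0.5*||x - 1||^2 for comparison.
x_star = np.linalg.solve(np.eye(3) + A.T @ A, np.ones(3))
print("||x - x*|| =", np.linalg.norm(x - x_star))
```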
- Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)