Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domain by
Adaptive Discretization
- URL: http://arxiv.org/abs/2106.08598v1
- Date: Wed, 16 Jun 2021 07:55:45 GMT
- Title: Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domain by
Adaptive Discretization
- Authors: Marco Rando, Luigi Carratino, Silvia Villa and Lorenzo Rosasco
- Abstract summary: An algorithm such as GP-UCB has prohibitive computational complexity. We introduce Ada-BKB, a no-regret Gaussian process optimization algorithm for functions on continuous domains, and corroborate our findings on the real-world problem of hyper-parameter optimization.
- Score: 21.859940486704264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaussian process optimization is a successful class of algorithms (e.g.
GP-UCB) to optimize a black-box function through sequential evaluations.
However, when the domain of the function is continuous, Gaussian process
optimization has to either rely on a fixed discretization of the space, or
solve a non-convex optimization subproblem at each evaluation. The first
approach can negatively affect performance, while the second one puts a heavy
computational burden on the algorithm. A third option, which has only recently
been studied theoretically, is to adaptively discretize the function domain.
Even though this approach avoids the extra non-convex optimization costs, the
overall computational complexity is still prohibitive. An algorithm such as
GP-UCB has a runtime of $O(T^4)$, where $T$ is the number of iterations. In
this paper, we introduce Ada-BKB (Adaptive Budgeted Kernelized Bandit), a
no-regret Gaussian process optimization algorithm for functions on continuous
domains, which provably runs in $O(T^2 d_\text{eff}^2)$, where $d_\text{eff}$ is
the effective dimension of the explored space, typically much smaller than $T$.
We corroborate our findings with experiments on synthetic
non-convex functions and on the real-world problem of hyper-parameter
optimization.
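To make the cost trade-off concrete, below is a minimal sketch (not the paper's code) of the fixed-discretization GP-UCB baseline that the abstract contrasts with Ada-BKB: each round refits an exact GP at $O(t^3)$ cost, so $T$ rounds total roughly $O(T^4)$, the budget that Ada-BKB's adaptive discretization with budgeted kernel approximations reduces to $O(T^2 d_\text{eff}^2)$. The objective `f`, kernel lengthscale, and UCB confidence schedule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential kernel between the rows of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-3):
    # Exact GP regression posterior at test points Xs. The Cholesky factorization
    # costs O(t^3) at round t, which is the expense budgeted approximations avoid.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    Ks = rbf_kernel(X, Xs)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Ks)
    mu = Ks.T @ alpha
    var = 1.0 - np.sum(v**2, axis=0)   # RBF prior variance is 1
    return mu, np.maximum(var, 1e-12)

def f(x):
    # Hypothetical black-box objective to maximize (illustrative only).
    return np.sin(3.0 * x) - 0.5 * x**2

grid = np.linspace(-1.0, 2.0, 500)[:, None]   # fixed discretization of the domain
X = np.array([[0.5]])
y = f(X[:, 0])
for t in range(1, 21):
    mu, var = gp_posterior(X, y, grid)
    beta = 2.0 * np.log(len(grid) * t**2)      # one common UCB confidence schedule
    x_next = grid[np.argmax(mu + np.sqrt(beta * var))]
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next[0]))
print("best query:", X[np.argmax(y)], "value:", y.max())
```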
Related papers
- Two-step Lookahead Bayesian Optimization with Inequality Constraints [21.703234193908038]
We propose a two-step lookahead constrained Bayesian optimization acquisition function (2-OPT-C) supporting both sequential and batch settings.
In numerical experiments, 2-OPT-C typically improves query efficiency by 2x or more over previous methods, and in some cases by 10x or more.
arXiv Detail & Related papers (2021-12-06T07:40:54Z)
- Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning [54.01682318834995]
Preconditioning is a highly effective step for any iterative method involving matrix-vector multiplication.
We prove that preconditioning has an additional benefit that has been previously unexplored.
It can simultaneously reduce variance at essentially negligible cost.
arXiv Detail & Related papers (2021-07-01T06:43:11Z)
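As a rough illustration of why preconditioning matters here (a sketch under assumptions, not the paper's method): GP hyperparameter optimization repeatedly solves linear systems with the kernel matrix by conjugate gradients, and a preconditioner that captures the kernel's dominant spectrum, such as the pivoted-Cholesky one below, cuts the iteration count. The paper's variance-reduction result concerns the stochastic estimators used alongside such solves and is not reproduced in this sketch.

```python
import numpy as np

def pivoted_cholesky(K, rank):
    # Rank-`rank` pivoted Cholesky factor L with K ~ L @ L.T, a standard
    # preconditioner choice for kernel matrices with fast-decaying spectra.
    n = K.shape[0]
    d = np.diag(K).astype(float).copy()
    L = np.zeros((n, rank))
    for j in range(rank):
        i = int(np.argmax(d))
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(d[i])
        d -= L[:, j] ** 2
    return L

def pcg(A, b, precond, tol=1e-6, max_iter=1000):
    # Preconditioned conjugate gradients for A x = b; precond(r) applies M^{-1}.
    x = np.zeros_like(b)
    r = b - A @ x
    z = precond(r)
    p = z.copy()
    rz = r @ z
    bnorm = np.linalg.norm(b)
    for k in range(max_iter):
        Ap = A @ p
        step = rz / (p @ Ap)
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) < tol * bnorm:
            return x, k + 1
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(0)
pts = rng.uniform(size=(400, 2))
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
sigma2 = 1e-2
K = np.exp(-0.5 * d2 / 0.3**2) + sigma2 * np.eye(400)   # kernel matrix + noise
b = rng.normal(size=400)

# Woodbury identity gives a cheap solve with M = L L^T + sigma2 * I.
L = pivoted_cholesky(K - sigma2 * np.eye(400), rank=30)
A_small = sigma2 * np.eye(30) + L.T @ L
M_inv = lambda r: (r - L @ np.linalg.solve(A_small, L.T @ r)) / sigma2

_, iters_plain = pcg(K, b, lambda r: r)   # no preconditioning
_, iters_pre = pcg(K, b, M_inv)           # pivoted-Cholesky preconditioning
print("CG iterations:", iters_plain, "->", iters_pre)
```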
- Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
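A minimal, self-contained sketch of the InfoBAX idea under simplifying assumptions: the "algorithm" whose output we care about is just the grid argmax of f, and the conditional entropy in the expected-information-gain score is approximated by grouping posterior samples by their output. The paper's estimator instead conditions the GP on sampled execution paths; the objective, grid, and sample counts below are illustrative.

```python
import numpy as np

def rbf(A, B, ls=0.25):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 80)[:, None]        # candidate queries / sample support
f_true = lambda x: np.sin(6.0 * x) * (1.0 - x)   # hypothetical black box
noise = 1e-4

idx0 = rng.choice(80, size=3, replace=False)
X, y = grid[idx0], f_true(grid[idx0, 0])

for step in range(10):
    # GP posterior over the whole grid.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, grid)
    sol = np.linalg.solve(K, Ks)
    mu = sol.T @ y
    cov = rbf(grid, grid) - Ks.T @ sol
    # Run the "algorithm" (here simply: return the argmax) on posterior samples.
    Lc = np.linalg.cholesky(cov + 1e-6 * np.eye(80))
    F = mu[:, None] + Lc @ rng.normal(size=(80, 64))   # 64 posterior function draws
    out = np.argmax(F, axis=0)                          # algorithm output per draw
    # EIG(x) ~ H[y_x] - E_out H[y_x | out], Gaussian entropies up to constants.
    total = 0.5 * np.log(np.diag(cov) + noise)
    cond = np.zeros(80)
    for o in np.unique(out):
        mask = out == o
        cond += mask.mean() * 0.5 * np.log(F[:, mask].var(axis=1) + noise)
    x_next = grid[np.argmax(total - cond)]
    X = np.vstack([X, x_next])
    y = np.append(y, f_true(x_next[0]))

# Posterior mean from the last refit identifies the inferred argmax location.
print("inferred argmax location:", grid[np.argmax(mu), 0])
```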
- Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in the analysis of optimization algorithms for non-convex optimization in deep networks and other problems in data science, we give an overview of recent theoretical results on optimization algorithms for non-convex optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
- A Domain-Shrinking based Bayesian Optimization Algorithm with Order-Optimal Regret Performance [16.0251555430107]
This is the first GP-based algorithm with an order-optimal regret guarantee.
Compared with the prevailing GP-UCB family of algorithms, the proposed algorithm reduces computational complexity by a factor of $O(T^{2d-1})$.
arXiv Detail & Related papers (2020-10-27T02:15:15Z)
- A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z)
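To illustrate the two-timescale idea on a toy quadratic bilevel problem (a sketch under assumptions, not the paper's algorithm or setting): the inner variable tracks the lower-level solution with a large, slowly decaying step size, while the outer variable moves with a smaller step size, so the outer update effectively sees an approximately solved inner problem. The problem instance and the 0.6/0.4 step-size exponents are illustrative choices in the TTSA spirit.

```python
import numpy as np

# Toy bilevel problem (illustrative):
#   inner: y*(x) = argmin_y 0.5 * ||y - A x||^2              =>  y*(x) = A x
#   outer: min_x  f(x, y*(x)) = 0.5*||y*(x)||^2 + 0.5*||x - 1||^2
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) / 3.0
x, y = np.zeros(3), np.zeros(3)

for t in range(1, 20001):
    alpha = 0.5 / t ** 0.6   # slow (outer) step size
    beta = 0.5 / t ** 0.4    # fast (inner) step size: alpha/beta -> 0
    # Fast timescale: one noisy gradient step on the inner objective.
    y -= beta * ((y - A @ x) + 0.01 * rng.normal(size=3))
    # Slow timescale: hypergradient of the outer objective, with the tracked y
    # standing in for the exact inner solution y*(x) (chain rule: dy*/dx = A).
    x -= alpha * ((x - 1.0) + A.T @ y + 0.01 * rng.normal(size=3))

# Closed-form solution of min_x 0.5*||A x||^2 + 0.5*||x - 1||^2 for comparison.
x_star = np.linalg.solve(np.eye(3) + A.T @ A, np.ones(3))
print("||x - x*|| =", np.linalg.norm(x - x_star))
```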
- Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)