On the Sublinear Regret of GP-UCB
- URL: http://arxiv.org/abs/2307.07539v2
- Date: Mon, 14 Aug 2023 17:22:21 GMT
- Title: On the Sublinear Regret of GP-UCB
- Authors: Justin Whitehouse, Zhiwei Steven Wu, Aaditya Ramdas
- Abstract summary: We show that the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm enjoys nearly optimal regret rates.
Our improvements rely on a key technical contribution -- regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel.
- Score: 58.25014663727544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the kernelized bandit problem, a learner aims to sequentially compute the
optimum of a function lying in a reproducing kernel Hilbert space given only
noisy evaluations at sequentially chosen points. In particular, the learner
aims to minimize regret, which is a measure of the suboptimality of the choices
made. Arguably the most popular algorithm is the Gaussian Process Upper
Confidence Bound (GP-UCB) algorithm, which involves acting based on a simple
linear estimator of the unknown function. Despite its popularity, existing
analyses of GP-UCB give a suboptimal regret rate, which fails to be sublinear
for many commonly used kernels such as the Matérn kernel. This has led to a
longstanding open question: are existing regret analyses for GP-UCB tight, or
can bounds be improved by using more sophisticated analytical techniques? In
this work, we resolve this open question and show that GP-UCB enjoys nearly
optimal regret. In particular, our results yield sublinear regret rates for the
Matérn kernel, improving over the state-of-the-art analyses and partially
resolving a COLT open problem posed by Vakili et al. Our improvements rely on a
key technical contribution -- regularizing kernel ridge estimators in
proportion to the smoothness of the underlying kernel $k$. Applying this key
idea together with a largely overlooked concentration result in separable
Hilbert spaces (for which we provide an independent, simplified derivation), we
are able to provide a tighter analysis of the GP-UCB algorithm.
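To make the algorithm described above concrete, here is a minimal, illustrative Python sketch of a generic GP-UCB loop: at each round the posterior (kernel ridge) mean and width are computed from the observed data, and the next query maximizes mean plus a scaled width. The Matérn-5/2 kernel, the 1-D grid domain, the toy objective, and the values of the confidence parameter `beta` and the regularization `lam` are assumptions for illustration only; in particular, the paper's contribution is a specific smoothness-dependent choice of the regularization level, which is only hinted at in a comment here, not reproduced.

```python
# Illustrative GP-UCB sketch (not the paper's exact construction).
import numpy as np

def matern52(x, y, lengthscale=0.2):
    """Matérn-5/2 kernel between two 1-D point sets (unit signal variance)."""
    r = np.abs(x[:, None] - y[None, :]) / lengthscale
    return (1.0 + np.sqrt(5.0) * r + 5.0 * r**2 / 3.0) * np.exp(-np.sqrt(5.0) * r)

def gp_ucb(f, domain, T=50, beta=2.0, lam=1.0, noise_std=0.1, seed=0):
    """Run T rounds of GP-UCB on a finite 1-D grid; return queries and rewards.

    The kernel ridge (GP posterior) estimator used for the UCB rule is
        mu_t(x)      = k_t(x)^T (K_t + lam I)^{-1} y_t
        sigma_t^2(x) = k(x, x) - k_t(x)^T (K_t + lam I)^{-1} k_t(x).
    beta is a fixed confidence parameter for illustration; the theory uses a
    time-dependent beta_t, and the paper ties lam to the kernel's smoothness.
    """
    rng = np.random.default_rng(seed)
    X, y = [], []
    for t in range(T):
        if not X:
            x_next = domain[rng.integers(len(domain))]  # first query: random
        else:
            Xa = np.array(X)
            K = matern52(Xa, Xa) + lam * np.eye(len(X))
            k_star = matern52(domain, Xa)                # (|domain|, t)
            alpha = np.linalg.solve(K, np.array(y))
            mu = k_star @ alpha
            v = np.linalg.solve(K, k_star.T)             # (t, |domain|)
            sigma2 = np.clip(1.0 - np.sum(k_star * v.T, axis=1), 0.0, None)
            x_next = domain[np.argmax(mu + beta * np.sqrt(sigma2))]
        X.append(x_next)
        y.append(f(x_next) + noise_std * rng.normal())
    return np.array(X), np.array(y)

if __name__ == "__main__":
    domain = np.linspace(0.0, 1.0, 200)
    f = lambda x: np.sin(6.0 * x)              # hypothetical test objective
    X, y = gp_ucb(f, domain, T=30, lam=0.5)    # lam=0.5 is a placeholder value
    print("best observed value:", y.max())
```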
Related papers
- Regret Analysis for Randomized Gaussian Process Upper Confidence Bound [9.967062483758632]
This paper analyzes a randomized variant of GP-UCB called improved randomized GP-UCB (IRGP-UCB).
In both regret analyses, IRGP-UCB achieves a sublinear regret upper bound without increasing the confidence parameter when the input domain is finite.
arXiv Detail & Related papers (2024-09-02T06:49:29Z)
- Regret Optimality of GP-UCB [12.323109084902228]
Gaussian Process Upper Confidence Bound (GP-UCB) is one of the most popular methods for optimizing black-box functions with noisy observations.
We establish new upper bounds on both the simple and cumulative regret of GP-UCB when the objective function to optimize admits certain smoothness property.
With the same level of exploration, GP-UCB can simultaneously achieve optimality in both simple and cumulative regret.
arXiv Detail & Related papers (2023-12-03T13:20:08Z)
- Quantum Bayesian Optimization [64.58749619145908]
We introduce the quantum-Gaussian process-upper confidence bound (Q-GP-UCB) algorithm.
It is the first BO algorithm able to achieve a regret upper bound of $O(\mathrm{polylog}\, T)$, significantly smaller than the $\Omega(\sqrt{T})$ regret lower bound in the classical setting.
Thanks to our novel analysis of the confidence ellipsoid, our Q-GP-UCB with the linear kernel achieves a smaller regret than the quantum linear UCB algorithm.
arXiv Detail & Related papers (2023-10-09T03:10:42Z)
- Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian Regret Bounds [9.89553853547974]
This study first generalizes the regret analysis of RGP-UCB to a wider class of distributions, including the Gamma distribution.
We propose improved RGP-UCB (IRGP-UCB) based on a two-parameter exponential distribution, which achieves tighter Bayesian regret bounds (see the sketch after this list).
We demonstrate the effectiveness of IRGP-UCB through extensive experiments.
arXiv Detail & Related papers (2023-02-03T02:48:48Z)
- Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train a neural network (NN) and then select the query that maximizes the trained NN's output.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefit of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
- Meta-Learning Hypothesis Spaces for Sequential Decision-making [79.73213540203389]
We propose to meta-learn a kernel from offline data (Meta-KeL).
Under mild conditions, we guarantee that our estimated RKHS yields valid confidence sets.
We also empirically evaluate the effectiveness of our approach on a Bayesian optimization task.
arXiv Detail & Related papers (2022-02-01T17:46:51Z)
- Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem.
We introduce a misspecified kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
We show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z)
- Weighted Gaussian Process Bandits for Non-stationary Environments [30.49357960656324]
We develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
A key challenge is how to cope with infinite-dimensional feature maps.
We provide universal upper bounds and weight-dependent upper bounds for weighted maximum information gains.
arXiv Detail & Related papers (2021-07-06T03:37:33Z)
- No-Regret Algorithms for Time-Varying Bayesian Optimization [0.0]
We adopt the general variation budget model to capture the time-varying environment.
We introduce two GP-UCB-type algorithms, R-GP-UCB and SW-GP-UCB, based on restarting and sliding windows, respectively.
Our results not only recover previous linear bandit results when a linear kernel is used, but also complement previous regret analyses of time-varying Gaussian process bandits.
arXiv Detail & Related papers (2021-02-11T22:35:32Z)
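Two entries above (RGP-UCB and IRGP-UCB) replace GP-UCB's deterministic confidence parameter with a randomly sampled one. As flagged in the IRGP-UCB entry, the following is a minimal hypothetical sketch of that idea, assuming the width is drawn from a shifted (two-parameter) exponential distribution; the exact parameterization, and the conditions under which this yields the stated regret bounds, belong to those papers and are not reproduced here.

```python
# Hypothetical sketch of a randomized UCB confidence width: instead of a
# deterministic, growing beta_t, draw it each round from a shifted
# (two-parameter) exponential distribution.
import numpy as np

def sample_confidence_width(rng, shift=1.0, scale=1.0):
    """Draw beta_t ~ shift + Exponential(scale); parameters are placeholders."""
    return shift + rng.exponential(scale)

def randomized_ucb_scores(mu, sigma, rng):
    """UCB acquisition values with a freshly sampled width for this round."""
    beta_t = sample_confidence_width(rng)
    return mu + np.sqrt(beta_t) * sigma

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = np.array([0.1, 0.4, 0.3])      # posterior means on a toy 3-point domain
    sigma = np.array([0.5, 0.1, 0.3])   # posterior widths on the same domain
    print("next arm:", int(np.argmax(randomized_ucb_scores(mu, sigma, rng))))
```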
This list is automatically generated from the titles and abstracts of the papers on this site.