Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization
- URL: http://arxiv.org/abs/2506.01393v2
- Date: Sun, 13 Jul 2025 05:03:05 GMT
- Title: Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization
- Authors: Shogo Iwazaki
- Abstract summary: We show that the Gaussian process upper confidence bound (GP-UCB) algorithm achieves $\tilde{O}(\sqrt{T})$ cumulative regret with high probability. Our analysis yields $O(\sqrt{T \ln^2 T})$ regret under a squared exponential kernel.
- Score: 3.6985338895569204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the Bayesian optimization problem (also referred to as the Bayesian setting of the Gaussian process bandit), where the learner seeks to minimize the regret under a function drawn from a known Gaussian process (GP). Under a Matérn kernel with a certain degree of smoothness, we show that the Gaussian process upper confidence bound (GP-UCB) algorithm achieves $\tilde{O}(\sqrt{T})$ cumulative regret with high probability. Furthermore, our analysis yields $O(\sqrt{T \ln^2 T})$ regret under a squared exponential kernel. These results fill the gap between the existing regret upper bound for GP-UCB and the best-known bound provided by Scarlett (2018). The key idea in our proof is to capture the concentration behavior of the input sequence realized by GP-UCB, enabling a more refined analysis of the GP's information gain.
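The GP-UCB algorithm analyzed in the abstract selects, at each round, the point maximizing the posterior mean plus a scaled posterior standard deviation. A minimal sketch on a finite grid (the function names, the fixed confidence parameter `beta`, and the kernel lengthscale are illustrative choices, not values from the paper):

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2)).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-3):
    # Standard GP posterior mean and variance at the candidate grid points.
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf_kernel(x_grid, x_obs)
    mean = k_star @ np.linalg.solve(K, y_obs)
    v = np.linalg.solve(K, k_star.T)
    var = 1.0 - np.sum(k_star * v.T, axis=1)  # prior variance k(x, x) = 1
    return mean, np.maximum(var, 1e-12)

def gp_ucb(f, x_grid, T=30, beta=4.0, noise=1e-3, rng=None):
    # GP-UCB: each round, query the argmax of mean + sqrt(beta * var).
    rng = np.random.default_rng(rng)
    xs = [x_grid[0]]
    ys = [f(x_grid[0]) + noise * rng.standard_normal()]
    for _ in range(T - 1):
        mean, var = gp_posterior(np.array(xs), np.array(ys), x_grid, noise)
        x_next = x_grid[np.argmax(mean + np.sqrt(beta * var))]
        xs.append(x_next)
        ys.append(f(x_next) + noise * rng.standard_normal())
    return np.array(xs)

# Toy run: maximize a smooth function with its peak at x = 0.7.
f = lambda x: -(x - 0.7) ** 2
grid = np.linspace(0.0, 1.0, 201)
queries = gp_ucb(f, grid, T=30, rng=0)
```

Note that the paper's analysis concerns a time-varying confidence parameter $\beta_t$; the fixed `beta` here is only a simplification for illustration.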
Related papers
- Gaussian Process Upper Confidence Bound Achieves Nearly-Optimal Regret in Noise-Free Gaussian Process Bandits [3.6985338895569204]
We show the nearly optimal regret upper bound of noise-free GP-UCB. Specifically, our analysis shows the first constant cumulative regret in the noise-free settings for the squared exponential kernel and Matérn kernel.
arXiv Detail & Related papers (2025-02-26T10:10:51Z) - Regret Analysis for Randomized Gaussian Process Upper Confidence Bound [9.967062483758632]
This paper analyzes the randomized variant of GP-UCB called improved randomized GP-UCB.
In both regret analyses, IRGP-UCB achieves a sub-linear regret upper bound without increasing the confidence parameter if the input domain is finite.
arXiv Detail & Related papers (2024-09-02T06:49:29Z) - On the Sublinear Regret of GP-UCB [58.25014663727544]
We show that the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm enjoys nearly optimal regret rates.
Our improvements rely on a key technical contribution -- regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel.
arXiv Detail & Related papers (2023-07-14T13:56:11Z) - Randomized Gaussian Process Upper Confidence Bound with Tighter Bayesian
Regret Bounds [9.89553853547974]
This study first generalizes the regret analysis of RGP-UCB to a wider class of distributions, including the Gamma distribution.
We propose improved RGP-UCB based on a two-parameter exponential distribution, which achieves tighter Bayesian regret bounds.
We demonstrate the effectiveness of IRGP-UCB through extensive experiments.
arXiv Detail & Related papers (2023-02-03T02:48:48Z) - Regret Bounds for Expected Improvement Algorithms in Gaussian Process
Bandit Optimization [63.8557841188626]
The expected improvement (EI) algorithm is one of the most popular strategies for optimization under uncertainty.
We propose a variant of EI with a standard incumbent defined via the GP predictive mean.
We show that our algorithm converges, and achieves a cumulative regret bound of $\mathcal{O}(\gamma_T \sqrt{T})$.
arXiv Detail & Related papers (2022-03-15T13:17:53Z) - Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem.
We introduce a misspecified kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
We show that our algorithm achieves optimal dependence on $\epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T \gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - Randomised Gaussian Process Upper Confidence Bound for Bayesian
Optimisation [60.93091603232817]
We develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function.
This is done by sampling the exploration-exploitation trade-off parameter from a distribution.
We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising a bound on the function's Bayesian regret.
arXiv Detail & Related papers (2020-06-08T00:28:41Z) - Regret and Belief Complexity Trade-off in Gaussian Process Bandits via
Information Thresholding [42.669970064867556]
We show how to characterize the trade-off between regret bounds of GP bandit algorithms and complexity of the posterior distributions.
We observe state of the art accuracy and complexity trade-offs for GP bandit algorithms applied to global optimization.
arXiv Detail & Related papers (2020-03-23T21:05:15Z) - Optimistic Policy Optimization with Bandit Feedback [70.75568142146493]
We propose an optimistic trust region policy optimization (TRPO) algorithm for which we establish $\tilde{O}(\sqrt{S^2 A H^4 K})$ regret for stochastic rewards.
To the best of our knowledge, the two results are the first sub-linear regret bounds obtained for policy optimization algorithms with unknown transitions and bandit feedback.
arXiv Detail & Related papers (2020-02-19T15:41:18Z)
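Several of the bounds above, such as $\mathcal{O}(\gamma_T \sqrt{T})$ and $O^\ast(\sqrt{T \gamma_T})$, are stated in terms of the maximum information gain $\gamma_T$, which for a GP with observation noise is computed from the standard mutual-information formula $I(y_A; f_A) = \tfrac{1}{2} \log\det(I + \sigma^{-2} K_A)$. A minimal sketch (the kernel, lengthscale, and noise level are illustrative choices):

```python
import numpy as np

def information_gain(K, noise_var=0.01):
    # Mutual information I(y_A; f_A) = 0.5 * log det(I + K / noise_var)
    # between noisy observations y_A and the latent GP values f_A.
    # gamma_T is the maximum of this quantity over all size-T subsets.
    n = K.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + K / noise_var)
    return 0.5 * logdet

# Demo: information gain of evenly spaced points under an RBF kernel.
x = np.linspace(0.0, 1.0, 50)
d = x[:, None] - x[None, :]
K = np.exp(-0.5 * (d / 0.2) ** 2)  # squared exponential, lengthscale 0.2
gain = information_gain(K)
```

Computing $\gamma_T$ exactly requires maximizing over subsets; analyses such as those above instead use kernel-specific upper bounds (e.g. polylogarithmic growth for the squared exponential kernel).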
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences arising from its use.