Order-Optimal Error Bounds for Noisy Kernel-Based Bayesian Quadrature
- URL: http://arxiv.org/abs/2202.10615v1
- Date: Tue, 22 Feb 2022 01:49:41 GMT
- Title: Order-Optimal Error Bounds for Noisy Kernel-Based Bayesian Quadrature
- Authors: Xu Cai, Chi Thanh Lam, and Jonathan Scarlett
- Abstract summary: We consider functions in a {\em Reproducing Kernel Hilbert Space} (RKHS) with the Mat\'ern-$\nu$ kernel.
We find that when the black-box queries are subject to Gaussian noise having variance $\sigma^2$, any algorithm making at most $T$ queries must incur a mean absolute error of $\Omega(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$.
- Score: 42.129843613950285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the sample complexity of {\em noisy Bayesian
quadrature} (BQ), in which we seek to approximate an integral based on noisy
black-box queries to the underlying function. We consider functions in a {\em
Reproducing Kernel Hilbert Space} (RKHS) with the Mat\'ern-$\nu$ kernel,
focusing on combinations of the parameter $\nu$ and dimension $d$ such that the
RKHS is equivalent to a Sobolev class. In this setting, we provide
near-matching upper and lower bounds on the best possible average error.
Specifically, we find that when the black-box queries are subject to Gaussian
noise having variance $\sigma^2$, any algorithm making at most $T$ queries
(even with adaptive sampling) must incur a mean absolute error of
$\Omega(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$, and there exists a
non-adaptive algorithm attaining an error of at most $O(T^{-\frac{\nu}{d}-1} +
\sigma T^{-\frac{1}{2}})$. Hence, the bounds are order-optimal, and establish
that there is no adaptivity gap in terms of scaling laws.
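To make the upper bound concrete, here is a minimal sketch of the kind of non-adaptive, kernel-based quadrature rule the abstract refers to: query the function under Gaussian noise on a fixed uniform grid, form the noise-regularized Matérn Gram matrix, and integrate the resulting GP posterior mean. This is an illustrative toy (1-D, Matérn-3/2, kernel embedding approximated numerically), not the paper's construction; all function names and parameter values are assumptions made for the example.

```python
import numpy as np

def matern32(x, y, ell=0.2):
    # Matérn-3/2 kernel matrix between 1-D point sets x (shape (n,)) and y (shape (m,))
    s = np.sqrt(3.0) * np.abs(x[:, None] - y[None, :]) / ell
    return (1.0 + s) * np.exp(-s)

def bq_estimate(f, T=100, sigma=0.1, ell=0.2, seed=0):
    # Estimate the integral of f over [0, 1] from T noisy queries
    rng = np.random.default_rng(seed)
    X = (np.arange(T) + 0.5) / T                    # fixed (non-adaptive) design
    y = f(X) + sigma * rng.standard_normal(T)       # noisy black-box queries
    K = matern32(X, X, ell) + sigma**2 * np.eye(T)  # noise-regularized Gram matrix
    grid = np.linspace(0.0, 1.0, 4001)
    z = matern32(X, grid, ell).mean(axis=1)         # z_i approximates the integral of k(x_i, .) over [0, 1]
    w = np.linalg.solve(K, z)                       # GP posterior-mean quadrature weights
    return w @ y

f = lambda x: np.sin(2.0 * np.pi * x) + x ** 2      # true integral = 0 + 1/3
print(bq_estimate(f, T=200, sigma=0.05), "vs true value", 1.0 / 3.0)
```

Rerunning the sketch with increasing $T$ shows the error decaying until the noise term takes over, mirroring the two-term bound $O(T^{-\frac{\nu}{d}-1} + \sigma T^{-\frac{1}{2}})$ above.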
Related papers
- Rényi-infinity constrained sampling with $d^3$ membership queries [2.209921757303168]
We propose a constrained proximal sampler, a principled and simple algorithm that possesses elegant convergence guarantees.
We show that it converges in the Rényi-infinity divergence ($\mathcal{R}_\infty$) with no query complexity overhead when starting from a warm start.
arXiv Detail & Related papers (2024-07-17T19:20:08Z) - Kernelized Normalizing Constant Estimation: Bridging Bayesian Quadrature
and Bayesian Optimization [51.533164528799084]
We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $\lambda$.
We find that this pattern holds true even when the function evaluations are noisy.
arXiv Detail & Related papers (2024-01-11T07:45:09Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD [38.221784575853796]
This work considers the problem of finding a first-order stationary point of a non-convex function with potentially unbounded smoothness using a stochastic gradient oracle.
We develop a technique that allows us to prove $\mathcal{O}(\frac{\mathrm{polylog}(T)\sigma}{\sqrt{T}})$ convergence rates without assuming uniform bounds on the noise.
arXiv Detail & Related papers (2023-02-13T18:13:36Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(mn + \sqrt{mn}L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Tight Regret Bounds for Noisy Optimization of a Brownian Motion [118.65407541895851]
We consider the problem of Bayesian optimization of a one-dimensional Brownian motion in which the $T$ adaptively chosen observations are corrupted by Gaussian noise.
We show that the smallest possible expected cumulative regret and the smallest possible expected simple regret scale as $\Omega(\sigma\sqrt{T \cdot \log T})$.
arXiv Detail & Related papers (2020-01-25T14:44:53Z)