Instance-Dependent Regret Analysis of Kernelized Bandits
- URL: http://arxiv.org/abs/2203.06297v1
- Date: Sat, 12 Mar 2022 00:53:59 GMT
- Title: Instance-Dependent Regret Analysis of Kernelized Bandits
- Authors: Shubhanshu Shekhar, Tara Javidi
- Abstract summary: We study the kernelized bandit problem, which involves designing an adaptive strategy for querying a noisy zeroth-order oracle.
We derive \emph{instance-dependent} regret lower bounds for algorithms with uniformly (over the function class) vanishing normalized cumulative regret.
- Score: 19.252319300590653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the kernelized bandit problem, which involves designing an adaptive
strategy for querying a noisy zeroth-order-oracle to efficiently learn about
the optimizer of an unknown function $f$ with a norm bounded by $M<\infty$ in a
Reproducing Kernel Hilbert Space~(RKHS) associated with a positive definite
kernel $K$. Prior results, working in a \emph{minimax framework}, have
characterized the worst-case~(over all functions in the problem class) limits
on regret achievable by \emph{any} algorithm, and have constructed algorithms
with matching~(modulo polylogarithmic factors) worst-case performance for the
Matérn family of kernels. These results suffer from two drawbacks. First, the
minimax lower bound gives no information about the limits of regret achievable
by the commonly used algorithms on specific problem instances. Second, due to
their worst-case nature, the existing upper bound analysis fails to adapt to
easier problem instances within the function class. Our work takes steps to
address both these issues. First, we derive \emph{instance-dependent} regret
lower bounds for algorithms with uniformly~(over the function class) vanishing
normalized cumulative regret. Our result, valid for all the practically
relevant kernelized bandit algorithms, such as GP-UCB, GP-TS, and
SupKernelUCB, identifies a fundamental complexity measure associated with every
problem instance. We then address the second issue, by proposing a new minimax
near-optimal algorithm which also adapts to easier problem instances.
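As a concrete reference point for the algorithms named in the abstract, a minimal GP-UCB loop can be sketched as follows. This is an illustration only, not the authors' implementation: the RBF kernel, lengthscale, noise level, finite candidate set, and the confidence parameter `beta` are all assumptions (in theory `beta` is set from a confidence-bound analysis; in practice it is often tuned).

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale**2))

def gp_posterior(X, y, Xstar, noise=0.1, lengthscale=0.2):
    """GP posterior mean and std at Xstar, given observations (X, y)."""
    K = rbf_kernel(X, X, lengthscale) + noise**2 * np.eye(len(X))
    Ks = rbf_kernel(Xstar, X, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(Xstar, Xstar, lengthscale).diagonal() - (v**2).sum(0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def gp_ucb(f, candidates, T=30, beta=2.0, noise=0.1, rng=None):
    """Run T rounds of GP-UCB over a finite candidate set; return queries and rewards."""
    if rng is None:
        rng = np.random.default_rng(0)
    X = candidates[[0]]  # arbitrary first query
    y = np.array([f(X[0]) + noise * rng.standard_normal()])
    for _ in range(T - 1):
        mu, sigma = gp_posterior(X, y, candidates, noise)
        x = candidates[np.argmax(mu + np.sqrt(beta) * sigma)]  # upper confidence bound
        X = np.vstack([X, x])
        y = np.append(y, f(x) + noise * rng.standard_normal())
    return X, y
```

The key design choice GP-UCB embodies is optimism: each round queries the point whose posterior upper confidence bound `mu + sqrt(beta) * sigma` is largest, trading off exploitation (high mean) against exploration (high uncertainty).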
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the per-step cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Second Order Methods for Bandit Optimization and Control [34.51425758864638]
We show that our algorithm achieves optimal regret bounds for a large class of convex functions that we call $\kappa$-convex.
We also investigate the adaptation of our second-order bandit algorithm to online convex optimization with memory.
arXiv Detail & Related papers (2024-02-14T04:03:38Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many problems arising in machine learning, such as training neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove a gradient-complexity guarantee for our method.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for
Nonconvex Functional Constrained Optimization [27.07082875330508]
Nonconvex inequality-constrained problems can be used to model a number of machine learning problems, such as multi-class Neyman-Pearson classification.
Under such mild regularity conditions, it is difficult to balance reducing the loss value against reducing the constraint violation.
In this paper, we propose a novel primal-dual algorithm for inequality-constrained problems.
arXiv Detail & Related papers (2022-07-12T16:30:34Z) - Asymptotic Instance-Optimal Algorithms for Interactive Decision Making [35.76184529520015]
In this paper, we design the first instance-optimal algorithm for general interactive decision making problems with finite number of decisions under mild conditions.
Our results, instantiated on concrete problems, recover the classical gap-dependent bounds for multi-armed bandits and prior works on linear bandits.
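For context, the classical gap-dependent result that instance-optimal analyses of this kind recover is the Lai-Robbins lower bound for $K$-armed bandits: any uniformly good algorithm satisfies

```latex
\liminf_{T \to \infty} \frac{\mathbb{E}[R_T]}{\log T}
  \;\ge\; \sum_{i \,:\, \Delta_i > 0} \frac{\Delta_i}{\mathrm{KL}(\nu_i, \nu^\star)},
```

where $\Delta_i$ is the suboptimality gap of arm $i$ and $\mathrm{KL}(\nu_i, \nu^\star)$ is the divergence between the reward distribution of arm $i$ and that of the best arm. The instance-dependent complexity measures discussed above generalize this per-instance dependence beyond finitely many arms.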
arXiv Detail & Related papers (2022-06-06T02:56:10Z) - Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem.
We introduce a \emph{misspecified} kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
We show that our algorithm achieves optimal dependence on $epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^*(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
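The $O(\log\log T)$ batch count typically comes from a doubling-exponent grid of batch endpoints. The sketch below uses the standard construction $t_i \approx T^{1-2^{-i}}$ from the batched-bandits literature; it is an illustration of why $O(\log\log T)$ batches suffice to cover the horizon, not necessarily the exact schedule used in the paper.

```python
import math

def batch_schedule(T):
    """Batch endpoints t_i = ceil(T^(1 - 2^{-i})), capped at T.

    The exponent 1 - 2^{-i} doubles its distance to 1 away each step,
    so O(log log T) batches reach the horizon T.
    """
    M = max(1, math.ceil(math.log2(math.log2(T))))  # interior batches
    ends = [min(math.ceil(T ** (1 - 2.0 ** (-i))), T) for i in range(1, M + 1)]
    if ends[-1] < T:
        ends.append(T)  # final batch runs to the horizon
    return ends
```

For example, a horizon of $T = 10^6$ rounds is covered by just six batches, with the first batch ending at $\sqrt{T} = 1000$.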
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Multi-Scale Zero-Order Optimization of Smooth Functions in an RKHS [19.252319300590653]
A black-box function $f:\mathcal{X} \mapsto \mathbb{R}$ is optimized under the assumption that $f$ is Hölder smooth and has bounded norm in the RKHS associated with a given kernel $K$.
In this paper, we propose a new algorithm (\texttt{LP-GP-UCB}) where the usual GP surrogate model is augmented with Local Polynomial (LP) estimators of the Hölder smooth function $f$.
arXiv Detail & Related papers (2020-05-11T01:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.