Instance-Dependent Regret Analysis of Kernelized Bandits
- URL: http://arxiv.org/abs/2203.06297v1
- Date: Sat, 12 Mar 2022 00:53:59 GMT
- Title: Instance-Dependent Regret Analysis of Kernelized Bandits
- Authors: Shubhanshu Shekhar, Tara Javidi
- Abstract summary: We study the kernelized bandit problem, which involves designing an adaptive strategy for querying a noisy zeroth-order oracle.
We derive \emph{instance-dependent} regret lower bounds for algorithms with uniformly (over the function class) vanishing normalized cumulative regret.
- Score: 19.252319300590653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the kernelized bandit problem, which involves designing an adaptive
strategy for querying a noisy zeroth-order-oracle to efficiently learn about
the optimizer of an unknown function $f$ with a norm bounded by $M<\infty$ in a
Reproducing Kernel Hilbert Space~(RKHS) associated with a positive definite
kernel $K$. Prior results, working in a \emph{minimax framework}, have
characterized the worst-case~(over all functions in the problem class) limits
on regret achievable by \emph{any} algorithm, and have constructed algorithms
with matching~(modulo polylogarithmic factors) worst-case performance for the
Matérn family of kernels. These results suffer from two drawbacks. First, the
minimax lower bound gives no information about the limits of regret achievable
by the commonly used algorithms on specific problem instances. Second, due to
their worst-case nature, the existing upper bound analysis fails to adapt to
easier problem instances within the function class. Our work takes steps to
address both these issues. First, we derive \emph{instance-dependent} regret
lower bounds for algorithms with uniformly~(over the function class) vanishing
normalized cumulative regret. Our result, valid for all the practically
relevant kernelized bandit algorithms, such as GP-UCB, GP-TS, and
SupKernelUCB, identifies a fundamental complexity measure associated with every
problem instance. We then address the second issue, by proposing a new minimax
near-optimal algorithm which also adapts to easier problem instances.
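As a concrete reference point for the algorithms named in the abstract, a minimal GP-UCB loop can be sketched as follows. This is an illustration only, not the authors' implementation: the RBF kernel, lengthscale, noise level, finite candidate set, and the confidence parameter `beta` are all assumptions (in theory `beta` is set from a confidence-bound analysis; in practice it is often tuned).

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale**2))

def gp_posterior(X, y, Xstar, noise=0.1, lengthscale=0.2):
    """GP posterior mean and std at Xstar, given observations (X, y)."""
    K = rbf_kernel(X, X, lengthscale) + noise**2 * np.eye(len(X))
    Ks = rbf_kernel(Xstar, X, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(Xstar, Xstar, lengthscale).diagonal() - (v**2).sum(0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def gp_ucb(f, candidates, T=30, beta=2.0, noise=0.1, rng=None):
    """Run T rounds of GP-UCB over a finite candidate set; return queries and rewards."""
    if rng is None:
        rng = np.random.default_rng(0)
    X = candidates[[0]]  # arbitrary first query
    y = np.array([f(X[0]) + noise * rng.standard_normal()])
    for _ in range(T - 1):
        mu, sigma = gp_posterior(X, y, candidates, noise)
        x = candidates[np.argmax(mu + np.sqrt(beta) * sigma)]  # upper confidence bound
        X = np.vstack([X, x])
        y = np.append(y, f(x) + noise * rng.standard_normal())
    return X, y
```

The key design choice GP-UCB embodies is optimism: each round queries the point whose posterior upper confidence bound `mu + sqrt(beta) * sigma` is largest, trading off exploitation (high mean) against exploration (high uncertainty).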
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the per-step cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Second Order Methods for Bandit Optimization and Control [34.51425758864638]
We show that our algorithm achieves optimal regret bounds for a large class of convex functions that we call $\kappa$-convex.
We also investigate the adaptation of our second-order bandit algorithm to online convex optimization with memory.
arXiv Detail & Related papers (2024-02-14T04:03:38Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many problems arising in machine learning, such as training neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove a gradient-complexity guarantee for our method.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - A Single-Loop Gradient Descent and Perturbed Ascent Algorithm for
Nonconvex Functional Constrained Optimization [27.07082875330508]
Nonconvex inequality-constrained problems can be used to model a number of machine learning problems, such as multi-class Neyman-Pearson classification.
Under such mild regularity conditions, it is difficult to balance reducing the loss value against reducing the constraint violation.
In this paper, we propose a novel primal-dual algorithm for inequality-constrained problems.
arXiv Detail & Related papers (2022-07-12T16:30:34Z) - Asymptotic Instance-Optimal Algorithms for Interactive Decision Making [35.76184529520015]
In this paper, we design the first instance-optimal algorithm for general interactive decision making problems with finite number of decisions under mild conditions.
Our results, instantiated on concrete problems, recover the classical gap-dependent bounds for multi-armed bandits and prior works on linear bandits.
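For context, the classical gap-dependent result that instance-optimal analyses of this kind recover is the Lai-Robbins lower bound for $K$-armed bandits: any uniformly good algorithm satisfies

```latex
\liminf_{T \to \infty} \frac{\mathbb{E}[R_T]}{\log T}
  \;\ge\; \sum_{i \,:\, \Delta_i > 0} \frac{\Delta_i}{\mathrm{KL}(\nu_i, \nu^\star)},
```

where $\Delta_i$ is the suboptimality gap of arm $i$ and $\mathrm{KL}(\nu_i, \nu^\star)$ is the divergence between the reward distribution of arm $i$ and that of the best arm. The instance-dependent complexity measures discussed above generalize this per-instance dependence beyond finitely many arms.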
arXiv Detail & Related papers (2022-06-06T02:56:10Z) - Misspecified Gaussian Process Bandit Optimization [59.30399661155574]
Kernelized bandit algorithms have shown strong empirical and theoretical performance for this problem.
We introduce a \emph{misspecified} kernelized bandit setting where the unknown function can be $\epsilon$-uniformly approximated by a function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS).
We show that our algorithm achieves optimal dependence on $epsilon$ with no prior knowledge of misspecification.
arXiv Detail & Related papers (2021-11-09T09:00:02Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^*(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
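The $O(\log\log T)$ batch count typically comes from a doubling-exponent grid of batch endpoints. The sketch below uses the standard construction $t_i \approx T^{1-2^{-i}}$ from the batched-bandits literature; it is an illustration of why $O(\log\log T)$ batches suffice to cover the horizon, not necessarily the exact schedule used in the paper.

```python
import math

def batch_schedule(T):
    """Batch endpoints t_i = ceil(T^(1 - 2^{-i})), capped at T.

    The exponent 1 - 2^{-i} doubles its distance to 1 away each step,
    so O(log log T) batches reach the horizon T.
    """
    M = max(1, math.ceil(math.log2(math.log2(T))))  # interior batches
    ends = [min(math.ceil(T ** (1 - 2.0 ** (-i))), T) for i in range(1, M + 1)]
    if ends[-1] < T:
        ends.append(T)  # final batch runs to the horizon
    return ends
```

For example, a horizon of $T = 10^6$ rounds is covered by just six batches, with the first batch ending at $\sqrt{T} = 1000$.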
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Multi-Scale Zero-Order Optimization of Smooth Functions in an RKHS [19.252319300590653]
A black-box function $f:\mathcal{X} \mapsto \mathbb{R}$ is optimized under the assumption that $f$ is Hölder smooth and has bounded norm in the RKHS associated with a given kernel $K$.
In this paper, we propose a new algorithm (\texttt{LP-GP-UCB}) where the usual GP surrogate model is augmented with Local Polynomial (LP) estimators of the Hölder smooth function $f$.
arXiv Detail & Related papers (2020-05-11T01:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.