A Second-Order Method for Stochastic Bandit Convex Optimisation
- URL: http://arxiv.org/abs/2302.05371v1
- Date: Fri, 10 Feb 2023 16:49:58 GMT
- Title: A Second-Order Method for Stochastic Bandit Convex Optimisation
- Authors: Tor Lattimore and András György
- Abstract summary: We introduce a simple and efficient algorithm for unconstrained zeroth-order convex bandits.
We prove its regret is at most $(1 + r/d)[d^{1.5}\sqrt{n} + d^3]\,\mathrm{polylog}(n, d, r)$ where $n$ is the horizon, $d$ the dimension and $r$ is the radius of a known ball containing the minimiser of the loss.
- Score: 28.93486346263364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a simple and efficient algorithm for unconstrained zeroth-order
stochastic convex bandits and prove its regret is at most $(1 + r/d)[d^{1.5}\sqrt{n} + d^3]\,\mathrm{polylog}(n, d, r)$ where $n$ is the horizon, $d$ the dimension
and $r$ is the radius of a known ball containing the minimiser of the loss.
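For context on the problem setting, here is a minimal Python sketch of the classic one-point zeroth-order gradient estimator for bandit convex optimisation (in the style of Flaxman et al.), not the second-order method of this paper; the loss oracle, ball radius, step size and smoothing radius below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_loss(x):
    # Hypothetical stochastic loss oracle: a quadratic plus Gaussian noise.
    return float(np.sum((x - 1.0) ** 2) + rng.normal(scale=0.1))

d = 5        # dimension
n = 10_000   # horizon
r = 5.0      # radius of a ball assumed to contain the minimiser
delta = 0.1  # smoothing radius of the gradient estimate
eta = 1e-3   # step size (illustrative; real analyses tune these in n and d)

x = np.zeros(d)
for t in range(n):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)         # uniform random direction on the sphere
    y = noisy_loss(x + delta * u)  # a single bandit (zeroth-order) query per round
    g = (d / delta) * y * u        # one-point estimate of the smoothed gradient
    x -= eta * g                   # online gradient-descent step
    norm = np.linalg.norm(x)
    if norm > r:                   # project back onto the known ball
        x *= r / norm

print("final iterate:", np.round(x, 3))
```

The one-point estimate is unbiased for the gradient of a smoothed surrogate loss but has high variance; this classic first-order approach only achieves $O(n^{3/4})$-type regret for general convex losses, whereas the bound quoted above scales with $\sqrt{n}$ in the horizon.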
Related papers
- A simple and improved algorithm for noisy, convex, zeroth-order optimisation [59.51990161522328]
We construct an algorithm that returns a point $\hat x \in \bar{\mathcal X}$ such that $f(\hat x)$ is as small as possible.
We prove that this method is such that $f(\hat x) - \min_{x \in \bar{\mathcal X}} f(x)$ is of smaller order than $d^2/\sqrt{n}$ up to poly-logarithmic terms.
arXiv Detail & Related papers (2024-06-26T18:19:10Z) - Online Newton Method for Bandit Convex Optimisation [28.66596225688161]
We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation.
We prove that in the adversarial setting its regret is at most $d^{3.5}\sqrt{n}\,\mathrm{polylog}(n, d)$ with high probability, where $d$ is the dimension and $n$ is the time horizon.
In the stochastic setting the bound improves to $M d^{2}\sqrt{n}\,\mathrm{polylog}(n, d)$, where $M \in [d^{-1/2}, d^{-1/4}]$ is...
arXiv Detail & Related papers (2024-06-10T17:44:11Z) - Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery [45.601316850669406]
We present efficient algorithms for policy evaluation, best policy identification and regret minimization.
For policy evaluation and best policy identification, we show that our algorithms are nearly minimax optimal.
All the proposed algorithms consist of two phases: they first leverage spectral methods to estimate the left and right singular subspaces of the low-rank reward matrix.
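A minimal sketch of what such a spectral first phase can look like, assuming a hypothetical noisy empirical estimate of the reward matrix; the dimensions, rank and noise level are illustrative, and this is not the paper's exact estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, k = 50, 40, 3  # matrix shape and (known) rank, chosen for illustration

# Hypothetical ground-truth rank-k reward matrix and a noisy empirical estimate.
M = rng.normal(size=(m, k)) @ rng.normal(size=(k, p))
M_hat = M + 0.1 * rng.normal(size=(m, p))

# Spectral phase: a top-k SVD of the empirical matrix estimates the left and
# right singular subspaces of the low-rank reward matrix.
U, s, Vt = np.linalg.svd(M_hat, full_matrices=False)
U_k = U[:, :k]       # estimated left singular subspace
V_k = Vt[:k, :].T    # estimated right singular subspace (treated analogously)

# Measure recovery via the projection distance to the true left subspace.
proj = lambda A: A @ A.T
U_true = np.linalg.svd(M, full_matrices=False)[0][:, :k]
print("left-subspace projection error:",
      np.linalg.norm(proj(U_k) - proj(U_true), ord=2))
```

The two-to-infinity norm in the paper's title refers to a stronger, row-wise version of this subspace error (the maximum row norm of the difference), which is finer than the spectral-norm distance computed here.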
arXiv Detail & Related papers (2024-02-24T06:36:08Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization [25.00475462213752]
We propose the Decentralized Recursive descent Method (DREAM).
Concretely, it requires $\mathcal{O}(\min(\kappa^3\epsilon^{-3}, \kappa^2\sqrt{N}\epsilon^{-2}))$ stochastic first-order oracle (SFO) calls and $\tilde{\mathcal{O}}(\kappa^2\epsilon^{-2})$ communication rounds.
Our numerical experiments validate the superiority of our method over previous ones.
arXiv Detail & Related papers (2022-12-05T16:09:39Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, expressed as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Variance-Aware Sparse Linear Bandits [64.70681598741417]
Worst-case minimax regret for sparse linear bandits is $\widetilde{\Theta}(\sqrt{dT})$.
In the benign setting where there is no noise and the action set is the unit sphere, one can use divide-and-conquer to achieve an $\widetilde{\mathcal{O}}(1)$ regret.
We develop a general framework that converts any variance-aware linear bandit algorithm to a variance-aware algorithm for sparse linear bandits.
arXiv Detail & Related papers (2022-05-26T15:55:44Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We design an algorithm that finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) and study the complexity of finding $(\epsilon, \gamma)$-approximate second-order stationary points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.