Linear Bandits on Uniformly Convex Sets
- URL: http://arxiv.org/abs/2103.05907v1
- Date: Wed, 10 Mar 2021 07:33:03 GMT
- Title: Linear Bandits on Uniformly Convex Sets
- Authors: Thomas Kerdreux, Christophe Roux, Alexandre d'Aspremont, Sebastian
Pokutta
- Abstract summary: Linear bandit algorithms yield $\tilde{\mathcal{O}}(n\sqrt{T})$ pseudo-regret bounds on compact convex action sets.
Two types of structural assumptions lead to better pseudo-regret bounds.
- Score: 88.3673525964507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear bandit algorithms yield $\tilde{\mathcal{O}}(n\sqrt{T})$ pseudo-regret
bounds on compact convex action sets $\mathcal{K}\subset\mathbb{R}^n$, and two
types of structural assumptions lead to better pseudo-regret bounds. When
$\mathcal{K}$ is the simplex or an $\ell_p$ ball with $p\in]1,2]$, there exist
bandit algorithms with $\tilde{\mathcal{O}}(\sqrt{nT})$ pseudo-regret bounds.
Here, we derive bandit algorithms for some strongly convex sets beyond $\ell_p$
balls that enjoy pseudo-regret bounds of $\tilde{\mathcal{O}}(\sqrt{nT})$,
which answers an open question from [BCB12, \S 5.5.]. Interestingly, when the
action set is uniformly convex but not necessarily strongly convex, we obtain
pseudo-regret bounds with a dimension dependency smaller than
$\mathcal{O}(\sqrt{n})$. However, this comes at the expense of asymptotic rates
in $T$ varying between $\tilde{\mathcal{O}}(\sqrt{T})$ and
$\tilde{\mathcal{O}}(T)$.
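As a concrete illustration of the setting (a toy sketch of the problem, not the paper's algorithm), the following snippet simulates pseudo-regret for a naive explore-then-commit learner on the Euclidean unit ball, a strongly convex action set; the horizon, dimension, and noise level are arbitrary choices for the example.

```python
# Toy pseudo-regret simulation for the linear bandit setting above: a naive
# explore-then-commit learner on the Euclidean unit ball (a strongly convex
# action set). Illustration of the setting only, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
n, T, T_explore = 10, 5000, 500

theta = rng.normal(size=n)
theta /= np.linalg.norm(theta)                 # unknown reward direction
best = np.linalg.norm(theta)                   # max_{||a|| <= 1} <a, theta>

A = rng.normal(size=(T_explore, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)  # random unit-vector exploration
y = A @ theta + 0.1 * rng.normal(size=T_explore)   # noisy bandit feedback

theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares estimate
a_commit = theta_hat / np.linalg.norm(theta_hat)   # greedy action on the ball

played = list(A) + [a_commit] * (T - T_explore)
pseudo_regret = T * best - sum(a @ theta for a in played)
print(f"pseudo-regret over T={T} rounds: {pseudo_regret:.1f}")
```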
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z)
- Efficient Continual Finite-Sum Minimization [52.5238287567572]
We propose a key twist into the finite-sum minimization, dubbed as continual finite-sum minimization.
Our approach significantly improves upon the $\mathcal{O}(n/\epsilon)$ first-order oracle calls (FOs) that $\mathrm{StochasticGradientDescent}$ requires.
We also prove that there is no natural first-order method with $\mathcal{O}\left(n/\epsilon^{\alpha}\right)$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.
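To make the problem statement concrete, here is a small sketch of our own (the quadratic components and all parameters are invented for the example, and plain SGD stands in for the baseline mentioned above): after seeing components $f_1,\dots,f_i$, the learner outputs an approximate minimizer of the prefix average.

```python
# Illustrative setup for continual finite-sum minimization (a sketch of the
# problem statement, not the paper's method): after seeing f_1..f_i, output
# x_i approximately minimizing F_i = (1/i) * sum_{j<=i} f_j, where
# f_j(x) = 0.5 * (a_j @ x - b_j)**2, using plain SGD as the baseline.
import numpy as np

rng = np.random.default_rng(1)
d, n, lr, sgd_steps = 5, 50, 0.05, 200
a = rng.normal(size=(n, d))
b = rng.normal(size=n)

def prefix_sgd(i, x0):
    """Baseline SGD pass over the first i components."""
    x = x0.copy()
    for _ in range(sgd_steps):
        j = rng.integers(i)                    # sample one component index
        x -= lr * (a[j] @ x - b[j]) * a[j]     # stochastic gradient step
    return x

x = np.zeros(d)
for i in range(1, n + 1):
    x = prefix_sgd(i, x)                       # warm-start from previous iterate
    if i % 10 == 0:
        F_i = 0.5 * np.mean((a[:i] @ x - b[:i]) ** 2)
        print(f"i={i:3d}  prefix objective F_i = {F_i:.4f}")
```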
arXiv Detail & Related papers (2024-06-07T08:26:31Z)
- Sample-Efficient Linear Regression with Self-Selection Bias [7.605563562103568]
We consider the problem of linear regression with self-selection bias in the unknown-index setting.
We provide a novel and near optimally sample-efficient (in terms of $k$) algorithm to recover $\mathbf{w}_1,\ldots,\mathbf{w}_k \in \mathbb{R}^k$.
Our algorithm succeeds under significantly relaxed noise assumptions, and therefore also succeeds in the related setting of max-linear regression.
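A minimal sketch of the observation model as we read it from this summary (hypothetical parameter choices, not the paper's recovery algorithm): only the maximal response is observed, and the index attaining the maximum stays hidden.

```python
# Sketch of the self-selection observation model in the unknown-index
# setting: k linear models, and for each sample only the maximal noisy
# response is revealed, without the attaining index.
import numpy as np

rng = np.random.default_rng(2)
k, d, m = 3, 4, 1000                       # k linear models in R^d, m samples

W = rng.normal(size=(k, d))                # unknown weight vectors w_1..w_k

X = rng.normal(size=(m, d))
noise = 0.1 * rng.normal(size=(m, k))
responses = X @ W.T + noise                # y_i^(j) = <w_j, x_i> + eta_i^(j)
y = responses.max(axis=1)                  # self-selection: only the max is
                                           # observed; the argmax is NOT revealed
print(X.shape, y.shape)                    # task: recover W from (X, y) alone
```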
arXiv Detail & Related papers (2024-02-22T02:20:24Z)
- Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
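For intuition, below is a generic decentralized online gradient descent template over a gossip matrix (a standard D-OCO baseline of our own, not necessarily the paper's algorithm; the ring topology, losses, and step size are arbitrary, and $\rho$ presumably measures connectivity of the communication graph).

```python
# Generic decentralized online gradient descent sketch: n local learners
# gossip via a doubly stochastic matrix P, then each takes a gradient step
# on its own local loss f_i(x) = 0.5 * ||x - c_i||^2.
import numpy as np

rng = np.random.default_rng(3)
n, d, T, eta = 4, 3, 200, 0.1

P = np.zeros((n, n))                       # ring topology, doubly stochastic
for i in range(n):
    P[i, i] = 0.5
    P[i, (i - 1) % n] = 0.25
    P[i, (i + 1) % n] = 0.25

targets = rng.normal(size=(n, d))          # c_i: node i's local optimum
X = np.zeros((n, d))                       # one iterate per node
for t in range(T):
    X = P @ X                              # communication / averaging step
    X -= eta * (X - targets)               # local gradient step on f_i

print("disagreement across nodes:", np.linalg.norm(X - X.mean(axis=0)))
print("mean iterate vs global optimum:",
      np.linalg.norm(X.mean(axis=0) - targets.mean(axis=0)))
```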
arXiv Detail & Related papers (2024-02-14T13:44:16Z)
- Minimax Regret for Bandit Convex Optimisation of Ridge Functions [34.686687996497525]
We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f(x) = g(\langle x, \theta\rangle)$ for convex $g : \mathbb{R} \to \mathbb{R}$ and $\theta \in \mathbb{R}^d$.
We provide a short information-theoretic proof that the minimax regret is at most $O(d\sqrt{n}\log(\operatorname{diam}\mathcal{K}))$, where $n$ is the number of interactions and $\mathcal{K}$ the constraint set.
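The key structural point, that a ridge loss depends on the $d$-dimensional action only through one scalar projection, can be checked in a few lines (a toy construction of ours, with an arbitrary convex $g$):

```python
# Ridge structure check: f(x) = g(<x, theta>) depends on the d-dimensional
# action x only through the scalar u = <x, theta>, so moving orthogonally
# to theta never changes the loss -- the problem is effectively 1-d.
import numpy as np

rng = np.random.default_rng(4)
d = 20
theta = rng.normal(size=d)
g = lambda u: (u - 1.0) ** 2               # any convex g: R -> R
f = lambda x: g(x @ theta)                 # the ridge loss on R^d

x = rng.normal(size=d)
v = rng.normal(size=d)
v -= (v @ theta) / (theta @ theta) * theta # component orthogonal to theta
assert abs(f(x) - f(x + v)) < 1e-9         # orthogonal moves leave f unchanged
print("f depends on x only via <x, theta>: check passed")
```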
arXiv Detail & Related papers (2021-06-01T12:51:48Z)
- Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. the learner has access only to a zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
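As a reference point for the exponential-weights component mentioned above, here is a minimal exponential weights loop on a discretised one-dimensional decision set (our simplified sketch with full-information updates and a fixed loss; the paper's kernelised, bandit-feedback variant and its combination with randomised online gradient descent are substantially more involved).

```python
# Minimal exponential weights on a discretised 1-d decision set.
import numpy as np

rng = np.random.default_rng(5)
grid = np.linspace(-1.0, 1.0, 101)         # discretised 1-d action set
w = np.ones_like(grid)                     # uniform initial weights
eta, T = 0.5, 1000
u_star = 0.3                               # minimiser of the fixed loss below
losses = (grid - u_star) ** 2              # loss of every grid point

total = 0.0
for t in range(T):
    p = w / w.sum()
    a = rng.choice(grid, p=p)              # sampled play
    total += (a - u_star) ** 2             # incurred loss
    w *= np.exp(-eta * losses)             # full-information EW update
                                           # (the paper handles bandit feedback)
print("average loss vs best fixed action (0):", total / T)
```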
arXiv Detail & Related papers (2021-02-15T08:16:51Z)
- Optimal Strategies for Graph-Structured Bandits [0.0]
We study a structured variant of the multi-armed bandit problem specified by a set of Bernoulli distributions $\nu = (\nu_{a,b})_{a \in \mathcal{A}, b \in \mathcal{B}}$ with means $(\mu_{a,b})_{a \in \mathcal{A}, b \in \mathcal{B}}$.
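A small sketch of the stated model (our toy instantiation; the graph structure over $\mathcal{B}$ that the paper exploits is omitted):

```python
# Structured Bernoulli bandit model: a matrix of means indexed by arms
# a in A and classes b in B; pulling arm a in class b yields a
# Bernoulli(mu[a, b]) reward.
import numpy as np

rng = np.random.default_rng(6)
n_arms, n_classes = 4, 3
mu = rng.uniform(size=(n_arms, n_classes))  # unknown means (mu_{a,b})

def pull(a, b):
    """Draw a Bernoulli(mu[a, b]) reward for arm a in class b."""
    return int(rng.random() < mu[a, b])

print(pull(0, 1), pull(2, 2))
```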
arXiv Detail & Related papers (2020-07-07T06:27:51Z)
- On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles and achieves complexity bounds of $\tilde{O}\big((n^2+n\sqrt{L/\mu})\log(1/\epsilon)\big)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
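For reference, the classical SVRG template that this entry adapts looks as follows on a least-squares finite sum (a generic sketch of the standard method; the paper's oracle-compatible adaptation is not reproduced here).

```python
# Classical SVRG on the finite sum F(x) = (1/n) * sum_j 0.5*(a_j @ x - b_j)^2:
# each inner step corrects a stochastic gradient with a snapshot full gradient,
# reducing variance without the O(n) per-step cost of full gradient descent.
import numpy as np

rng = np.random.default_rng(7)
n, d, lr, epochs, inner = 100, 5, 0.01, 30, 100
a = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad(x, j):                            # gradient of f_j at x
    return (a[j] @ x - b[j]) * a[j]

x = np.zeros(d)
for _ in range(epochs):
    x_ref = x.copy()
    full = a.T @ (a @ x_ref - b) / n       # full gradient at the snapshot
    for _ in range(inner):
        j = rng.integers(n)
        x -= lr * (grad(x, j) - grad(x_ref, j) + full)   # variance-reduced step

print("final objective:", 0.5 * np.mean((a @ x - b) ** 2))
```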
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
- Some convergent results for Backtracking Gradient Descent method on Banach spaces [0.0]
Theorem. Let $X$ be a Banach space and $f: X \rightarrow \mathbb{R}$ be a $C^2$ function.
Let $\mathcal{C}$ be the set of critical points of $f$.
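As a finite-dimensional stand-in for the theorem's setting, here is standard backtracking (Armijo) gradient descent on a toy $C^2$ function in $\mathbb{R}^2$, whose iterates approach a critical point of $f$ (our sketch; the paper works on general Banach spaces).

```python
# Backtracking (Armijo) gradient descent: shrink the step size until the
# sufficient-decrease condition holds, then move along the negative gradient.
import numpy as np

def f(x):
    return (x[0] - 1) ** 2 + 10 * (x[1] + 0.5) ** 2

def grad_f(x):
    return np.array([2 * (x[0] - 1), 20 * (x[1] + 0.5)])

x = np.array([5.0, 5.0])
alpha0, beta, c = 1.0, 0.5, 1e-4           # initial step, shrink factor, slope
for _ in range(100):
    g = grad_f(x)
    step = alpha0
    # Armijo condition: f(x - step*g) <= f(x) - c * step * ||g||^2
    while f(x - step * g) > f(x) - c * step * (g @ g):
        step *= beta
    x -= step * g

print("approximate critical point:", x,
      "gradient norm:", np.linalg.norm(grad_f(x)))
```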
arXiv Detail & Related papers (2020-01-16T12:49:42Z)