Parameter-free Regret in High Probability with Heavy Tails
- URL: http://arxiv.org/abs/2210.14355v1
- Date: Tue, 25 Oct 2022 21:43:44 GMT
- Title: Parameter-free Regret in High Probability with Heavy Tails
- Authors: Jiujia Zhang, Ashok Cutkosky
- Abstract summary: We present new algorithms for online convex optimization over unbounded domains.
We obtain parameter-free regret in high probability given access only to potentially heavy-tailed subgradient estimates.
- Score: 45.11667443216861
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present new algorithms for online convex optimization over unbounded
domains that obtain parameter-free regret in high probability given access only
to potentially heavy-tailed subgradient estimates. Previous work in unbounded
domains considers only in-expectation results for sub-exponential subgradients.
Unlike in the bounded domain case, we cannot rely on straightforward
martingale concentration due to the exponentially large iterates produced by the
algorithm. We develop new regularization techniques to overcome these problems.
Overall, with probability at least $1 - \delta$, for all comparators $\mathbf{u}$
our algorithm achieves regret $\tilde{O}(\| \mathbf{u} \| T^{1/\mathfrak{p}}
\log (1/\delta))$ for subgradients with bounded $\mathfrak{p}^{th}$ moments for
some $\mathfrak{p} \in (1, 2]$.
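To contrast with the heavy-tailed setting the abstract describes, a standard (non-parameter-free) baseline is online gradient descent with gradient clipping, which truncates large subgradient estimates before each step. The sketch below is purely illustrative and is not the paper's algorithm (which relies on new regularization techniques and needs no tuned clip threshold or learning rate); the function name and all parameter values are assumptions.

```python
import numpy as np

def clipped_ogd(grad_estimates, lr=0.1, clip=1.0):
    """Online gradient descent with norm clipping: a common baseline
    when subgradient estimates may be heavy-tailed. NOT the paper's
    parameter-free algorithm; lr and clip are illustrative choices."""
    w = np.zeros_like(grad_estimates[0], dtype=float)
    iterates = []
    for g in grad_estimates:
        norm = np.linalg.norm(g)
        if norm > clip:                # truncate a potentially heavy-tailed estimate
            g = g * (clip / norm)
        w = w - lr * g                 # standard OGD step on the clipped gradient
        iterates.append(w.copy())
    return iterates
```

Clipping makes each step bounded, so standard martingale concentration applies; the paper's contribution is achieving comparable high-probability guarantees without a bounded domain or tuned parameters.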
Related papers
- High Probability Guarantees for Random Reshuffling [5.663909018247509]
We consider the gradient method with random reshuffling ($\mathsf{RR}$) for tackling nonconvex optimization problems.
In this work, we first investigate the sample complexity of $\mathsf{RR}$'s sampling procedure.
Then, we design a random reshuffling method ($\mathsf{p}$-$\mathsf{RR}$) that involves an additional randomized perturbation procedure for finding stationary points.
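The sampling procedure that $\mathsf{RR}$-style analyses study is without-replacement sampling: each epoch visits every sample exactly once in a fresh random order. The sketch below illustrates only this generic sampling scheme, not the paper's perturbed method; the function name and seed handling are assumptions.

```python
import random

def random_reshuffling(n_samples, n_epochs, seed=0):
    """Yield sample indices epoch by epoch; each epoch is a fresh,
    uniformly random permutation of all samples (no replacement)."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    for _ in range(n_epochs):
        rng.shuffle(order)             # fresh permutation per epoch
        for i in order:
            yield i
```

Compared with with-replacement SGD, every sample is seen exactly once per epoch, which is what enables the tighter concentration arguments such papers develop.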
arXiv Detail & Related papers (2023-11-20T15:17:20Z) - Private Isotonic Regression [54.32252900997422]
We consider the problem of isotonic regression over a partially ordered set (poset) $\mathcal{X}$ and for any Lipschitz loss function.
We obtain a pure-DP algorithm that has an expected excess empirical risk of roughly $\mathrm{width}(\mathcal{X}) \cdot \log|\mathcal{X}| / n$, where $\mathrm{width}(\mathcal{X})$ is the width of the poset.
We show that the bounds above are essentially the best that can be achieved.
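For intuition about the non-private problem: on a chain (totally ordered) poset with squared loss, isotonic regression has a classic exact solution via Pool Adjacent Violators. The sketch below covers only that special case, not the paper's pure-DP algorithm for general posets and Lipschitz losses.

```python
def isotonic_fit(y):
    """Pool Adjacent Violators: least-squares isotonic (non-decreasing)
    fit on a totally ordered sequence. Each block stores [sum, count];
    adjacent blocks are merged while their means violate monotonicity."""
    blocks = []
    for v in y:
        blocks.append([v, 1])
        # merge while mean(prev block) > mean(last block), comparing
        # means via cross-multiplication to avoid division
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)        # each block is fit by its mean
    return out
```

The private version must additionally inject noise calibrated to the sensitivity of these block means, which is where the poset width enters the risk bound.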
arXiv Detail & Related papers (2022-10-27T05:08:07Z) - Frank Wolfe Meets Metric Entropy [0.0]
We provide a technique for establishing domain specific and easy-to-estimate lower bounds for Frank-Wolfe.
We show that a dimension-free linear upper bound must fail not only in the worst case, but in the average case.
We also establish this phenomenon for the nuclear norm ball.
arXiv Detail & Related papers (2022-05-17T21:23:36Z) - Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits [45.43968161616453]
We study the optimal batch-regret tradeoff for batch linear contextual bandits.
We prove its regret guarantee, which features a two-phase expression as the time horizon $T$ grows.
We also prove a new matrix inequality concentration with dependence on their dynamic upper bounds.
arXiv Detail & Related papers (2021-10-15T12:32:33Z) - Gaussian Process Bandit Optimization with Few Batches [49.896920704012395]
We introduce a batch algorithm inspired by finite-arm bandit algorithms.
We show that it achieves the cumulative regret upper bound $O^{\ast}(\sqrt{T \gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$.
In addition, we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches.
arXiv Detail & Related papers (2021-10-15T00:54:04Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound with order $\mathcal{O}(N^{-\sqrt{N}})$.
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - On the Last Iterate Convergence of Momentum Methods [32.60494006038902]
We prove for the first time that for any constant momentum factor, there exists a Lipschitz and convex function for which the last iterate suffers from an error.
We show that in the setting with convex and smooth functions, our new SGDM algorithm automatically converges at a rate of $O(\frac{\log T}{T})$.
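The SGDM update whose last iterate such results analyze can be sketched under one common momentum convention: gradients accumulate into a momentum buffer, and the step is taken through that buffer. The exact algorithm and step-size schedule in the paper may differ; all parameter values below are illustrative assumptions.

```python
def sgdm(grad, w0, lr=0.1, beta=0.9, steps=200):
    """Heavy-ball SGD with momentum (one common convention).
    Returns the LAST iterate, the quantity whose convergence is at issue,
    rather than an average of iterates."""
    w, m = w0, 0.0
    for _ in range(steps):
        m = beta * m + grad(w)   # accumulate gradients into the momentum buffer
        w = w - lr * m           # step through the buffer
    return w
```

On the toy quadratic $f(w) = w^2/2$ (so the gradient is $w$ itself), the last iterate contracts toward the minimizer at $0$.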
arXiv Detail & Related papers (2021-02-13T21:16:16Z) - On Convergence of Gradient Expected Sarsa($\lambda$) [25.983112169543375]
We show that applying the off-line estimate (multi-step bootstrapping) to $\mathtt{Expected Sarsa}(\lambda)$ is unstable for off-policy learning.
We propose a convergent $\mathtt{Gradient Expected Sarsa}(\lambda)$ ($\mathtt{GES}(\lambda)$) algorithm.
arXiv Detail & Related papers (2020-12-14T01:27:24Z) - Thresholded Lasso Bandit [70.17389393497125]
Thresholded Lasso bandit is an algorithm that estimates the vector defining the reward function as well as its sparse support.
We establish non-asymptotic regret upper bounds scaling as $\mathcal{O}(\log d + \sqrt{T})$ in general, and as $\mathcal{O}(\log d + \log T)$ under the so-called margin condition.
arXiv Detail & Related papers (2020-10-22T19:14:37Z)
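The estimate-then-threshold idea behind this last entry can be sketched in its simplest form. Assuming a roughly orthogonal design, the Lasso step reduces to soft-thresholding a correlation-based estimate, after which a hard threshold recovers the sparse support. The function names, regularization level, and threshold below are illustrative assumptions, not the paper's choices.

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form Lasso solution under an orthogonal design:
    shrink each coordinate toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def estimate_support(X, y, lam, tau):
    """Sketch of estimate-then-threshold support recovery:
    soft-threshold a correlation-based estimate (the Lasso step),
    then keep only coordinates whose magnitude clears tau."""
    z = X.T @ y / len(y)                     # correlation-based estimate
    beta = soft_threshold(z, lam)            # Lasso shrinkage
    support = np.flatnonzero(np.abs(beta) > tau)  # hard threshold
    return beta, support
```

In the bandit setting, restricting subsequent estimation to the recovered support is what turns the dimension dependence into the $\log d$ term in the regret bound.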
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.