Unconstrained Online Learning with Unbounded Losses
- URL: http://arxiv.org/abs/2306.04923v2
- Date: Fri, 14 Jul 2023 20:44:43 GMT
- Title: Unconstrained Online Learning with Unbounded Losses
- Authors: Andrew Jacobsen, Ashok Cutkosky
- Abstract summary: We develop a new setting for online learning with unbounded domains and non-Lipschitz losses.
We provide an algorithm which guarantees $R_{T}(u)\le \tilde O(G\|u\|\sqrt{T}+L\|u\|^{2}\sqrt{T})$ regret on any problem where the subgradients satisfy $\|g_{t}\|\le G+L\|w_{t}\|$.
- Score: 45.801991005821804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithms for online learning typically require one or more boundedness
assumptions: that the domain is bounded, that the losses are Lipschitz, or
both. In this paper, we develop a new setting for online learning with
unbounded domains and non-Lipschitz losses. For this setting we provide an
algorithm which guarantees $R_{T}(u)\le \tilde
O(G\|u\|\sqrt{T}+L\|u\|^{2}\sqrt{T})$ regret on any problem where the
subgradients satisfy $\|g_{t}\|\le G+L\|w_{t}\|$, and show that this bound is
unimprovable without further assumptions. We leverage this algorithm to develop
new saddle-point optimization algorithms that converge in duality gap in
unbounded domains, even in the absence of meaningful curvature. Finally, we
provide the first algorithm achieving non-trivial dynamic regret in an
unbounded domain for non-Lipschitz losses, as well as a matching lower bound.
The regret of our dynamic regret algorithm automatically improves to a novel
$L^{*}$ bound when the losses are smooth.
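For intuition about how unconstrained (parameter-free) methods are typically built, below is a minimal Python sketch of the standard magnitude-times-direction reduction with a Krichevsky-Trofimov coin-betting learner. It covers only the classical bounded-gradient case ($L=0$), i.e., the setting this paper generalizes; the class names and constants are illustrative assumptions, and this is not the paper's algorithm, which must additionally cope with $\|g_t\|\le G+L\|w_t\|$.

```python
import numpy as np

class KTBettor:
    """1-D parameter-free learner via Krichevsky-Trofimov coin betting.
    Guarantees regret ~ |u| * sqrt(T log(|u| T)) against any u in R,
    assuming scalar gradients in [-1, 1]."""
    def __init__(self, initial_wealth=1.0):
        self.wealth = initial_wealth
        self.grad_sum = 0.0  # sum of past scalar gradients
        self.t = 0

    def play(self):
        # Bet a signed fraction of current wealth (the KT estimator).
        return -self.grad_sum / (self.t + 1) * self.wealth

    def update(self, g):
        self.wealth -= g * self.play()  # same bet that play() returned
        self.grad_sum += g
        self.t += 1

class UnitBallOGD:
    """Adaptive online gradient descent projected onto the unit ball;
    learns the direction of the comparator."""
    def __init__(self, dim):
        self.z = np.zeros(dim)
        self.sq_sum = 1e-8  # running sum of squared gradient norms

    def play(self):
        return self.z

    def update(self, g):
        self.sq_sum += float(np.dot(g, g))
        z = self.z - g / np.sqrt(self.sq_sum)
        self.z = z / max(1.0, float(np.linalg.norm(z)))  # project to unit ball

class UnconstrainedOCO:
    """Magnitude-times-direction reduction (Cutkosky-Orabona style):
    w_t = y_t * z_t with a parameter-free 1-D learner for y_t and
    unit-ball OGD for z_t. Assumes ||g_t|| <= G, i.e. the L = 0 case;
    the paper's algorithm also handles ||g_t|| <= G + L * ||w_t||."""
    def __init__(self, dim, G=1.0):
        self.G = G
        self.mag = KTBettor()
        self.dir = UnitBallOGD(dim)

    def play(self):
        return self.mag.play() * self.dir.play()

    def update(self, g):
        g = np.asarray(g, dtype=float) / self.G  # ensures |<g, z>| <= 1
        self.mag.update(float(np.dot(g, self.dir.play())))
        self.dir.update(g)

# Toy run on random linear losses:
rng = np.random.default_rng(0)
learner = UnconstrainedOCO(dim=3)
for _ in range(1000):
    w = learner.play()
    g = rng.normal(size=3)
    g /= max(1.0, float(np.linalg.norm(g)))  # keep ||g|| <= G = 1
    learner.update(g)
```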
Related papers
- Unconstrained Robust Online Convex Optimization [38.9652176513385]
We focus on the difficult "unconstrained" setting in which our algorithm must maintain low regret.
Our algorithms guarantee regret $\|u\|G(\sqrt{T}+k)$ when $G\ge\max_t\|g_t\|$ is known, where $k$ is a measure of the total amount of corruption.
arXiv Detail & Related papers (2025-06-15T09:21:15Z)
- An Equivalence Between Static and Dynamic Regret Minimization [10.812831455376218]
We show that for linear losses, dynamic regret minimization is equivalent to static regret minimization in an extended decision space.
We provide an algorithm guaranteeing dynamic regret of the form $R_T(u_1,\dots,u_T)\le\tilde{O}(\cdots)$.
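For linear losses, the equivalence is essentially a change of coordinates; the display below is our one-line gloss (notation ours, not the paper's): stacking the comparators into a single vector turns dynamic regret into static regret on the product space.

```latex
% Dynamic regret under linear losses \ell_t(w) = <g_t, w>,
% rewritten as static regret in the extended space W^T:
\begin{aligned}
R_T(u_1,\dots,u_T)
  &= \sum_{t=1}^{T} \langle g_t,\, w_t - u_t \rangle
   = \big\langle \mathbf{g},\, \mathbf{w} - \mathbf{u} \big\rangle, \\
\mathbf{w} &= (w_1,\dots,w_T), \quad
\mathbf{u} = (u_1,\dots,u_T), \quad
\mathbf{g} = (g_1,\dots,g_T),
\end{aligned}
```

so any static-regret guarantee on $W^T$ against the fixed comparator $\mathbf{u}$ transfers directly to a dynamic-regret guarantee on $W$.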
arXiv Detail & Related papers (2024-06-03T17:54:58Z)
- Second Order Methods for Bandit Optimization and Control [34.51425758864638]
We show that our algorithm achieves optimal (in terms of the horizon $T$) regret bounds for a large class of convex functions that we call $\kappa$-convex.
We also investigate the adaptation of our second-order bandit algorithm to online convex optimization with memory.
arXiv Detail & Related papers (2024-02-14T04:03:38Z)
- Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards [40.99322897009357]
We propose two novel algorithms based on truncation and mean of medians.
Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches.
Our algorithms improve the regret bounds by a logarithmic factor compared to existing algorithms when $\epsilon=1$.
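For a feel of the truncation idea, here is a generic truncated-mean estimator for heavy-tailed samples with bounded $(1+\epsilon)$-th moment, in the style of the classical construction such bandit algorithms build on; it is an illustration, not the paper's exact estimator, and the threshold constants are assumptions.

```python
import numpy as np

def truncated_mean(samples, eps, moment_bound=1.0, delta=0.05):
    """Truncated empirical mean for heavy-tailed data.

    Assumes E|X|^(1+eps) <= moment_bound with eps > 0. Sample i is kept
    only if |x_i| <= (moment_bound * i / log(1/delta)) ** (1/(1+eps)),
    the standard index-dependent truncation level.
    """
    samples = np.asarray(samples, dtype=float)
    idx = np.arange(1, len(samples) + 1)
    thresh = (moment_bound * idx / np.log(1.0 / delta)) ** (1.0 / (1.0 + eps))
    kept = np.where(np.abs(samples) <= thresh, samples, 0.0)
    return kept.mean()

# Toy usage on heavy-tailed (Pareto) noise around a mean of 2.0:
rng = np.random.default_rng(0)
x = 2.0 + rng.pareto(1.5, size=10_000) - 3.0  # heavy right tail, mean ~2
print(truncated_mean(x, eps=0.4))
```

Because each sample's truncation level depends only on its index, the clipped running sum can be maintained online with O(1) work per sample, which is the online-learning property the summary highlights.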
arXiv Detail & Related papers (2023-10-28T13:01:10Z)
- Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime [74.52487417350221]
We consider online learning problems in the realizable setting, where there is a zero-loss solution.
We propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds.
arXiv Detail & Related papers (2023-02-27T21:19:24Z)
- Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Markov Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $\tilde{\mathcal{O}}(K^{2/3})$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z)
- Parameter-free Regret in High Probability with Heavy Tails [45.11667443216861]
We present new algorithms for online convex optimization over unbounded domains.
We obtain parameter-free regret guarantees that hold with high probability, given access only to potentially heavy-tailed subgradient estimates.
arXiv Detail & Related papers (2022-10-25T21:43:44Z)
- Online Learning with Bounded Recall [11.046741824529107]
We study the problem of full-information online learning in the "bounded recall" setting popular in the study of repeated games.
An online learning algorithm $\mathcal{A}$ is $M$-bounded-recall if its output at time $t$ can be written as a function of the $M$ previous rewards.
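To make the definition concrete, here is a tiny sliding-window Hedge learner whose output at time $t$ depends only on the last $M$ reward vectors, so it is $M$-bounded-recall by construction. This is an illustrative instance of the definition with hypothetical class and parameter names, not an algorithm from the paper.

```python
import numpy as np
from collections import deque

class BoundedRecallHedge:
    """M-bounded-recall learner over n_actions arms: the played
    distribution is a function of only the last M reward vectors
    (sliding-window exponential weights)."""
    def __init__(self, n_actions, M, lr=0.5):
        self.window = deque(maxlen=M)  # stores only the last M reward vectors
        self.n = n_actions
        self.lr = lr

    def play(self):
        # Weights depend solely on rewards inside the window.
        total = np.sum(self.window, axis=0) if self.window else np.zeros(self.n)
        w = np.exp(self.lr * (total - total.max()))  # max-shift for stability
        return w / w.sum()

    def observe(self, rewards):
        self.window.append(np.asarray(rewards, dtype=float))

# Toy usage: M = 10 bounded recall over 3 actions.
learner = BoundedRecallHedge(n_actions=3, M=10)
rng = np.random.default_rng(0)
for _ in range(100):
    p = learner.play()
    learner.observe(rng.uniform(size=3))
```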
arXiv Detail & Related papers (2022-05-28T20:52:52Z)
- Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n}+\sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n}+(\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where the cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
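As background for the randomized-gradient-descent ingredient, here is a minimal sketch of the classical one-point (Flaxman-Kalai-McMahan style) gradient estimate used in bandit convex optimization; it illustrates only that building block under assumed constants, not the paper's combined algorithm.

```python
import numpy as np

def fkm_bandit_ogd(loss_oracle, dim, T, radius=1.0, delta=0.1, lr=0.01):
    """One-point bandit gradient descent on the radius-ball.

    loss_oracle(x) returns only the scalar loss at x (zeroth-order
    feedback); the gradient of a smoothed loss is estimated from a
    single query at a randomly perturbed point.
    """
    rng = np.random.default_rng(0)
    w = np.zeros(dim)
    total_loss = 0.0
    for _ in range(T):
        u = rng.normal(size=dim)
        u /= np.linalg.norm(u)            # uniform direction on unit sphere
        x = w + delta * u                 # play a perturbed point
        loss = float(loss_oracle(x))
        total_loss += loss
        g_hat = (dim / delta) * loss * u  # one-point gradient estimate
        w = w - lr * g_hat
        norm = np.linalg.norm(w)
        if norm > radius - delta:         # stay inside the shrunken ball
            w *= (radius - delta) / norm
    return w, total_loss

# Toy usage: minimize a quadratic from bandit feedback only.
w_hat, _ = fkm_bandit_ogd(lambda x: np.sum((x - 0.3) ** 2), dim=3, T=5000)
```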
arXiv Detail & Related papers (2021-02-15T08:16:51Z)
- Adaptive Online Learning with Varying Norms [45.11667443216861]
We provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$.
We apply this result to obtain new "full-matrix"-style regret bounds.
arXiv Detail & Related papers (2020-02-10T17:22:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.