Unconstrained Online Learning with Unbounded Losses
- URL: http://arxiv.org/abs/2306.04923v2
- Date: Fri, 14 Jul 2023 20:44:43 GMT
- Title: Unconstrained Online Learning with Unbounded Losses
- Authors: Andrew Jacobsen, Ashok Cutkosky
- Abstract summary: We develop a new setting for online learning with unbounded domains and non-Lipschitz losses.
We provide an algorithm which guarantees $R_{T}(u)\le \tilde O(G\|u\|\sqrt{T}+L\|u\|^{2}\sqrt{T})$ regret on any problem where the subgradients satisfy $\|g_{t}\|\le G+L\|w_{t}\|$.
- Score: 45.801991005821804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithms for online learning typically require one or more boundedness
assumptions: that the domain is bounded, that the losses are Lipschitz, or
both. In this paper, we develop a new setting for online learning with
unbounded domains and non-Lipschitz losses. For this setting we provide an
algorithm which guarantees $R_{T}(u)\le \tilde
O(G\|u\|\sqrt{T}+L\|u\|^{2}\sqrt{T})$ regret on any problem where the
subgradients satisfy $\|g_{t}\|\le G+L\|w_{t}\|$, and show that this bound is
unimprovable without further assumptions. We leverage this algorithm to develop
new saddle-point optimization algorithms that converge in duality gap in
unbounded domains, even in the absence of meaningful curvature. Finally, we
provide the first algorithm achieving non-trivial dynamic regret in an
unbounded domain for non-Lipschitz losses, as well as a matching lower bound.
The regret of our dynamic regret algorithm automatically improves to a novel
$L^{*}$ bound when the losses are smooth.
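For intuition about how unconstrained (parameter-free) methods are typically built, below is a minimal Python sketch of the standard magnitude-times-direction reduction with a Krichevsky-Trofimov coin-betting learner. It covers only the classical bounded-gradient case ($L=0$), i.e., the setting this paper generalizes; the class names and constants are illustrative assumptions, and this is not the paper's algorithm, which must additionally cope with $\|g_t\|\le G+L\|w_t\|$.

```python
import numpy as np

class KTBettor:
    """1-D parameter-free learner via Krichevsky-Trofimov coin betting.
    Guarantees regret ~ |u| * sqrt(T log(|u| T)) against any u in R,
    assuming scalar gradients in [-1, 1]."""
    def __init__(self, initial_wealth=1.0):
        self.wealth = initial_wealth
        self.grad_sum = 0.0  # sum of past scalar gradients
        self.t = 0

    def play(self):
        # Bet a signed fraction of current wealth (the KT estimator).
        return -self.grad_sum / (self.t + 1) * self.wealth

    def update(self, g):
        self.wealth -= g * self.play()  # same bet that play() returned
        self.grad_sum += g
        self.t += 1

class UnitBallOGD:
    """Adaptive online gradient descent projected onto the unit ball;
    learns the direction of the comparator."""
    def __init__(self, dim):
        self.z = np.zeros(dim)
        self.sq_sum = 1e-8  # running sum of squared gradient norms

    def play(self):
        return self.z

    def update(self, g):
        self.sq_sum += float(np.dot(g, g))
        z = self.z - g / np.sqrt(self.sq_sum)
        self.z = z / max(1.0, float(np.linalg.norm(z)))  # project to unit ball

class UnconstrainedOCO:
    """Magnitude-times-direction reduction (Cutkosky-Orabona style):
    w_t = y_t * z_t with a parameter-free 1-D learner for y_t and
    unit-ball OGD for z_t. Assumes ||g_t|| <= G, i.e. the L = 0 case;
    the paper's algorithm also handles ||g_t|| <= G + L * ||w_t||."""
    def __init__(self, dim, G=1.0):
        self.G = G
        self.mag = KTBettor()
        self.dir = UnitBallOGD(dim)

    def play(self):
        return self.mag.play() * self.dir.play()

    def update(self, g):
        g = np.asarray(g, dtype=float) / self.G  # ensures |<g, z>| <= 1
        self.mag.update(float(np.dot(g, self.dir.play())))
        self.dir.update(g)

# Toy run on random linear losses:
rng = np.random.default_rng(0)
learner = UnconstrainedOCO(dim=3)
for _ in range(1000):
    w = learner.play()
    g = rng.normal(size=3)
    g /= max(1.0, float(np.linalg.norm(g)))  # keep ||g|| <= G = 1
    learner.update(g)
```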
Related papers
- Unconstrained Robust Online Convex Optimization [38.9652176513385]
We focus on the difficult "unconstrained" setting in which our algorithm must maintain low regret.
Our algorithms guarantee regret $\|u\|G(\sqrt{T}+k)$ when $G\ge\max_t\|g_t\|$ is known, where $k$ is a measure of the total amount of corruption.
arXiv Detail & Related papers (2025-06-15T09:21:15Z)
- An Equivalence Between Static and Dynamic Regret Minimization [10.812831455376218]
We show that for linear losses, dynamic regret minimization is equivalent to static regret minimization in an extended decision space.
We provide an algorithm guaranteeing dynamic regret of the form $R_T(u_1,\dots,u_T)\le\tilde{O}(\cdots)$.
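For linear losses, the equivalence is essentially a change of coordinates; the display below is our one-line gloss (notation ours, not the paper's): stacking the comparators into a single vector turns dynamic regret into static regret on the product space.

```latex
% Dynamic regret under linear losses \ell_t(w) = <g_t, w>,
% rewritten as static regret in the extended space W^T:
\begin{aligned}
R_T(u_1,\dots,u_T)
  &= \sum_{t=1}^{T} \langle g_t,\, w_t - u_t \rangle
   = \big\langle \mathbf{g},\, \mathbf{w} - \mathbf{u} \big\rangle, \\
\mathbf{w} &= (w_1,\dots,w_T), \quad
\mathbf{u} = (u_1,\dots,u_T), \quad
\mathbf{g} = (g_1,\dots,g_T),
\end{aligned}
```

so any static-regret guarantee on $W^T$ against the fixed comparator $\mathbf{u}$ transfers directly to a dynamic-regret guarantee on $W$.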
arXiv Detail & Related papers (2024-06-03T17:54:58Z)
- Second Order Methods for Bandit Optimization and Control [34.51425758864638]
We show that our algorithm achieves optimal (in terms of the horizon $T$) regret bounds for a large class of convex functions that we call $\kappa$-convex.
We also investigate the adaptation of our second-order bandit algorithm to online convex optimization with memory.
arXiv Detail & Related papers (2024-02-14T04:03:38Z)
- Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards [40.99322897009357]
We propose two novel algorithms based on truncation and mean of medians.
Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches.
Our algorithms improve the regret bounds by a logarithmic factor compared to existing algorithms when $\epsilon=1$.
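For a feel of the truncation idea, here is a generic truncated-mean estimator for heavy-tailed samples with bounded $(1+\epsilon)$-th moment, in the style of the classical construction such bandit algorithms build on; it is an illustration, not the paper's exact estimator, and the threshold constants are assumptions.

```python
import numpy as np

def truncated_mean(samples, eps, moment_bound=1.0, delta=0.05):
    """Truncated empirical mean for heavy-tailed data.

    Assumes E|X|^(1+eps) <= moment_bound with eps > 0. Sample i is kept
    only if |x_i| <= (moment_bound * i / log(1/delta)) ** (1/(1+eps)),
    the standard index-dependent truncation level.
    """
    samples = np.asarray(samples, dtype=float)
    idx = np.arange(1, len(samples) + 1)
    thresh = (moment_bound * idx / np.log(1.0 / delta)) ** (1.0 / (1.0 + eps))
    kept = np.where(np.abs(samples) <= thresh, samples, 0.0)
    return kept.mean()

# Toy usage on heavy-tailed (Pareto) noise around a mean of 2.0:
rng = np.random.default_rng(0)
x = 2.0 + rng.pareto(1.5, size=10_000) - 3.0  # heavy right tail, mean ~2
print(truncated_mean(x, eps=0.4))
```

Because each sample's truncation level depends only on its index, the clipped running sum can be maintained online with O(1) work per sample, which is the online-learning property the summary highlights.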
arXiv Detail & Related papers (2023-10-28T13:01:10Z)
- Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime [74.52487417350221]
We consider online learning problems in the realizable setting, where there is a zero-loss solution.
We propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds.
arXiv Detail & Related papers (2023-02-27T21:19:24Z)
- Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Markov Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $\tilde{\mathcal{O}}(K^{2/3})$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z)
- Parameter-free Regret in High Probability with Heavy Tails [45.11667443216861]
We present new algorithms for online convex optimization over unbounded domains.
We obtain parameter-free regret guarantees that hold with high probability, given access only to potentially heavy-tailed subgradient estimates.
arXiv Detail & Related papers (2022-10-25T21:43:44Z)
- Online Learning with Bounded Recall [11.046741824529107]
We study the problem of full-information online learning in the "bounded recall" setting popular in the study of repeated games.
An online learning algorithm $\mathcal{A}$ is $M$-bounded-recall if its output at time $t$ can be written as a function of the $M$ previous rewards.
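To make the definition concrete, here is a tiny sliding-window Hedge learner whose output at time $t$ depends only on the last $M$ reward vectors, so it is $M$-bounded-recall by construction. This is an illustrative instance of the definition with hypothetical class and parameter names, not an algorithm from the paper.

```python
import numpy as np
from collections import deque

class BoundedRecallHedge:
    """M-bounded-recall learner over n_actions arms: the played
    distribution is a function of only the last M reward vectors
    (sliding-window exponential weights)."""
    def __init__(self, n_actions, M, lr=0.5):
        self.window = deque(maxlen=M)  # stores only the last M reward vectors
        self.n = n_actions
        self.lr = lr

    def play(self):
        # Weights depend solely on rewards inside the window.
        total = np.sum(self.window, axis=0) if self.window else np.zeros(self.n)
        w = np.exp(self.lr * (total - total.max()))  # max-shift for stability
        return w / w.sum()

    def observe(self, rewards):
        self.window.append(np.asarray(rewards, dtype=float))

# Toy usage: M = 10 bounded recall over 3 actions.
learner = BoundedRecallHedge(n_actions=3, M=10)
rng = np.random.default_rng(0)
for _ in range(100):
    p = learner.play()
    learner.observe(rng.uniform(size=3))
```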
arXiv Detail & Related papers (2022-05-28T20:52:52Z)
- Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n}+\sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n}+(\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where the cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
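As background for the randomized-gradient-descent ingredient, here is a minimal sketch of the classical one-point (Flaxman-Kalai-McMahan style) gradient estimate used in bandit convex optimization; it illustrates only that building block under assumed constants, not the paper's combined algorithm.

```python
import numpy as np

def fkm_bandit_ogd(loss_oracle, dim, T, radius=1.0, delta=0.1, lr=0.01):
    """One-point bandit gradient descent on the radius-ball.

    loss_oracle(x) returns only the scalar loss at x (zeroth-order
    feedback); the gradient of a smoothed loss is estimated from a
    single query at a randomly perturbed point.
    """
    rng = np.random.default_rng(0)
    w = np.zeros(dim)
    total_loss = 0.0
    for _ in range(T):
        u = rng.normal(size=dim)
        u /= np.linalg.norm(u)            # uniform direction on unit sphere
        x = w + delta * u                 # play a perturbed point
        loss = float(loss_oracle(x))
        total_loss += loss
        g_hat = (dim / delta) * loss * u  # one-point gradient estimate
        w = w - lr * g_hat
        norm = np.linalg.norm(w)
        if norm > radius - delta:         # stay inside the shrunken ball
            w *= (radius - delta) / norm
    return w, total_loss

# Toy usage: minimize a quadratic from bandit feedback only.
w_hat, _ = fkm_bandit_ogd(lambda x: np.sum((x - 0.3) ** 2), dim=3, T=5000)
```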
arXiv Detail & Related papers (2021-02-15T08:16:51Z)
- Adaptive Online Learning with Varying Norms [45.11667443216861]
We provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$.
We apply this result to obtain new "full-matrix"-style regret bounds.
arXiv Detail & Related papers (2020-02-10T17:22:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.