Related papers: An Equivalence Between Static and Dynamic Regret Minimization

URL: http://arxiv.org/abs/2406.01577v2
Date: Fri, 01 Nov 2024 21:56:21 GMT
Title: An Equivalence Between Static and Dynamic Regret Minimization
Authors: Andrew Jacobsen, Francesco Orabona
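Abstract summary: We show that for linear losses, dynamic regret minimization is equivalent to static regret minimization in an extended decision space. We provide an algorithm guaranteeing dynamic regret of the form $R_{T}(u_{1},\dots,u_{T})\le \tilde O(\sqrt{T\sum_{i}\|\bar u_{i}-\bar u_{i+1}\|^{2}})$, where $\bar u_{1}, \dots, \bar u_{T}$ is a locally-smoothed comparator sequence.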
Score: 10.812831455376218
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the problem of dynamic regret minimization in online convex optimization, in which the objective is to minimize the difference between the cumulative loss of an algorithm and that of an arbitrary sequence of comparators. While the literature on this topic is very rich, a unifying framework for the analysis and design of these algorithms is still missing. In this paper we show that for linear losses, dynamic regret minimization is equivalent to static regret minimization in an extended decision space. Using this simple observation, we show that there is a frontier of lower bounds trading off penalties due to the variance of the losses and penalties due to variability of the comparator sequence, and provide a framework for achieving any of the guarantees along this frontier. As a result, we also prove for the first time that adapting to the squared path-length of an arbitrary sequence of comparators to achieve regret $R_{T}(u_{1},\dots,u_{T})\le O(\sqrt{T\sum_{t} \|u_{t}-u_{t+1}\|^{2}})$ is impossible. However, using our framework we introduce an alternative notion of variability based on a locally-smoothed comparator sequence $\bar u_{1}, \dots, \bar u_{T}$, and provide an algorithm guaranteeing dynamic regret of the form $R_{T}(u_{1},\dots,u_{T})\le \tilde O(\sqrt{T\sum_{i}\|\bar u_{i}-\bar u_{i+1}\|^{2}})$, while still matching in the worst case the usual path-length dependencies up to polylogarithmic terms.
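
The equivalence stated in the abstract can be checked numerically on a toy instance. The sketch below is an illustration under stated assumptions, not the authors' construction verbatim: it assumes the extended decision space is $\mathbb{R}^{dT}$, that the comparator sequence is stacked into one fixed vector, and that the round-$t$ gradient is embedded in block $t$ with zeros elsewhere. The helper names (`embed_gradient`, `static_regret_extended`, and so on) are hypothetical.

```python
import random


def inner(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def flatten(seq):
    """Stack a sequence of d-dimensional vectors into one vector in R^{dT}."""
    return [x for v in seq for x in v]


def dynamic_regret(grads, plays, comparators):
    """Dynamic regret for linear losses: sum_t <g_t, w_t - u_t>."""
    return sum(
        inner(g, [wi - ui for wi, ui in zip(w, u)])
        for g, w, u in zip(grads, plays, comparators)
    )


def embed_gradient(g, t, T):
    """Place g in block t of a zero vector in R^{dT} (assumed embedding)."""
    d = len(g)
    out = [0.0] * (d * T)
    out[t * d:(t + 1) * d] = g
    return out


def static_regret_extended(grads, plays, comparators):
    """Static regret in R^{dT} against the single stacked comparator."""
    T = len(grads)
    u_ext = flatten(comparators)  # one fixed comparator in the extended space
    # Only block t of the extended play affects the round-t loss, so the
    # stacked plays serve as the extended play in every round.
    w_ext = flatten(plays)
    diff = [w - u for w, u in zip(w_ext, u_ext)]
    return sum(inner(embed_gradient(g, t, T), diff) for t, g in enumerate(grads))


# Toy check on random data (T rounds, dimension d).
random.seed(0)
T, d = 5, 3
rand_seq = lambda: [[random.uniform(-1, 1) for _ in range(d)] for _ in range(T)]
grads, plays, comparators = rand_seq(), rand_seq(), rand_seq()

dyn = dynamic_regret(grads, plays, comparators)
sta = static_regret_extended(grads, plays, comparators)
assert abs(dyn - sta) < 1e-9
print("dynamic regret:", dyn, "| static regret in extended space:", sta)
```

The two quantities agree because each embedded gradient is zero outside its own block, so the inner product in $\mathbb{R}^{dT}$ picks out exactly $\langle g_t, w_t - u_t\rangle$ in round $t$.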

Related papers

Dynamic Regret Reduces to Kernelized Static Regret [63.36965242404415]
We study dynamic regret in online convex optimization, where the objective is to achieve low cumulative loss relative to an arbitrary benchmark sequence. By constructing a suitable function space in the form of a Reproducing Kernel Hilbert Space (RKHS), our reduction enables us to recover the optimal $R_T(u_1,\ldots,u_T) = \mathcal{O}(\sqrt{\sum_t\|u_t-u_{t-1}\|T})$ dynamic regret guarantee.
arXiv Detail & Related papers (2025-07-07T21:09:33Z)
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel. An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost. Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
Efficient Frameworks for Generalized Low-Rank Matrix Bandit Problems [61.85150061213987]
We study the generalized low-rank matrix bandit problem, proposed in \cite{lu2021low} under the Generalized Linear Model (GLM) framework. To overcome the computational infeasibility and theoretical restraints of existing algorithms, we first propose the G-ESTT framework. We show that G-ESTT can achieve the $\tilde{O}(\sqrt{(d_1+d_2)^{3/2}Mr^{3/2}T})$ bound of regret while G-ESTS can achieve the $\tilde{O}$…
arXiv Detail & Related papers (2024-01-14T14:14:19Z)
Non-stationary Online Convex Optimization with Arbitrary Delays [50.46856739179311]
This paper investigates delayed online convex optimization (OCO) in non-stationary environments. We first propose a simple algorithm, namely DOGD, which performs a gradient descent step for each delayed gradient according to their arrival order. We develop an improved algorithm, which reduces those dynamic regret bounds achieved by DOGD to $O(\sqrt{\bar{d}T(P_T+1)})$.
arXiv Detail & Related papers (2023-05-20T07:54:07Z)
Unconstrained Dynamic Regret via Sparse Coding [46.85145189210752]
Motivated by the challenge of nonstationarity in sequential decision making, we study Online Convex Optimization (OCO) under the coupling of two problem structures. This paper achieves a new type of these adaptive regret bounds via a sparse coding framework.
arXiv Detail & Related papers (2023-01-31T00:52:14Z)
Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities. Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z)
Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits [27.279654173896372]
We study the problem of dynamic regret minimization in $K$-armed Dueling Bandits under non-stationary or time-varying preferences. This is an online learning setup where the agent chooses a pair of items at each round and observes only a relative binary "win-loss" feedback for this pair.
arXiv Detail & Related papers (2021-11-06T16:46:55Z)
Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints [24.97580261894342]
This paper considers online convex optimization with long term constraints, where constraints can be violated in intermediate rounds, but need to be satisfied in the long run. A novel algorithm is first proposed and it achieves an $\mathcal{O}(T^{\max\{c,1-c\}})$ bound for static regret and an $\mathcal{O}(T^{(1-c)/2})$ bound for cumulative constraint violation.
arXiv Detail & Related papers (2021-06-09T15:18:06Z)
Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments. We choose the dynamic regret as the performance measure. We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.