Strongly Adaptive OCO with Memory
- URL: http://arxiv.org/abs/2102.01623v1
- Date: Tue, 2 Feb 2021 17:26:08 GMT
- Title: Strongly Adaptive OCO with Memory
- Authors: Zhiyu Zhang, Ashok Cutkosky, Ioannis Ch. Paschalidis
- Abstract summary: We propose the first strongly adaptive algorithm for online learning with memory.
Our algorithm results in a strongly adaptive regret bound for the control of linear time-varying systems.
- Score: 49.319621885036035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent progress in online control has popularized online learning with
memory, a variant of the standard online learning problem with loss functions
dependent on the prediction history. In this paper, we propose the first
strongly adaptive algorithm for this problem: on any interval
$\mathcal{I}\subset[1:T]$, the proposed algorithm achieves $\tilde
O\left(\sqrt{|\mathcal{I}|}\right)$ policy regret against the best fixed
comparator for that interval. Combined with online control techniques, our
algorithm results in a strongly adaptive regret bound for the control of linear
time-varying systems.
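The per-interval guarantee above is the hallmark of strongly adaptive algorithms, which typically run one expert per interval of a geometric cover of $[1:T]$. The sketch below generates the standard cover $\{[k\,2^i,\,(k+1)2^i-1]\}$ from the strongly adaptive literature; it is a common ingredient in such algorithms, not necessarily the exact construction used in this paper.

```python
def geometric_cover(T):
    """Geometric covering intervals I_{i,k} = [k*2^i, (k+1)*2^i - 1],
    clipped to the horizon [1, T].  Any interval I within [1, T] can be
    partitioned into O(log |I|) consecutive members of this cover, which
    is how per-interval guarantees follow from running one expert per
    covering interval."""
    cover = []
    i = 0
    while 2 ** i <= T:
        k = 1
        while k * 2 ** i <= T:
            cover.append((k * 2 ** i, min((k + 1) * 2 ** i - 1, T)))
            k += 1
        i += 1
    return cover

def active_intervals(t, T):
    """Intervals of the cover containing time t.  For each level i at
    most one interval contains t, so only O(log T) experts are active
    at any round."""
    return [(s, e) for (s, e) in geometric_cover(T) if s <= t <= e]
```

Because only $O(\log T)$ covering intervals are active at any round, a meta-algorithm aggregating these experts pays only a logarithmic overhead per step.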
Related papers
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Efficient Online Learning with Memory via Frank-Wolfe Optimization: Algorithms with Bounded Dynamic Regret and Applications to Control [15.588080817106563]
We introduce the first projection-free meta-base learning algorithm with memory that minimizes dynamic regret.
We are motivated by artificial intelligence applications where autonomous agents need to adapt to time-varying environments.
arXiv Detail & Related papers (2023-01-02T01:12:29Z)
- Differentially Private Online-to-Batch for Smooth Losses [38.23708749658059]
We develop a new reduction that converts any online convex optimization algorithm suffering $O(\sqrt{T})$ regret into an $\epsilon$-differentially private convex algorithm with the optimal convergence rate $\tilde O(\sqrt{T} + \sqrt{d}/\epsilon T)$ on smooth losses in linear time.
arXiv Detail & Related papers (2022-10-12T21:13:31Z)
- Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z)
- Dynamic Regret for Strongly Adaptive Methods and Optimality of Online KRR [13.165557713537389]
We show that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret.
We derive a new lower bound on a certain penalized regret which establishes the near minimax optimality of online Kernel Ridge Regression (KRR).
arXiv Detail & Related papers (2021-11-22T21:52:47Z)
- Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
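Dynamic policy regret, in the notation standard for OCO with memory (this formalization is drawn from the broader literature, not quoted from the abstract; the memory length $m$, losses $f_t$, and comparators $u_t$ are assumed notation), compares the learner against a time-varying comparator sequence:

$$
\mathrm{D\text{-}Regret}_T
  = \sum_{t=m+1}^{T} f_t(x_{t-m}, \dots, x_t)
  - \sum_{t=m+1}^{T} f_t(u_t, \dots, u_t),
$$

with bounds typically scaling with the path length $P_T = \sum_{t=2}^{T} \lVert u_t - u_{t-1} \rVert$ of the comparator sequence, the non-stationarity measure referred to in the summary above.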
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.