Online estimation and control with optimal pathlength regret
- URL: http://arxiv.org/abs/2110.12544v1
- Date: Sun, 24 Oct 2021 22:43:15 GMT
- Title: Online estimation and control with optimal pathlength regret
- Authors: Gautam Goel, Babak Hassibi
- Abstract summary: A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
- Score: 52.28457815067461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A natural goal when designing online learning algorithms for non-stationary
environments is to bound the regret of the algorithm in terms of the temporal
variation of the input sequence. Intuitively, when the variation is small, it
should be easier for the algorithm to achieve low regret, since past
observations are predictive of future inputs. Such data-dependent "pathlength"
regret bounds have recently been obtained for a wide variety of online learning
problems, including OCO and bandits. We obtain the first pathlength regret
bounds for online control and estimation (e.g. Kalman filtering) in linear
dynamical systems. The key idea in our derivation is to reduce
pathlength-optimal filtering and control to certain variational problems in
robust estimation and control; these reductions may be of independent interest.
Numerical simulations confirm that our pathlength-optimal algorithms outperform
traditional $H_2$ and $H_{\infty}$ algorithms when the environment varies over
time.
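As a concrete illustration of why a small pathlength helps, the toy Python sketch below compares a fixed state-feedback controller with a heuristic that additionally cancels the previous disturbance. This is only a hedged illustration, not the paper's pathlength-optimal algorithm: the scalar system x_{t+1} = a*x_t + u_t + w_t, the fixed gain, and the "reuse the last disturbance" predictor are all assumptions made for this example. When the pathlength sum_t |w_t - w_{t-1}| is small, the previous disturbance is a good prediction of the next one, so the disturbance-aware controller incurs much lower cost.

import numpy as np

rng = np.random.default_rng(0)
T, a = 500, 0.9

def rollout(w, controller):
    """Run x_{t+1} = a*x_t + u_t + w_t and return the total cost sum_t (x_t^2 + u_t^2)."""
    x, cost, w_prev = 0.0, 0.0, 0.0
    for t in range(T):
        u = controller(x, w_prev)   # controller sees the state and the *previous* disturbance
        cost += x**2 + u**2
        x = a * x + u + w[t]
        w_prev = w[t]
    return cost

# Slowly varying disturbance, i.e. small pathlength sum_t |w_t - w_{t-1}|.
w_slow = np.sin(2 * np.pi * np.arange(T) / T) + 0.01 * rng.standard_normal(T)
pathlength = np.abs(np.diff(w_slow, prepend=0.0)).sum()

# Baseline: a fixed stabilizing state-feedback gain (an illustrative stand-in
# for an H2-style controller; it is not the exact LQR gain).
baseline = lambda x, w_prev: -0.5 * a * x
# Pathlength-aware heuristic: also cancel the previous disturbance, which is a
# good prediction of w_t whenever the pathlength is small.
aware = lambda x, w_prev: -0.5 * a * x - w_prev

print(f"pathlength            : {pathlength:.2f}")
print(f"baseline cost         : {rollout(w_slow, baseline):.1f}")
print(f"pathlength-aware cost : {rollout(w_slow, aware):.1f}")

In this toy setting the disturbance-aware controller's closed-loop state is driven by the increments w_t - w_{t-1} rather than by w_t itself, which is the qualitative effect that the paper's pathlength regret bounds capture.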
Related papers
- Improving Adaptive Online Learning Using Refined Discretization [44.646191058243645]
We study unconstrained Online Linear Optimization with Lipschitz losses.
Motivated by the pursuit of instance optimality, we propose a new algorithm.
Central to these results is a continuous time approach to online learning.
arXiv Detail & Related papers (2023-09-27T21:54:52Z)
- Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
- Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z)
- Learning to Control under Time-Varying Environment [18.48729114775298]
This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems.
We propose the first computationally tractable online algorithm with regret guarantees.
arXiv Detail & Related papers (2022-06-06T11:40:46Z)
- Adaptivity and Non-stationarity: Problem-dependent Dynamic Regret for Online Convex Optimization [70.4342220499858]
We introduce novel online algorithms that can exploit smoothness and replace the dependence on $T$ in dynamic regret with problem-dependent quantities.
Our results are adaptive to the intrinsic difficulty of the problem, since the bounds are tighter than existing results for easy problems and safeguard the same rate in the worst case.
arXiv Detail & Related papers (2021-12-29T02:42:59Z)
- Strongly Adaptive OCO with Memory [49.319621885036035]
We propose the first strongly adaptive algorithm for online learning with memory.
Our algorithm results in a strongly adaptive regret bound for the control of linear time-varying systems.
arXiv Detail & Related papers (2021-02-02T17:26:08Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
- Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches [0.0]
Learning rates in neural network training are currently determined before training begins, using expensive manual or automated tuning.
This study proposes gradient-only line searches to resolve the learning rate for neural network training algorithms.
arXiv Detail & Related papers (2020-01-15T03:08:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.