Implications of Regret on Stability of Linear Dynamical Systems
- URL: http://arxiv.org/abs/2211.07411v2
- Date: Sat, 15 Apr 2023 17:08:19 GMT
- Title: Implications of Regret on Stability of Linear Dynamical Systems
- Authors: Aren Karapetyan, Anastasios Tsiamis, Efe C. Balta, Andrea Iannelli,
John Lygeros
- Abstract summary: In online learning, the quality of an agent's decision is often quantified by the concept of regret.
We show that for linear state feedback policies and linear systems subject to adversarial disturbances, linear regret implies asymptotic stability in both time-varying and time-invariant settings.
- Score: 5.6435410094272696
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The setting of an agent making decisions under uncertainty and under dynamic
constraints is common for the fields of optimal control, reinforcement
learning, and recently also for online learning. In the online learning
setting, the quality of an agent's decision is often quantified by the concept
of regret, comparing the performance of the chosen decisions to the best
possible ones in hindsight. While regret is a useful performance measure, when
dynamical systems are concerned, it is important to also assess the stability
of the closed-loop system for a chosen policy. In this work, we show that for
linear state feedback policies and linear systems subject to adversarial
disturbances, linear regret implies asymptotic stability in both time-varying
and time-invariant settings. Conversely, we also show that bounded-input
bounded-state stability and summability of the state transition matrices imply
linear regret.
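To make the quantities concrete: the regret after T steps is R_T = sum_{t=1}^T c_t(x_t, u_t) - min_K sum_{t=1}^T c_t(x_t^K, u_t^K), where the minimum is over a comparator class of policies evaluated in hindsight, and "linear regret" means R_T = O(T). The following minimal numerical sketch illustrates this gap for a fixed linear state feedback gain; the double-integrator dynamics, quadratic stage cost, bounded disturbance sequence, and small grid of candidate gains are all illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (assumptions, not the paper's setup): simulate a linear
# system x_{t+1} = A x_t + B u_t + w_t under u_t = K x_t with bounded
# disturbances, and compare cumulative cost to the best gain in hindsight.
import numpy as np

rng = np.random.default_rng(0)
T = 2000
A = np.array([[1.0, 0.1], [0.0, 1.0]])            # assumed dynamics (double integrator)
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)                       # assumed quadratic stage cost
w = 0.05 * np.sign(rng.standard_normal((T, 2)))   # bounded, sign-flipping disturbances

def cumulative_cost(K):
    """Total cost sum_t (x'Qx + u'Ru) under u_t = K x_t against the fixed w's."""
    x, total = np.zeros(2), 0.0
    for t in range(T):
        u = K @ x
        total += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + w[t]
    return total

# Hypothetical comparator class: a small grid of linear state feedback gains.
candidates = [np.array([[-k1, -k2]]) for k1 in (1.0, 2.0, 4.0)
              for k2 in (2.0, 4.0, 8.0)]
K_played = candidates[0]                          # the gain the agent actually plays
best_in_hindsight = min(cumulative_cost(K) for K in candidates)
regret = cumulative_cost(K_played) - best_in_hindsight
print(f"regret after T={T}: {regret:.1f} (regret/T = {regret / T:.4f})")
```

In this toy setup every candidate gain happens to be stabilizing, so the state stays bounded and regret/T settles to a constant, which is the linear-regret regime the paper relates to asymptotic stability of the closed loop.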
Related papers
- Stability Bounds for Learning-Based Adaptive Control of Discrete-Time Multi-Dimensional Stochastic Linear Systems with Input Constraints [3.8004168340068336]
We consider the problem of adaptive stabilization for discrete-time, multi-dimensional systems with bounded control input constraints and unbounded disturbances.
We propose a certainty-equivalent control scheme which combines online parameter estimation with saturated linear control.
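As a rough illustration of that combination, here is one step of a certainty-equivalent scheme; the recursive-least-squares update, the gain_fn hook, and all names below are assumptions for this sketch, not the cited paper's algorithm.

```python
# Minimal sketch (assumed, not the cited paper's method): one round of
# online parameter estimation followed by saturated linear control for
# x_{t+1} = A x_t + B u_t + w_t with the input constraint |u| <= u_max.
import numpy as np

def saturated_ce_step(theta_hat, P, x, x_next, u_prev, u_max, gain_fn):
    """One round: recursive-least-squares update of theta_hat = [A B],
    then a saturated linear control input computed from the estimate."""
    z = np.concatenate([x, u_prev])               # regressor [x_t; u_t]
    err = x_next - theta_hat @ z                  # one-step prediction error
    Pz = P @ z
    P = P - np.outer(Pz, Pz) / (1.0 + z @ Pz)     # RLS covariance update
    theta_hat = theta_hat + np.outer(err, P @ z)  # parameter update
    K = gain_fn(theta_hat)                        # e.g. an LQR gain for the estimate
    u = np.clip(K @ x_next, -u_max, u_max)        # saturated linear control
    return theta_hat, P, u
```

A full scheme would also need persistent excitation and an analysis under unbounded disturbances, which is where the cited paper's contribution lies.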
arXiv Detail & Related papers (2023-04-02T16:38:13Z)
- Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret [61.59646565655169]
We show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret relative to the best DAC policy selected in hindsight.
We conclude that sublinear regret relative to the optimal competitive policy is attainable when the linear dynamical system is unknown.
arXiv Detail & Related papers (2022-11-21T07:29:08Z)
- Regret Analysis of Certainty Equivalence Policies in Continuous-Time Linear-Quadratic Systems [0.0]
This work studies theoretical performance guarantees of a ubiquitous reinforcement learning policy for controlling the canonical linear-quadratic system model.
We establish square-root-of-time regret bounds, indicating that a randomized certainty-equivalent policy learns optimal control actions quickly from a single state trajectory.
arXiv Detail & Related papers (2022-06-09T11:47:36Z)
- KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using a feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z)
- Online Control of Unknown Time-Varying Dynamical Systems [48.75672260851758]
We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model.
We study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies.
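For readers unfamiliar with the first of these classes, here is a minimal sketch of a Disturbance Action Controller, which plays u_t = K x_t + sum_{i=1}^m M_i w_{t-i}; the class names come from the cited paper, but the concrete interface, buffer handling, and baseline-gain choice below are assumptions for illustration.

```python
# Minimal sketch (assumed, not the cited paper's code) of a Disturbance
# Action Controller: u_t = K x_t + sum_i M_i w_{t-i}, where the matrices
# M_i are the parameters an online learner would tune.
import numpy as np

class DisturbanceActionController:
    def __init__(self, K, M):
        self.K = K                                       # stabilizing baseline gain
        self.M = M                                       # list of m disturbance-action matrices
        self.w_hist = [np.zeros(K.shape[1]) for _ in M]  # last m observed disturbances

    def act(self, x):
        u = self.K @ x
        for M_i, w_i in zip(self.M, self.w_hist):        # add the learned disturbance terms
            u = u + M_i @ w_i
        return u

    def observe(self, w):
        # w_t becomes available once x_{t+1} is seen; shift it into the buffer
        self.w_hist = [w] + self.w_hist[:-1]
```

In the nonstochastic control model with known (A, B), the disturbance is recovered after the fact as w_t = x_{t+1} - A x_t - B u_t and passed to observe().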
arXiv Detail & Related papers (2022-02-16T06:57:14Z)
- Bayesian Algorithms Learn to Stabilize Unknown Continuous-Time Systems [0.0]
Linear dynamical systems are canonical models for learning-based control of plants with uncertain dynamics.
A reliable stabilization procedure that can effectively learn from unstable data and stabilize the system in finite time is not currently available.
In this work, we propose a novel learning algorithm that stabilizes unknown continuous-time linear systems.
arXiv Detail & Related papers (2021-12-30T15:31:35Z)
- Reinforcement Learning Policies in Continuous-Time Linear Systems [0.0]
We present online policies that learn optimal actions quickly by carefully randomizing the parameter estimates.
We prove sharp stability results for inexact system dynamics and tightly specify the infinitesimal regret caused by sub-optimal actions.
Our analysis sheds light on fundamental challenges in continuous-time reinforcement learning and suggests a useful cornerstone for similar problems.
arXiv Detail & Related papers (2021-09-16T00:08:50Z)
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys optimal dynamic policy regret in terms of the time horizon, non-stationarity measure, and memory length.
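As a point of reference, dynamic policy regret for OCO with memory length m is commonly defined along the following lines; the exact notation here is a standard form assumed for illustration, not copied from the cited paper.

$$
\mathrm{D\text{-}Regret}_T \;=\; \sum_{t=m}^{T} f_t(a_{t-m},\dots,a_t) \;-\; \min_{v_1,\dots,v_T \in \mathcal{X}} \sum_{t=m}^{T} f_t(v_{t-m},\dots,v_t),
$$

where the minimum ranges over comparator sequences whose path length $\sum_{t=2}^{T}\|v_t - v_{t-1}\|$ is bounded by the non-stationarity measure.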
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.