Regret-Optimal LQR Control
- URL: http://arxiv.org/abs/2105.01244v2
- Date: Thu, 13 Apr 2023 07:14:26 GMT
- Title: Regret-Optimal LQR Control
- Authors: Oron Sabag and Gautam Goel and Sahin Lale and Babak Hassibi
- Abstract summary: We find a causal controller that minimizes the worst-case regret over all bounded energy disturbances.
We derive explicit formulas for the optimal regret and for the regret-optimal controller for the state-space setting.
The regret-optimal controller presents itself as a viable option for control systems design.
- Score: 37.99652162611661
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the infinite-horizon LQR control problem. Motivated by
competitive analysis in online learning, as a criterion for controller design
we introduce the dynamic regret, defined as the difference between the LQR cost
of a causal controller (that has only access to past disturbances) and the LQR
cost of the \emph{unique} clairvoyant one (that has also access to future
disturbances) that is known to dominate all other controllers. The regret
itself is a function of the disturbances, and we propose to find a causal
controller that minimizes the worst-case regret over all bounded energy
disturbances. The resulting controller has the interpretation of guaranteeing
the smallest regret compared to the best non-causal controller that can see the
future. We derive explicit formulas for the optimal regret and for the
regret-optimal controller for the state-space setting. These explicit solutions
are obtained by showing that the regret-optimal control problem can be reduced
to a Nehari extension problem that can be solved explicitly. The regret-optimal
controller is shown to be linear and can be expressed as the sum of the
classical $H_2$ state-feedback law and an $n$-th order controller ($n$ is the
state dimension), and its construction simply requires a solution to the
standard LQR Riccati equation and two Lyapunov equations. Simulations over a
range of plants demonstrate that the regret-optimal controller interpolates
nicely between the $H_2$ and the $H_\infty$ optimal controllers, and generally
has $H_2$ and $H_\infty$ costs that are simultaneously close to their optimal
values. The regret-optimal controller thus presents itself as a viable option
for control systems design.
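The construction described above needs only standard computational ingredients: the LQR Riccati solution (which yields the classical $H_2$ state-feedback gain) plus two Lyapunov-equation solves. A minimal sketch of those building blocks, using an illustrative 2-state system and illustrative Lyapunov right-hand sides (the paper's exact equations are not reproduced here):

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# Illustrative 2-state, 1-input discrete-time system (not from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)  # state cost
R = np.eye(1)  # input cost

# Standard LQR Riccati equation:
#   P = A'PA - A'PB (R + B'PB)^{-1} B'PA + Q
P = solve_discrete_are(A, B, Q, R)

# Classical H2 state-feedback gain: K = (R + B'PB)^{-1} B'PA
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Closed-loop dynamics under the H2 law.
A_cl = A - B @ K

# Two Lyapunov solves of the kind the construction requires
# (right-hand sides here are illustrative placeholders).
X1 = solve_discrete_lyapunov(A_cl, Q)
X2 = solve_discrete_lyapunov(A_cl.T, B @ B.T)

# Sanity check: the LQR closed loop is stable, so both solves are well posed.
assert np.max(np.abs(np.linalg.eigvals(A_cl))) < 1.0
```

Since the regret-optimal controller adds only an $n$-th order correction to the $H_2$ law, all of the heavy lifting reduces to these routine matrix-equation solves.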
Related papers
- Finite Time Regret Bounds for Minimum Variance Control of Autoregressive
Systems with Exogenous Inputs [10.304902889192071]
A key challenge experienced by many adaptive controllers is their poor empirical performance in the initial stages of learning.
We present a modified version of the Certainty Equivalence (CE) adaptive controller, which utilizes probing inputs for exploration.
We show that it has a $C \log T$ bound on the regret after $T$ time-steps for bounded noise, and $C \log^2 T$ in the case of sub-Gaussian noise.
arXiv Detail & Related papers (2023-05-26T14:29:33Z) - Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z) - Competitive Control [52.28457815067461]
We focus on designing an online controller which competes against a clairvoyant offline optimal controller.
A natural performance metric in this setting is competitive ratio, which is the ratio between the cost incurred by the online controller and the cost incurred by the offline optimal controller.
arXiv Detail & Related papers (2021-07-28T22:26:27Z) - Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z) - Regret-optimal measurement-feedback control [39.76359052907755]
We consider measurement-feedback control in linear dynamical systems from the perspective of regret.
We show that in the measurement-feedback setting, unlike in the full information setting, there is no single offline controller which outperforms every other offline controller on every disturbance.
We show that the corresponding regret-optimal online controller can be found via a novel reduction to the classical Nehari problem and present a tight data-dependent bound on its regret.
arXiv Detail & Related papers (2020-11-24T01:36:48Z) - Meta-Learning Guarantees for Online Receding Horizon Learning Control [0.0]
We provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting.
We show that the worst regret for learning within an iteration improves with experience of more iterations.
arXiv Detail & Related papers (2020-10-21T21:57:04Z) - Regret-optimal control in dynamic environments [39.76359052907755]
We focus on the problem of designing an online controller which minimizes regret against the best dynamic sequence of control actions selected in hindsight.
We derive the state-space structure of the regret-optimal controller via a novel reduction to $H_\infty$ control.
We present numerical experiments which show that our regret-optimal controller interpolates between the performance of the $H_2$- and $H_\infty$-optimal controllers across stochastic and adversarial environments.
arXiv Detail & Related papers (2020-10-20T17:32:17Z) - Naive Exploration is Optimal for Online LQR [49.681825576239355]
We show that the optimal regret scales as $\widetilde{\Theta}(\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state.
Our lower bounds rule out the possibility of a $\mathrm{poly}(\log T)$-regret algorithm, which had been
arXiv Detail & Related papers (2020-01-27T03:44:54Z) - Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with \emph{all} stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.