Meta-Learning Guarantees for Online Receding Horizon Learning Control
- URL: http://arxiv.org/abs/2010.11327v14
- Date: Thu, 18 Feb 2021 18:55:11 GMT
- Title: Meta-Learning Guarantees for Online Receding Horizon Learning Control
- Authors: Deepan Muthirayan, Pramod P. Khargonekar
- Abstract summary: We provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting.
We show that the worst regret for learning within an iteration improves with experience of more iterations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we provide provable regret guarantees for an online
meta-learning receding horizon control algorithm in an iterative control
setting. We consider the setting where, in each iteration, the system to be
controlled is a linear deterministic system that is different and unknown, the
cost for the controller in an iteration is a general additive cost function and
there are affine control input constraints. By analysing conditions under which
sub-linear regret is achievable, we prove that the meta-learning online
receding horizon controller achieves an average of the dynamic regret for the
controller cost that is $\tilde{O}((1+1/\sqrt{N})T^{3/4})$ with the number of
iterations $N$. Thus, we show that the worst regret for learning within an
iteration improves with experience of more iterations, with a guarantee on the
rate of improvement.
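To make the scaling in the abstract concrete, the following minimal sketch evaluates only the leading-order shape $(1+1/\sqrt{N})T^{3/4}$ of the stated bound. It is not taken from the paper: the $\tilde{O}$ notation hides constants and logarithmic factors, which are simply dropped here, and the function names `regret_bound_shape` and `per_step_bound_shape` are hypothetical, chosen for illustration.

```python
import math


def regret_bound_shape(num_iterations: int, horizon: int) -> float:
    """Leading-order shape (1 + 1/sqrt(N)) * T**(3/4) of the stated bound.

    Assumption: constants and log factors hidden by the O-tilde are dropped,
    so only ratios between values are meaningful, not the values themselves.
    """
    if num_iterations < 1 or horizon < 1:
        raise ValueError("num_iterations and horizon must be positive")
    return (1.0 + 1.0 / math.sqrt(num_iterations)) * horizon ** 0.75


def per_step_bound_shape(num_iterations: int, horizon: int) -> float:
    """Bound shape divided by T; a sub-linear bound makes this shrink with T."""
    return regret_bound_shape(num_iterations, horizon) / horizon


if __name__ == "__main__":
    horizon = 10_000
    # The factor (1 + 1/sqrt(N)) falls from 2 at N = 1 towards 1 as N grows.
    for n in (1, 4, 16, 64, 256):
        print(f"N={n:4d}  bound shape={regret_bound_shape(n, horizon):10.2f}")
    # For fixed N, the per-step value decays like T**(-1/4).
    for t in (100, 10_000, 1_000_000):
        print(f"T={t:8d}  per-step shape={per_step_bound_shape(16, t):.4f}")
```

Under these assumptions, experience over more iterations can at most halve the leading term (from $2T^{3/4}$ at $N=1$ towards $T^{3/4}$), while the $T^{3/4}$ dependence on the horizon within an iteration is unchanged.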
Related papers
- Learning Decentralized Linear Quadratic Regulators with $\sqrt{T}$ Regret [1.529943343419486]
We propose an online learning algorithm that adaptively designs a decentralized linear quadratic regulator when the system model is unknown a priori.
We show that our controller enjoys an expected regret that scales as $\sqrt{T}$ with the time horizon $T$ for the case of partially nested information pattern.
arXiv Detail & Related papers (2022-10-17T09:29:01Z)
- Improving the Performance of Robust Control through Event-Triggered Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret Analysis of Learning-Based MPC with Partially-Unknown Cost Function [5.601217969637838]
The exploration/exploitation trade-off is an inherent challenge in data-driven and adaptive control.
We propose the use of a finite-horizon oracle controller with perfect knowledge of all system parameters as a reference for optimal control actions.
We develop learning-based policies that we prove achieve low regret with respect to this oracle finite-horizon controller.
arXiv Detail & Related papers (2021-08-04T22:43:51Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- A Meta-Learning Control Algorithm with Provable Finite-Time Guarantees [0.0]
We provide provable regret guarantees for an online meta-learning control algorithm in an iterative control setting.
We show that the worst regret for the learning within an iteration continuously improves with experience of more iterations.
arXiv Detail & Related papers (2020-08-30T20:30:40Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online gradient descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)