A Meta-Learning Control Algorithm with Provable Finite-Time Guarantees
- URL: http://arxiv.org/abs/2008.13265v6
- Date: Fri, 4 Feb 2022 02:01:30 GMT
- Title: A Meta-Learning Control Algorithm with Provable Finite-Time Guarantees
- Authors: Deepan Muthirayan and Pramod Khargonekar
- Abstract summary: We provide provable regret guarantees for an online meta-learning control algorithm in an iterative control setting.
We show that the worst-case regret for learning within an iteration improves continuously with the experience of more iterations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work we provide provable regret guarantees for an online meta-learning control algorithm in an iterative control setting. In each iteration, the system to be controlled is a different and unknown linear deterministic system, the cost for the controller is a general additive cost function, and the control input is required to satisfy a constraint, the violation of which incurs an additional cost. We prove (i) that the algorithm achieves a regret for the controller cost and constraint violation that is $O(T^{3/4})$ for an episode of duration $T$ with respect to the best policy that satisfies the control input constraints, and (ii) that the average regret for the controller cost and constraint violation with respect to the same policy varies as $O((1+\log(N)/N)T^{3/4})$ with the number of iterations $N$, showing that the worst-case regret for learning within an iteration improves continuously with the experience of more iterations.
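The two guarantees can be summarized as follows; this is a sketch reconstructed from the abstract alone, where $R_k$ and $V_k$ denote the controller-cost regret and the cumulative constraint violation in iteration $k$, both measured against the best policy satisfying the control input constraints (this notation is assumed here, not taken from the paper):
\[
  R_k = O\!\left(T^{3/4}\right), \qquad V_k = O\!\left(T^{3/4}\right) \quad \text{for each iteration } k,
\]
\[
  \frac{1}{N}\sum_{k=1}^{N} R_k = O\!\left(\Big(1+\tfrac{\log N}{N}\Big)\,T^{3/4}\right), \qquad
  \frac{1}{N}\sum_{k=1}^{N} V_k = O\!\left(\Big(1+\tfrac{\log N}{N}\Big)\,T^{3/4}\right).
\]
Since the factor $1+\log(N)/N$ decays toward $1$ as $N$ grows, the averaged bound tightens with the number of iterations, which is the sense in which learning within an iteration improves with experience.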
Related papers
- Finite Time Regret Bounds for Minimum Variance Control of Autoregressive Systems with Exogenous Inputs [10.304902889192071]
A key challenge experienced by many adaptive controllers is their poor empirical performance in the initial stages of learning.
We present a modified version of the Certainty Equivalence (CE) adaptive controller, which utilizes probing inputs for exploration.
We show that it has a $C \log T$ bound on the regret after $T$ time steps for bounded noise, and a $C \log^2 T$ bound in the case of sub-Gaussian noise (an illustrative certainty-equivalence sketch appears after this list).
arXiv Detail & Related papers (2023-05-26T14:29:33Z)
- Safe Adaptive Learning-based Control for Constrained Linear Quadratic Regulators with Regret Guarantees [11.627320138064684]
We study the adaptive control of an unknown linear system with a quadratic cost function subject to safety constraints on both the states and actions.
Our algorithm is implemented on a single trajectory and does not require system restarts.
arXiv Detail & Related papers (2021-10-31T05:52:42Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- Meta-Learning Guarantees for Online Receding Horizon Learning Control [0.0]
We provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting.
We show that the worst-case regret for learning within an iteration improves with the experience of more iterations.
arXiv Detail & Related papers (2020-10-21T21:57:04Z)
- Safety-Critical Online Control with Adversarial Disturbances [8.633140051496408]
We seek to synthesize state-feedback controllers to minimize a cost incurred due to the disturbance.
We consider an online setting where costs at each time are revealed only after the controller at that time is chosen.
We show that the regret, defined as the difference between the online controller's cost and the cost of the best controller, grows logarithmically with the time horizon.
arXiv Detail & Related papers (2020-09-20T19:59:15Z)
- Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under the quadratic cost model, also known as linear quadratic regulators (LQR).
We present two different semi-definite programs (SDPs), each of which yields a controller that stabilizes all systems within an ellipsoidal uncertainty set.
We propose an efficient data-dependent algorithm, eXploration, that with high probability quickly identifies a stabilizing controller.
arXiv Detail & Related papers (2020-06-19T08:58:57Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
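For the related entry "Finite Time Regret Bounds for Minimum Variance Control of Autoregressive Systems with Exogenous Inputs" above, the following is a minimal Python sketch of the general idea of certainty-equivalence control with probing inputs, not the paper's algorithm: a hypothetical scalar ARX(1,1) plant, recursive least-squares parameter estimates, and a decaying probing signal added to the certainty-equivalence minimum-variance input. The plant parameters, the $t^{-1/4}$ probe schedule, and the cost bookkeeping are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical scalar ARX(1,1) plant: y_{t+1} = a*y_t + b*u_t + w_t.
    a_true, b_true = 0.8, 1.0
    noise_std = 0.1
    T = 2000

    theta_hat = np.array([0.0, 1.0])   # initial estimates of [a, b]
    P = 100.0 * np.eye(2)              # RLS covariance (large = uninformative prior)
    y = 0.0
    cost = 0.0                         # running sum of y_{t+1}^2, the minimum-variance objective

    for t in range(1, T + 1):
        a_hat, b_hat = theta_hat
        # Certainty-equivalence minimum-variance input: cancel the predicted mean of
        # y_{t+1}, guarding against a near-zero gain estimate.
        u_ce = -a_hat * y / b_hat if abs(b_hat) > 0.1 else 0.0
        # Probing input for exploration; the t**(-1/4) decay is an illustrative choice.
        u = u_ce + rng.normal(scale=t ** -0.25)

        y_next = a_true * y + b_true * u + rng.normal(scale=noise_std)
        cost += y_next ** 2

        # Recursive least-squares update from the regressor phi = [y_t, u_t].
        phi = np.array([y, u])
        gain = P @ phi / (1.0 + phi @ P @ phi)
        theta_hat = theta_hat + gain * (y_next - phi @ theta_hat)
        P = P - np.outer(gain, phi @ P)

        y = y_next

    print(f"a_hat={theta_hat[0]:.3f}, b_hat={theta_hat[1]:.3f}, "
          f"avg cost={cost / T:.4f}, noise floor={noise_std ** 2:.4f}")

The probe keeps the regressor persistently exciting early on while its decay limits the extra control cost; the paper's actual probing schedule and its $C \log T$ analysis are not reproduced here.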