Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting
- URL: http://arxiv.org/abs/2003.05999v2
- Date: Wed, 24 Jun 2020 02:33:00 GMT
- Title: Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting
- Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar
- Abstract summary: We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
- Score: 91.43582419264763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of adaptive control in partially observable linear
quadratic Gaussian control systems, where the model dynamics are unknown a
priori. We propose LqgOpt, a novel reinforcement learning algorithm based on
the principle of optimism in the face of uncertainty, to effectively minimize
the overall control cost. We employ the predictor state evolution
representation of the system dynamics and deploy a recently proposed
closed-loop system identification method, estimation, and confidence bound
construction. LqgOpt efficiently explores the system dynamics, estimates the
model parameters up to their confidence interval, and deploys the controller of
the most optimistic model for further exploration and exploitation. We provide
stability guarantees for LqgOpt and prove the regret upper bound of
$\tilde{\mathcal{O}}(\sqrt{T})$ for adaptive control of linear quadratic
Gaussian (LQG) systems, where $T$ is the time horizon of the problem.
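For readers unfamiliar with the terms in the abstract, the following is a minimal scalar sketch, not the authors' implementation. It assumes the standard LQG model $x_{t+1} = a x_t + b u_t + w_t$, $y_t = c x_t + z_t$ with Gaussian noise variances `var_w` and `var_z`, and the usual predictor (innovations) form $\hat{x}_{t+1} = a \hat{x}_t + b u_t + F e_t$, $y_t = c \hat{x}_t + e_t$. The function names, the numeric values, and the regret helper (cumulative cost minus $T$ times an assumed optimal average cost) are illustrative assumptions and do not come from the paper.

```python
# Scalar sketch of the predictor (innovations) form of a partially observed
# linear system, plus a regret helper. All names and numbers are illustrative
# assumptions; this is not the LqgOpt algorithm itself.


def steady_state_predictor(a, c, var_w, var_z, iters=1000, tol=1e-12):
    """Iterate the scalar filter Riccati recursion to a fixed point.

    Returns (sigma, f): the steady-state one-step prediction error variance
    and the predictor gain f = a * sigma * c / (c^2 * sigma + var_z).
    """
    sigma = var_w  # nonnegative starting point for the recursion
    for _ in range(iters):
        innovation_var = c * sigma * c + var_z
        new_sigma = a * sigma * a + var_w - (a * sigma * c) ** 2 / innovation_var
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    f = a * sigma * c / (c * sigma * c + var_z)
    return sigma, f


def regret(costs, optimal_avg_cost):
    """Cumulative cost minus T times the optimal average cost.

    `optimal_avg_cost` is assumed to be known here; in adaptive control it
    is the benchmark attained by the controller that knows the true model.
    """
    return sum(costs) - len(costs) * optimal_avg_cost


if __name__ == "__main__":
    a, c = 0.9, 1.0  # hypothetical system parameters
    sigma, f = steady_state_predictor(a, c, var_w=1.0, var_z=1.0)
    print("prediction error variance:", sigma)
    print("predictor gain F:", f)
    # The predictor dynamics a - f*c should be stable (magnitude below 1).
    print("closed-loop predictor pole:", a - f * c)
```

In this form the output $y_t$ is driven by the innovation $e_t$ rather than by the hidden state noise, which is what makes closed-loop identification from input-output data tractable. A regret of order $\sqrt{T}$ means the quantity returned by `regret` grows sublinearly, so the average excess cost per step vanishes as $T$ grows.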
Related papers
- Sublinear Regret for a Class of Continuous-Time Linear--Quadratic Reinforcement Learning Problems [10.404992912881601]
We study reinforcement learning for a class of continuous-time linear-quadratic (LQ) control problems for diffusions.
We apply a model-free approach that relies neither on knowledge of model parameters nor on their estimations, and devise an actor-critic algorithm to learn the optimal policy parameter directly.
arXiv Detail & Related papers (2024-07-24T12:26:21Z)
- Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes.
We analyze the regret of the algorithm, when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
- LQGNet: Hybrid Model-Based and Data-Driven Linear Quadratic Stochastic Control [24.413595920205907]
quadratic control deals with finding an optimal control signal for a dynamical system in a setting with uncertainty.
LQGNet is a controller that leverages data to operate under partially known dynamics.
We show that LQGNet outperforms classic control by overcoming mismatched state-space (SS) models.
arXiv Detail & Related papers (2022-10-23T17:59:51Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret Analysis of Learning-Based MPC with Partially-Unknown Cost Function [5.601217969637838]
The exploration/exploitation trade-off is an inherent challenge in data-driven and adaptive control.
We propose the use of a finite-horizon oracle controller with perfect knowledge of all system parameters as a reference for optimal control actions.
We develop learning-based policies that we prove achieve low regret with respect to this oracle finite-horizon controller.
arXiv Detail & Related papers (2021-08-04T22:43:51Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}\left(T\right)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.