Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems
- URL: http://arxiv.org/abs/2003.11227v2
- Date: Wed, 24 Jun 2020 02:00:33 GMT
- Title: Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems
- Authors: Sahin Lale, Kamyar Azizzadenesheli, Babak Hassibi, Anima Anandkumar
- Abstract summary: We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}\left(T\right)$ regret in adaptive control of unknown partially observable linear dynamical systems.
- Score: 91.43582419264763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of system identification and adaptive control in
partially observable linear dynamical systems. Adaptive and closed-loop system
identification is a challenging problem due to correlations introduced in data
collection. In this paper, we present the first model estimation method with
finite-time guarantees in both open and closed-loop system identification.
Deploying this estimation method, we propose adaptive control online learning
(AdaptOn), an efficient reinforcement learning algorithm that adaptively learns
the system dynamics and continuously updates its controller through online
learning steps. AdaptOn estimates the model dynamics by occasionally solving a
linear regression problem through interactions with the environment. Using
policy re-parameterization and the estimated model, AdaptOn constructs
counterfactual loss functions to be used for updating the controller through
online gradient descent. Over time, AdaptOn improves its model estimates and
obtains more accurate gradient updates to improve the controller. We show that
AdaptOn achieves a regret upper bound of $\text{polylog}\left(T\right)$, after
$T$ time steps of agent-environment interaction. To the best of our knowledge,
AdaptOn is the first algorithm that achieves $\text{polylog}\left(T\right)$
regret in adaptive control of unknown partially observable linear dynamical
systems which includes linear quadratic Gaussian (LQG) control.
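As a rough illustration of the loop the abstract describes (occasional least-squares model estimation from interaction data, plus online gradient updates of the controller), here is a minimal sketch on a toy fully observed scalar system. The 1-D dynamics, the quadratic cost, and all names below are illustrative assumptions, not the paper's actual AdaptOn procedure, which handles partially observable systems via policy re-parameterization.

```python
import numpy as np

# Hypothetical minimal sketch of an AdaptOn-style loop: collect data under
# the current controller, occasionally re-fit the model by least squares,
# and take online gradient steps on a model-based (counterfactual) loss.
rng = np.random.default_rng(0)

# Toy unknown scalar system y_{t+1} = a*y_t + b*u_t + noise (a stand-in
# for the paper's partially observable setting).
a_true, b_true = 0.9, 0.5

def step(y, u):
    return a_true * y + b_true * u + 0.01 * rng.standard_normal()

theta = np.zeros(2)   # estimated [a, b]
k = 0.0               # linear feedback gain, updated online
eta = 0.05            # online gradient-descent step size
ys, us = [0.0], []

for t in range(1, 2001):
    u = -k * ys[-1] + 0.1 * rng.standard_normal()   # control + exploration
    us.append(u)
    ys.append(step(ys[-1], u))

    # Occasionally re-estimate the model by linear regression on the
    # collected (y_t, u_t) -> y_{t+1} interaction data.
    if t % 100 == 0:
        X = np.column_stack([ys[:-1], us])
        theta, *_ = np.linalg.lstsq(X, np.array(ys[1:]), rcond=None)

        # Counterfactual quadratic loss under the estimated model:
        # L(k) = (a_hat - b_hat*k)^2 + k^2; take one gradient step.
        a_hat, b_hat = theta
        grad = 2 * (-(a_hat - b_hat * k) * b_hat + k)
        k -= eta * grad

print("estimated [a, b]:", np.round(theta, 2))
print("feedback gain k:", round(k, 3))
```

As the model estimate sharpens, the gradient of the counterfactual loss becomes more accurate and the gain `k` settles near the minimizer of the model-based cost, mirroring the improve-model-then-improve-controller dynamic described above.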
Related papers
- Learning Residual Model of Model Predictive Control via Random Forests
for Autonomous Driving [13.865293598486492]
One major issue in model predictive control (MPC) for autonomous driving is the trade-off between the system model's prediction accuracy and its computational cost.
This paper reformulates the MPC tracking-accuracy optimization as a quadratic programming (QP) problem, which can be solved effectively.
arXiv Detail & Related papers (2023-04-10T03:32:09Z) - Learning Adaptive Control for SE(3) Hamiltonian Dynamics [15.26733033527393]
This paper develops adaptive geometric control for rigid-body systems, such as ground, aerial, and underwater vehicles.
We learn a Hamiltonian model of the system dynamics using a neural ordinary differential equation network trained from state-control trajectory data.
In the second stage, we design a trajectory tracking controller with disturbance compensation from an energy-based perspective.
arXiv Detail & Related papers (2021-09-21T05:54:28Z) - Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Meta Learning MPC using Finite-Dimensional Gaussian Process
Approximations [0.9539495585692008]
Two key factors that hinder the practical applicability of learning methods in control are their high computational complexity and limited generalization capabilities to unseen conditions.
This paper makes use of a meta-learning approach for adaptive model predictive control, by learning a system model that leverages data from previous related tasks.
arXiv Detail & Related papers (2020-08-13T15:59:38Z) - Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z) - Tracking Performance of Online Stochastic Learners [57.14673504239551]
Online algorithms are popular in large-scale learning settings due to their ability to compute updates on the fly, without the need to store and process data in large batches.
When a constant step-size is used, these algorithms also have the ability to adapt to drifts in problem parameters, such as data or model properties, and track the optimal solution with reasonable accuracy.
We establish a link between steady-state performance derived under stationarity assumptions and the tracking performance of online learners under random walk models.
arXiv Detail & Related papers (2020-04-04T14:16:27Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
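The behavior described in "Tracking Performance of Online Stochastic Learners" above, where a constant step-size lets an online learner track drifting problem parameters, can be illustrated with a toy drifting-parameter example. The random-walk drift model and all parameters below are assumptions for illustration only, not that paper's actual analysis.

```python
import numpy as np

# Assumed toy setup: an LMS-style learner with a constant step size
# tracking a parameter that drifts as a random walk, compared with a
# decaying-step (1/t) learner whose updates eventually stall.
rng = np.random.default_rng(1)

w_true = 0.0
w_const, w_decay = 0.0, 0.0
mu = 0.1                       # constant step size
err_const = err_decay = 0.0

T = 5000
for t in range(1, T + 1):
    w_true += 0.01 * rng.standard_normal()         # random-walk drift
    x = rng.standard_normal()
    y = w_true * x + 0.05 * rng.standard_normal()  # noisy observation

    # Constant-step learner keeps adapting to the drifting optimum.
    w_const += mu * x * (y - w_const * x)
    # Decaying-step learner freezes and loses track of the moving target.
    w_decay += (1.0 / t) * x * (y - w_decay * x)

    if t > T // 2:                                 # steady-state window
        err_const += (w_true - w_const) ** 2
        err_decay += (w_true - w_decay) ** 2

print("tracking MSE, constant step:", err_const / (T // 2))
print("tracking MSE, decaying step:", err_decay / (T // 2))
```

The constant step-size learner pays a small steady-state error for its noise sensitivity but follows the drift, while the decaying-step learner's shrinking updates leave it anchored to early data, which is the stationarity-versus-tracking link that paper formalizes.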
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.