Regret Analysis of Online LQR Control via Trajectory Prediction and
Tracking: Extended Version
- URL: http://arxiv.org/abs/2302.10411v1
- Date: Tue, 21 Feb 2023 02:48:57 GMT
- Title: Regret Analysis of Online LQR Control via Trajectory Prediction and
Tracking: Extended Version
- Authors: Yitian Chen, Timothy L. Molloy, Tyler Summers, Iman Shames
- Abstract summary: We propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices.
Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it.
We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.
- Score: 1.6344851071810074
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this paper, we propose and analyze a new method for online linear
quadratic regulator (LQR) control with a priori unknown time-varying cost
matrices. The cost matrices are revealed sequentially with the potential for
future values to be previewed over a short window. Our novel method involves
using the available cost matrices to predict the optimal trajectory, and a
tracking controller to drive the system towards it. We adopt the notion of
dynamic regret to measure the performance of this proposed online LQR control
method, with our main result being that the (dynamic) regret of our method is
upper bounded by a constant. Moreover, the regret upper bound decays
exponentially with the preview window length, and is extendable to systems with
disturbances. We show in simulations that our proposed method offers improved
performance compared to other previously proposed online LQR methods.
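As a rough illustration of the quantities named in the abstract, the sketch below computes the dynamic regret of a preview-based controller on a scalar system. It is not the paper's prediction-and-tracking method: it substitutes a plain receding-horizon controller that re-solves a short Riccati recursion over the previewed costs. The scalar dynamics, the function names, and the choice of terminal weight inside the preview window are all assumptions made for this example.

```python
def riccati_gains(a, b, qs, rs, q_final):
    """Backward Riccati recursion for x[t+1] = a*x[t] + b*u[t] (scalar case).

    Returns feedback gains k[t] such that u[t] = -k[t] * x[t] minimizes
    sum_t (qs[t]*x[t]**2 + rs[t]*u[t]**2) + q_final * x[T]**2.
    """
    p = q_final
    gains = []
    for q, r in zip(reversed(qs), reversed(rs)):
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
        gains.append(k)
    gains.reverse()
    return gains


def total_cost(a, b, qs, rs, q_final, x0, policy):
    """Roll out policy(t, x) -> u and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for t, (q, r) in enumerate(zip(qs, rs)):
        u = policy(t, x)
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + q_final * x * x


def dynamic_regret(a, b, qs, rs, q_final, x0, window):
    """Cost of a preview-based controller minus the optimal cost in hindsight."""
    horizon = len(qs)
    opt_gains = riccati_gains(a, b, qs, rs, q_final)

    def optimal(t, x):
        return -opt_gains[t] * x

    def preview(t, x):
        # Only the costs for steps t .. t+window are assumed to be known.
        end = min(t + window + 1, horizon)
        # Assumption: reuse the last previewed state weight as terminal weight
        # unless the true terminal weight is already inside the window.
        terminal = q_final if end == horizon else qs[end - 1]
        k = riccati_gains(a, b, qs[t:end], rs[t:end], terminal)[0]
        return -k * x

    online = total_cost(a, b, qs, rs, q_final, x0, preview)
    best = total_cost(a, b, qs, rs, q_final, x0, optimal)
    return online - best


if __name__ == "__main__":
    # Hypothetical time-varying cost weights, chosen only for illustration.
    qs = [1.0 + 0.5 * (t % 3) for t in range(20)]
    rs = [1.0 + 0.25 * (t % 2) for t in range(20)]
    for w in (0, 2, 5, 20):
        print(w, dynamic_regret(1.1, 0.5, qs, rs, 1.0, 1.0, w))
```

Running the script prints the regret for a few preview window lengths. Once the window covers the whole remaining horizon, the preview controller coincides with the optimal one and the regret is zero.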
Related papers
- Regret Analysis of Policy Optimization over Submanifolds for Linearly
Constrained Online LQG [12.201535821920624]
We study online linear quadratic Gaussian problems with a given linear constraint imposed on the controller.
We propose online optimistic Newton on manifold (OONM), which provides an online controller based on predictions of the first- and second-order information of the function sequence.
arXiv Detail & Related papers (2024-03-13T14:06:18Z)
- Online Nonsubmodular Minimization with Delayed Costs: From Full
Information to Bandit Feedback [98.7678704343537]
We focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms.
We derive bounds for the agent's regret in the full information and bandit feedback setting, even if the delay between choosing a decision and receiving the incurred cost is unbounded.
arXiv Detail & Related papers (2022-05-15T08:27:12Z)
- Stochastic Online Linear Regression: the Forward Algorithm to Replace
Ridge [24.880035784304834]
We derive high probability regret bounds for online ridge regression and the forward algorithm.
This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions.
arXiv Detail & Related papers (2021-11-02T13:57:53Z)
- Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z)
- Regret Analysis of Distributed Online LQR Control for Unknown LTI
Systems [8.832969171530056]
We study the distributed online linear quadratic regulator (LQR) problem for linear time-invariant (LTI) systems with unknown dynamics.
We propose a distributed variant of the online LQR algorithm where each agent computes its system estimate during an exploration stage.
We prove that the regret bound of our proposed algorithm scales as $\tilde{O}(T^{2/3})$, implying the consensus of the network over time.
arXiv Detail & Related papers (2021-05-15T23:02:58Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
- Regret Minimization in Partially Observable Linear Quadratic Control [91.43582419264763]
We study the problem of regret in partially observable linear quadratic control systems when the model dynamics are unknown a priori.
We propose a novel way to decompose the regret and provide an end-to-end sublinear regret upper bound for partially observable linear quadratic control.
arXiv Detail & Related papers (2020-01-31T22:35:08Z)