Distributed Online Linear Quadratic Control for Linear Time-invariant
Systems
- URL: http://arxiv.org/abs/2009.13749v1
- Date: Tue, 29 Sep 2020 03:30:49 GMT
- Title: Distributed Online Linear Quadratic Control for Linear Time-invariant
Systems
- Authors: Ting-Jui Chang, Shahin Shahrampour
- Abstract summary: We study the distributed online linear quadratic (LQ) problem for identical linear time-invariant (LTI) systems.
Consider a multi-agent network where each agent is modeled as an LTI system.
We develop a distributed variant of the online LQ algorithm, which runs distributed online gradient descent with a projection to a semi-definite programming (SDP) to generate controllers.
- Score: 14.924672048447334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classical linear quadratic (LQ) control centers around linear time-invariant
(LTI) systems, where the control-state pairs introduce a quadratic cost with
time-invariant parameters. Recent advancement in online optimization and
control has provided novel tools to study LQ problems that are robust to
time-varying cost parameters. Inspired by this line of research, we study the
distributed online LQ problem for identical LTI systems. Consider a multi-agent
network where each agent is modeled as an LTI system. The LTI systems are
associated with decoupled, time-varying quadratic costs that are revealed
sequentially. The goal of the network is to make the control sequence of all
agents competitive to that of the best centralized policy in hindsight,
captured by the notion of regret. We develop a distributed variant of the
online LQ algorithm, which runs distributed online gradient descent with a
projection onto a semidefinite program (SDP) to generate controllers. We
establish a regret bound scaling as the square root of the finite time-horizon,
implying that agents reach consensus as time grows. We further provide
numerical experiments verifying our theoretical result.
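The scheme described in the abstract can be illustrated with a minimal sketch: each agent mixes its decision variable with its neighbors' (via a doubly stochastic matrix `W`), takes a gradient step on its local time-varying cost, and projects back onto the feasible set. This is not the paper's exact algorithm; in particular, the paper projects onto an SDP feasible set, which the sketch below replaces with a simple Euclidean projection onto the positive semidefinite cone (eigenvalue clipping). The function names and the quadratic toy cost are illustrative assumptions.

```python
import numpy as np

def project_psd(S):
    # Euclidean projection of a symmetric matrix onto the PSD cone:
    # symmetrize, then clip negative eigenvalues to zero.
    S = (S + S.T) / 2.0
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.clip(w, 0.0, None)) @ V.T

def distributed_ogd_step(X, W, grads, eta):
    # One round of distributed online gradient descent with projection.
    #   X     : list of per-agent symmetric decision matrices
    #   W     : doubly stochastic mixing matrix encoding the network
    #   grads : per-agent gradients of the local (time-varying) cost
    #   eta   : step size
    n = len(X)
    mixed = [sum(W[i, j] * X[j] for j in range(n)) for i in range(n)]
    return [project_psd(mixed[i] - eta * grads[i]) for i in range(n)]
```

On decoupled quadratic costs, iterating this step keeps every agent's variable PSD after each round while the consensus (mixing) term shrinks the disagreement between agents over time, mirroring the consensus behavior implied by the regret bound.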
Related papers
- Regret Analysis of Distributed Online Control for LTI Systems with
Adversarial Disturbances [12.201535821920624]
This paper addresses the distributed online control problem over a network of linear time-invariant (LTI) systems with possibly unknown dynamics.
For known dynamics, we propose a fully distributed disturbance feedback controller that guarantees a regret bound of $O(\sqrt{T}\log T)$.
For the unknown dynamics case, we design a distributed explore-then-commit approach, where in the exploration phase all agents jointly learn the system dynamics.
arXiv Detail & Related papers (2023-10-04T23:24:39Z)
- Regret Analysis of Online LQR Control via Trajectory Prediction and
Tracking: Extended Version [1.6344851071810074]
We propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices.
Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it.
We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.
arXiv Detail & Related papers (2023-02-21T02:48:57Z)
- Learning Mixtures of Linear Dynamical Systems [94.49754087817931]
We develop a two-stage meta-algorithm to efficiently recover each ground-truth LDS model up to error $\tilde{O}(\sqrt{d/T})$.
We validate our theoretical studies with numerical experiments, confirming the efficacy of the proposed algorithm.
arXiv Detail & Related papers (2022-01-26T22:26:01Z)
- Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret Analysis of Distributed Online LQR Control for Unknown LTI
Systems [8.832969171530056]
We study the distributed online linear quadratic regulator (LQR) problem for linear time-invariant (LTI) systems with unknown dynamics.
We propose a distributed variant of the online LQR algorithm where each agent computes its system estimate during an exploration stage.
We prove that the regret of our proposed algorithm scales as $\tilde{O}(T^{2/3})$, implying that the network reaches consensus over time.
arXiv Detail & Related papers (2021-05-15T23:02:58Z)
- Stable Online Control of Linear Time-Varying Systems [49.41696101740271]
COCO-LQ is an efficient online control algorithm that guarantees input-to-state stability for a large class of LTV systems.
We empirically demonstrate the performance of COCO-LQ in both synthetic experiments and a power system frequency control example.
arXiv Detail & Related papers (2021-04-29T06:18:49Z)
- Decomposability and Parallel Computation of Multi-Agent LQR [19.710361049812608]
We propose a parallel RL scheme for linear quadratic regulator (LQR) design in a continuous-time linear MAS.
We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality.
The proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.
arXiv Detail & Related papers (2020-10-16T20:15:39Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\text{polylog}(T)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online gradient descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.