Regret Analysis of Distributed Online Control for LTI Systems with
Adversarial Disturbances
- URL: http://arxiv.org/abs/2310.03206v1
- Date: Wed, 4 Oct 2023 23:24:39 GMT
- Title: Regret Analysis of Distributed Online Control for LTI Systems with
Adversarial Disturbances
- Authors: Ting-Jui Chang and Shahin Shahrampour
- Abstract summary: This paper addresses the distributed online control problem over a network of linear time-invariant (LTI) systems with possibly unknown dynamics.
For known dynamics, we propose a fully distributed disturbance feedback controller that guarantees a regret bound of $O(\sqrt{T}\log T)$.
For the unknown dynamics case, we design a distributed explore-then-commit approach, where in the exploration phase all agents jointly learn the system dynamics.
- Score: 12.201535821920624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the distributed online control problem over a network of
linear time-invariant (LTI) systems (with possibly unknown dynamics) in the
presence of adversarial perturbations. There exists a global network cost that
is characterized by a time-varying convex function, which evolves in an
adversarial manner and is sequentially and partially observed by local agents.
The goal of each agent is to generate a control sequence that can compete with
the best centralized control policy in hindsight, which has access to the
global cost. This problem is formulated as a regret minimization. For the case
of known dynamics, we propose a fully distributed disturbance feedback
controller that guarantees a regret bound of $O(\sqrt{T}\log T)$, where $T$ is
the time horizon. For the unknown dynamics case, we design a distributed
explore-then-commit approach, where in the exploration phase all agents jointly
learn the system dynamics, and in the learning phase our proposed control
algorithm is applied using each agent's system estimate. We establish a regret
bound of $O(T^{2/3} \text{poly}(\log T))$ for this setting.
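The disturbance feedback policy class behind results of this kind can be sketched in a few lines. The toy system, gains, and memory length below are illustrative assumptions rather than the paper's construction, and the sketch is single-agent rather than distributed: the control is a stabilizing state feedback plus a linear function of the last $H$ observed disturbances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-state, 1-input LTI system (not from the paper).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.1, 0.5]])        # a stabilizing feedback gain (assumption)
H = 3                             # memory length of the disturbance feedback
M = [0.05 * np.ones((1, 2)) for _ in range(H)]  # disturbance feedback gains

x = np.zeros((2, 1))
past_w = [np.zeros((2, 1)) for _ in range(H)]   # most recent disturbance first

T = 200
states = []
for t in range(T):
    # Disturbance feedback policy: u_t = -K x_t + sum_i M_i w_{t-i}
    u = -K @ x + sum(M[i] @ past_w[i] for i in range(H))
    w = rng.uniform(-0.1, 0.1, size=(2, 1))     # bounded adversarial-style noise
    x = A @ x + B @ u + w
    past_w = [w] + past_w[:-1]
    states.append(float(np.linalg.norm(x)))

print(round(max(states), 2))  # state norm stays bounded under the policy
```

In the online setting the $M_i$ matrices are the decision variables updated from the observed costs; the fixed values here only show the policy's structure.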
Related papers
- Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement
Learning Approach [130.9259586568977]
We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Groves (VCG) mechanism over multiple rounds of interaction.
A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
arXiv Detail & Related papers (2022-02-25T16:17:23Z)
- Finite-time System Identification and Adaptive Control in Autoregressive Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
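Finite-time identification of an ARX system is commonly done by least squares over lagged input-output regressors. A minimal sketch on a toy scalar ARX(1,1) system with open-loop (i.i.d. input) data; the system, noise level, and horizon are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy ARX(1,1): y_t = a*y_{t-1} + b*u_{t-1} + noise, open-loop data.
a_true, b_true, T = 0.7, 1.5, 2000
y = np.zeros(T)
u = rng.normal(size=T)            # exciting i.i.d. input
for t in range(1, T):
    y[t] = a_true * y[t - 1] + b_true * u[t - 1] + 0.01 * rng.normal()

# Least-squares estimate from regressors [y_{t-1}, u_{t-1}].
Phi = np.column_stack([y[:-1], u[:-1]])
theta_hat, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
print(theta_hat)  # estimate of (a, b), close to (0.7, 1.5)
```

Finite-time guarantees of the kind the paper establishes bound how fast such an estimate concentrates around the true parameters as a function of the sample size and the excitation of the data.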
arXiv Detail & Related papers (2021-08-26T18:00:00Z)
- Regret Analysis of Distributed Online LQR Control for Unknown LTI Systems [8.832969171530056]
We study the distributed online linear quadratic regulator (LQR) problem for linear time-invariant (LTI) systems with unknown dynamics.
We propose a distributed variant of the online LQR algorithm where each agent computes its system estimate during an exploration stage.
We prove that the regret of our proposed algorithm scales as $\tilde{O}(T^{2/3})$, implying the consensus of the network over time.
arXiv Detail & Related papers (2021-05-15T23:02:58Z)
- Decentralized Control with Graph Neural Networks [147.84766857793247]
We propose a novel framework using graph neural networks (GNNs) to learn decentralized controllers.
GNNs are well-suited for the task since they are naturally distributed architectures and exhibit good scalability and transferability properties.
The problems of flocking and multi-agent path planning are explored to illustrate the potential of GNNs in learning decentralized controllers.
arXiv Detail & Related papers (2020-12-29T18:59:14Z)
- Decentralized Multi-Agent Linear Bandits with Safety Constraints [31.67685495996986]
We study decentralized linear bandits, where a network of $N$ agents acts cooperatively to solve a linear bandit-optimization problem.
We propose DLUCB: a fully decentralized algorithm that minimizes the cumulative regret over the entire network.
We show that our ideas extend naturally to the emerging, albeit more challenging, setting of safe bandits.
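A cooperative linear bandit of this kind can be sketched with agents pooling sufficient statistics each round. The instantaneous sharing of the Gram matrix and reward vector below is an idealized stand-in for DLUCB's actual network communication, and the arms, parameter, and confidence width are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy cooperative linear bandit: N agents share sufficient statistics
# (Gram matrix V and vector b) each round -- an idealized stand-in for
# decentralized communication over a network.

N, d, T, lam, beta = 3, 2, 500, 1.0, 1.0
theta_star = np.array([0.6, -0.4])            # unknown reward parameter
arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, -0.7]])

V = lam * np.eye(d)                           # regularized Gram matrix
b = np.zeros(d)
total_reward = 0.0

for t in range(T):
    Vinv = np.linalg.inv(V)
    theta_hat = Vinv @ b                      # ridge estimate of theta_star
    for agent in range(N):
        # UCB index: estimated reward + exploration bonus per arm.
        bonus = np.sqrt(np.einsum('ij,jk,ik->i', arms, Vinv, arms))
        x = arms[int(np.argmax(arms @ theta_hat + beta * bonus))]
        r = x @ theta_star + 0.05 * rng.normal()
        V += np.outer(x, x)                   # pooled statistics (idealized)
        b += r * x
        total_reward += r

# With theta_star above, the best arm is [0.7, -0.7] with mean reward 0.7.
print(round(total_reward / (N * T), 3))
```

The safe-bandit extension mentioned in the summary would additionally restrict each pull to arms certified feasible under the safety constraint, which this sketch omits.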
arXiv Detail & Related papers (2020-12-01T07:33:00Z)
- Distributed Online Linear Quadratic Control for Linear Time-invariant Systems [14.924672048447334]
We study the distributed online linear quadratic (LQ) problem for identical linear time-invariant (LTI) systems.
Consider a multi-agent network where each agent is modeled as an LTI system.
We develop a distributed variant of the online LQ algorithm, which runs distributed online gradient descent with a projection onto a semi-definite programming (SDP) feasible set to generate controllers.
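The distributed online gradient descent pattern just described can be sketched as follows: each agent mixes its iterate with its neighbors' (consensus), takes a gradient step on its local loss, and projects. The SDP projection is replaced here by a simple Euclidean-ball projection, and the mixing matrix, losses, and step size are illustrative assumptions:

```python
import numpy as np

# Sketch of distributed online gradient descent with a consensus step.
# The paper projects onto an SDP feasible set; a Euclidean-ball
# projection stands in here, purely for illustration.

N, d, T, eta, R = 4, 3, 300, 0.1, 2.0

# Doubly stochastic mixing matrix for a ring of 4 agents (assumption).
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])

theta = np.random.default_rng(1).normal(size=(N, d))  # one row per agent
target = np.ones(d)             # shared minimizer of the local quadratics

for t in range(T):
    theta = W @ theta           # consensus: average with neighbors
    grad = theta - target       # gradient of 0.5 * ||theta_i - target||^2
    theta = theta - eta * grad  # local online gradient step
    norms = np.linalg.norm(theta, axis=1, keepdims=True)
    theta = theta * np.minimum(1.0, R / norms)  # projection onto the ball

print(np.allclose(theta, target, atol=1e-2))  # agents reach consensus
```

With time-varying adversarial losses the agents track the best fixed decision in hindsight rather than converging to a single point; the fixed quadratic loss here just makes the consensus behavior visible.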
arXiv Detail & Related papers (2020-09-29T03:30:49Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
- Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.