Data-Driven Adversarial Online Control for Unknown Linear Systems
- URL: http://arxiv.org/abs/2308.08138v2
- Date: Sat, 9 Mar 2024 04:18:46 GMT
- Title: Data-Driven Adversarial Online Control for Unknown Linear Systems
- Authors: Zishun Liu and Yongxin Chen
- Abstract summary: We present a novel data-driven online adaptive control algorithm to address this online control problem.
Our algorithm guarantees an $tmO(T2/3)$ regret gradient bound with high probability, which matches the best-known regret bound for this problem.
- Score: 17.595231077524467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the online control problem with an unknown linear dynamical
system in the presence of adversarial perturbations and adversarial convex loss
functions. Although the problem is widely studied in model-based control, it
remains unclear whether data-driven approaches, which bypass the system
identification step, can solve the problem. In this work, we present a novel
data-driven online adaptive control algorithm to address this online control
problem. Our algorithm leverages the behavioral systems theory to learn a
non-parametric system representation and then adopts a perturbation-based
controller updated by online gradient descent. We prove that our algorithm
guarantees an $\tmO(T^{2/3})$ regret bound with high probability, which matches
the best-known regret bound for this problem. Furthermore, we extend our
algorithm and performance guarantee to the cases with output feedback.
Related papers
- Data-Guided Regulator for Adaptive Nonlinear Control [0.27195102129094995]
This paper addresses the problem of a data-driven feedback controller for complex nonlinear systems.
The goal is to achieve finite-time regulation of system states through direct policy updates.
arXiv Detail & Related papers (2023-11-20T23:02:39Z) - Learning to Control under Time-Varying Environment [18.48729114775298]
This paper investigates the problem of regret in linear time-varying (LTV) dynamical systems.
We propose the first computationally tractable online algorithm with regret guarantees.
arXiv Detail & Related papers (2022-06-06T11:40:46Z) - Online Control of Unknown Time-Varying Dynamical Systems [48.75672260851758]
We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model.
We study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies.
arXiv Detail & Related papers (2022-02-16T06:57:14Z) - Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z) - Non-Episodic Learning for Online LQR of Unknown Linear Gaussian System [0.0]
We propose an online non-episodic algorithm that gains knowledge about the system from a single trajectory.
We characterize the almost sure convergence rates of identification and control, and reveal an optimal trade-off between exploration and exploitation.
arXiv Detail & Related papers (2021-03-24T15:51:28Z) - Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z) - Non-Stochastic Control with Bandit Feedback [30.33117611898598]
We give an efficient sublinear regret algorithm for a known or unknown system.
The main algorithmic difficulty is the dependence of the loss on past controls.
We propose an efficient algorithm for the general setting of bandit convex optimization for loss functions with memory.
arXiv Detail & Related papers (2020-08-12T18:40:00Z) - Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $textpolylogleft(Tright)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z) - Logarithmic Regret for Adversarial Online Control [56.12283443161479]
We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences.
Our algorithm and analysis use a characterization for the offline control law to reduce the online control problem to (delayed) online learning.
arXiv Detail & Related papers (2020-02-29T06:29:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.