Identification and Adaptive Control of Markov Jump Systems: Sample
Complexity and Regret Bounds
- URL: http://arxiv.org/abs/2111.07018v1
- Date: Sat, 13 Nov 2021 02:38:13 GMT
- Title: Identification and Adaptive Control of Markov Jump Systems: Sample
Complexity and Regret Bounds
- Authors: Yahya Sattar and Zhe Du and Davoud Ataee Tarzanagh and Laura Balzano
and Necmiye Ozay and Samet Oymak
- Abstract summary: We consider the problem of controlling an unknown Markov jump linear system (MJS) to optimize a quadratic objective.
We first provide a system identification algorithm for MJS to learn the dynamics in each mode.
We then propose an adaptive control scheme that performs system identification together with certainty equivalent control.
- Score: 24.74448154832031
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning how to effectively control unknown dynamical systems is crucial for
intelligent autonomous systems. This task becomes a significant challenge when
the underlying dynamics are changing with time. Motivated by this challenge,
this paper considers the problem of controlling an unknown Markov jump linear
system (MJS) to optimize a quadratic objective. By taking a model-based
perspective, we consider identification-based adaptive control for MJSs. We
first provide a system identification algorithm for MJS to learn the dynamics
in each mode as well as the Markov transition matrix, underlying the evolution
of the mode switches, from a single trajectory of the system states, inputs,
and modes. Through mixing-time arguments, sample complexity of this algorithm
is shown to be $\mathcal{O}(1/\sqrt{T})$. We then propose an adaptive control
scheme that performs system identification together with certainty equivalent
control to adapt the controllers in an episodic fashion. Combining our sample
complexity results with recent perturbation results for certainty equivalent
control, we prove that when the episode lengths are appropriately chosen, the
proposed adaptive control scheme achieves $\mathcal{O}(\sqrt{T})$ regret, which
can be improved to $\mathcal{O}(polylog(T))$ with partial knowledge of the
system. Our proof strategy introduces innovations to handle Markovian jumps and
a weaker notion of stability common in MJSs. Our analysis provides insights
into system theoretic quantities that affect learning accuracy and control
performance. Numerical simulations are presented to further reinforce these
insights.
Related papers
- Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes.
We analyze the regret of the algorithm, when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z) - Formal Controller Synthesis for Markov Jump Linear Systems with
Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z) - Mode Reduction for Markov Jump Systems [8.450188319487989]
We consider Markov jump linear systems (MJSs), a special class of switched systems where the active mode switches according to a Markov chain.
Inspired by clustering techniques from unsupervised learning, we are able to construct a reduced MJS with fewer modes.
We show how one can use the reduced MJS to analyze stability and design controllers with significant reduction in computational cost.
arXiv Detail & Related papers (2022-05-05T15:06:10Z) - Finite-time System Identification and Adaptive Control in Autoregressive
Exogenous Systems [79.67879934935661]
We study the problem of system identification and adaptive control of unknown ARX systems.
We provide finite-time learning guarantees for the ARX systems under both open-loop and closed-loop data collection.
arXiv Detail & Related papers (2021-08-26T18:00:00Z) - Certainty Equivalent Quadratic Control for Markov Jump Systems [24.744481548320305]
We investigate robustness aspects of certainty equivalent model-based optimal control for MJS with quadratic cost function.
We provide explicit perturbation bounds which decay as $mathcalO(epsilon + eta)$ and $mathcalO((epsilon + eta)2)$ respectively.
arXiv Detail & Related papers (2021-05-26T06:45:47Z) - A Novel Anomaly Detection Algorithm for Hybrid Production Systems based
on Deep Learning and Timed Automata [73.38551379469533]
DAD:DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems.
It combines deep learning and timed automata for creating behavioral model from observations.
The algorithm has been applied to few data sets including two from real systems and has shown promising results.
arXiv Detail & Related papers (2020-10-29T08:27:43Z) - Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $textpolylogleft(Tright)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.