Closed-loop Parameter Identification of Linear Dynamical Systems through
the Lens of Feedback Channel Coding Theory
- URL: http://arxiv.org/abs/2003.12548v1
- Date: Fri, 27 Mar 2020 17:30:10 GMT
- Title: Closed-loop Parameter Identification of Linear Dynamical Systems through
the Lens of Feedback Channel Coding Theory
- Authors: Ali Reza Pedram and Takashi Tanaka
- Abstract summary: This paper considers the problem of closed-loop identification of linear scalar systems with Gaussian process noise.
We show that the learning rate is fundamentally upper bounded by the capacity of the corresponding AWGN channel.
Although the optimal design of the feedback policy remains challenging, we derive conditions under which the upper bound is achieved.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the problem of closed-loop identification of linear
scalar systems with Gaussian process noise, where the system input is
determined by a deterministic state feedback policy. The regularized
least-square estimate (LSE) algorithm is adopted, seeking to find the best
estimate of unknown model parameters based on noiseless measurements of the
state. We are interested in the fundamental limitation of the rate at which
unknown parameters can be learned, in the sense of the D-optimality
scalarization criterion subject to a quadratic control cost. We first establish
a novel connection between a closed-loop identification problem of interest and
a channel coding problem involving an additive white Gaussian noise (AWGN)
channel with feedback and a certain structural constraint. Based on this
connection, we show that the learning rate is fundamentally upper bounded by
the capacity of the corresponding AWGN channel. Although the optimal design of
the feedback policy remains challenging, we derive conditions under which the
upper bound is achieved. Finally, we show that the obtained upper bound implies
that super-linear convergence is unattainable for any choice of the policy.
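As a rough illustration of the setting described in the abstract, the sketch below simulates a scalar linear system with Gaussian process noise under a deterministic state feedback policy and computes a regularized least-squares estimate of the two unknown parameters, tracking the log-determinant of the regularized information matrix (the quantity a D-optimality criterion scores). Everything specific here is an assumption made for illustration and is not taken from the paper: the model form x_{t+1} = a x_t + b u_t + w_t, the parameter values, the policy u_t = -k x_t + sin(0.5 t), the regularization weight `lam`, and the function name `simulate_and_estimate`.

```python
import numpy as np

def simulate_and_estimate(a=0.8, b=1.0, k=0.5, sigma=1.0, lam=1.0, T=5000, seed=0):
    """Hypothetical sketch: regularized LSE for x_{t+1} = a*x_t + b*u_t + w_t.

    The policy below is a deterministic function of the state and time; the
    sinusoidal term keeps the regressors (x_t, u_t) from being collinear.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    gram = lam * np.eye(2)   # regularized information matrix: lam*I + sum(phi phi^T)
    cross = np.zeros(2)      # sum(phi * x_next)
    log_dets = []
    for t in range(T):
        u = -k * x + np.sin(0.5 * t)              # assumed feedback policy
        x_next = a * x + b * u + sigma * rng.standard_normal()
        phi = np.array([x, u])                    # regressor for theta = (a, b)
        gram += np.outer(phi, phi)
        cross += phi * x_next
        # slogdet returns (sign, log|det|); the matrix is positive definite here
        log_dets.append(np.linalg.slogdet(gram)[1])
        x = x_next
    theta_hat = np.linalg.solve(gram, cross)      # regularized least-squares estimate
    return theta_hat, np.array(log_dets)

if __name__ == "__main__":
    theta_hat, log_dets = simulate_and_estimate()
    print("estimate of (a, b):", theta_hat)
    print("final log det of information matrix:", log_dets[-1])
```

The log-determinant sequence is non-decreasing because each step adds a positive semidefinite rank-one term to the information matrix. How fast it can grow under a control cost constraint is the question the abstract addresses, and this sketch does not attempt to reproduce the paper's bound.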
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- A least-square method for non-asymptotic identification in linear switching control [17.938732931331064]
It is known that the underlying partially-observed linear dynamical system lies within a finite collection of known candidate models.
We characterize the finite-time sample complexity of this problem by leveraging recent advances in the non-asymptotic analysis of linear least-square methods.
We propose a data-driven switching strategy that identifies the unknown parameters of the underlying system.
arXiv Detail & Related papers (2024-04-11T20:55:38Z)
- Towards Model-Free LQR Control over Rate-Limited Channels [2.908482270923597]
We study a setting where a worker agent transmits quantized policy gradients (of the LQR cost) to a server over a noiseless channel with a finite bit-rate.
We propose a new algorithm titled Adaptively Quantized Gradient Descent (AQGD), and prove that above a certain finite threshold bit-rate, AQGD guarantees exponentially fast convergence to the globally optimal policy.
arXiv Detail & Related papers (2024-01-02T15:59:00Z)
- Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online oracle that combines the certainty-equivalence principle and polytopic tubes.
We analyze the regret of the algorithm, when compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violation in policy search tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- A Priori Denoising Strategies for Sparse Identification of Nonlinear Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z)
- Formal Verification of Stochastic Systems with ReLU Neural Network Controllers [22.68044012584378]
We address the problem of formal safety verification for cyber-physical systems equipped with ReLU neural network (NN) controllers.
Our goal is to find the set of initial states from where, with a predetermined confidence, the system will not reach an unsafe configuration.
arXiv Detail & Related papers (2021-03-08T23:53:13Z)
- Correct-by-construction reach-avoid control of partially observable linear stochastic systems [7.912008109232803]
We formalize a robust feedback controller for reach-avoid control of discrete-time, linear time-invariant systems.
The problem is to compute a controller that satisfies the required reach-avoid specification.
arXiv Detail & Related papers (2021-03-03T13:46:52Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem [27.09339991866556]
Model-free reinforcement learning searches for an optimal control for an unknown dynamical system by directly searching over the corresponding space of controllers.
We take a step towards demystifying the performance and efficiency of such methods by focusing on the ODE that governs the gradient-flow dynamics over the set of stabilizing feedback gains; a similar result holds for the forward discretization of the ODE.
arXiv Detail & Related papers (2019-12-26T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.