Learning the Kalman Filter with Fine-Grained Sample Complexity
- URL: http://arxiv.org/abs/2301.12624v1
- Date: Mon, 30 Jan 2023 02:41:18 GMT
- Title: Learning the Kalman Filter with Fine-Grained Sample Complexity
- Authors: Xiangyuan Zhang, Bin Hu, Tamer Başar
- Abstract summary: We develop the first end-to-end sample complexity of model-free policy gradient (PG) methods in discrete-time infinite-horizon Kalman filtering.
Our results shed light on applying model-free PG methods to control a linear dynamical system where the state measurements could be corrupted by statistical noises and other (possibly adversarial) disturbances.
- Score: 4.301206378997673
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop the first end-to-end sample complexity of model-free policy
gradient (PG) methods in discrete-time infinite-horizon Kalman filtering.
Specifically, we introduce the receding-horizon policy gradient (RHPG-KF)
framework and demonstrate $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample
complexity for RHPG-KF in learning a stabilizing filter that is
$\epsilon$-close to the optimal Kalman filter. Notably, the proposed RHPG-KF
framework does not require the system to be open-loop stable nor assume any
prior knowledge of a stabilizing filter. Our results shed light on applying
model-free PG methods to control a linear dynamical system where the state
measurements could be corrupted by statistical noises and other (possibly
adversarial) disturbances.
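The abstract states the framework only at a high level, so the snippet below is a minimal, hypothetical sketch of the general recipe it alludes to: tune a steady-state filter gain purely from simulated rollouts, using two-point zeroth-order estimates of the gradient of an observable surrogate cost (the average squared innovation). It is not the authors' RHPG-KF procedure, which instead solves a sequence of receding-horizon subproblems; the system matrices, cost, and hyperparameters are invented for illustration.

```python
# Minimal illustrative sketch (NOT the authors' RHPG-KF procedure): learn a
# steady-state filter gain L for x_{t+1} = A x_t + w_t, y_t = C x_t + v_t by
# model-free policy gradient, using two-point zeroth-order estimates of the
# gradient of an observable surrogate cost (average squared innovation).
# A, C, Q, R and all hyperparameters below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])          # a stable A keeps this naive sketch well behaved;
C = np.array([[1.0, 0.0]])          # the paper itself does not require open-loop stability
Q = 0.1 * np.eye(2)                 # process-noise covariance
R = 0.05 * np.eye(1)                # measurement-noise covariance

def rollout_cost(L, T=200):
    """Average squared innovation of the filter xhat_{t+1} = A xhat_t + L (y_t - C xhat_t)."""
    x, xhat, cost = np.zeros(2), np.zeros(2), 0.0
    for _ in range(T):
        y = C @ x + rng.multivariate_normal(np.zeros(1), R)
        innov = y - C @ xhat
        cost += float(innov @ innov)
        xhat = A @ xhat + L @ innov
        x = A @ x + rng.multivariate_normal(np.zeros(2), Q)
    return cost / T

L = np.zeros((2, 1))                # filter gain to be learned
step, radius = 1e-3, 0.05           # step size and smoothing radius (tuning assumed)
for _ in range(2000):
    u = rng.standard_normal(L.shape)
    u /= np.linalg.norm(u)          # random direction on the unit sphere
    # two-point (antithetic) zeroth-order gradient estimate of the rollout cost
    g = L.size * (rollout_cost(L + radius * u) - rollout_cost(L - radius * u)) / (2 * radius) * u
    L -= step * g

print("learned filter gain L:\n", L)
```

The two-point gradient estimator used here is the same device described in the Zeroth-order Deterministic Policy Gradient entry under Related papers below.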
Related papers
- Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z)
- Global Convergence of Receding-Horizon Policy Search in Learning Estimator Designs [3.0811185425377743]
We introduce the receding-horizon policy estimator (RHPG) algorithm.
RHPG is the first algorithm with provable global convergence in learning the optimal linear policy estimator.
arXiv Detail & Related papers (2023-09-09T16:03:49Z)
- Revisiting LQR Control from the Perspective of Receding-Horizon Policy Gradient [2.1756081703276]
We revisit the discrete-time linear quadratic regulator (LQR) problem from the perspective of receding-horizon policy gradient (RHPG)
We provide a fine-grained sample complexity analysis for RHPG to learn a control policy that is both stabilizing and $\epsilon$-close to the optimal LQR solution.
arXiv Detail & Related papers (2023-02-25T19:16:40Z)
- Globally Convergent Policy Search over Dynamic Filters for Output Estimation [64.90951294952094]
We introduce the first direct policy search algorithm which converges to the globally optimal $\textit{dynamic}$ filter.
We show that informativity overcomes the aforementioned degeneracy.
arXiv Detail & Related papers (2022-02-23T18:06:20Z)
- Inverse Extended Kalman Filter -- Part I: Fundamentals [19.078991171384015]
In this paper, we develop the theory of inverse extended Kalman filter (I-EKF) in detail.
We provide theoretical stability guarantees using both bounded non-linearity and unknown matrix approaches.
In the companion paper (Part II), we propose reproducing kernel Hilbert space-based EKF to handle incomplete system model information.
arXiv Detail & Related papers (2022-01-05T10:56:58Z)
- Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that this resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Zeroth-order Deterministic Policy Gradient [116.87117204825105]
We introduce Zeroth-order Deterministic Policy Gradient (ZDPG)
ZDPG approximates policy-reward gradients via two-point evaluations of the $Q$-function.
New finite sample complexity bounds for ZDPG improve upon existing results by up to two orders of magnitude.
arXiv Detail & Related papers (2020-06-12T16:52:29Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Sample Complexity of Kalman Filtering for Unknown Systems [21.565920482293592]
We consider the task of designing a Kalman Filter (KF) for an unknown and partially observed autonomous linear time invariant system driven by process and sensor noise.
We show that when the system identification step produces sufficiently accurate estimates, a Certainty Equivalent (CE) KF enjoys provable sub-optimality guarantees.
arXiv Detail & Related papers (2019-12-27T19:00:42Z)
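As a reference point for the sketch above: when the model is known, the optimal gain that the abstract's $\epsilon$-closeness is measured against follows from the filter algebraic Riccati equation, and the certainty-equivalent design discussed in the last related paper applies the same formula to identified estimates of $A$ and $C$. A minimal sketch, reusing the hypothetical matrices from the earlier snippet:

```python
# Reference gain for the sketch above: with known (A, C, Q, R), the steady-state
# Kalman predictor gain comes from the filter algebraic Riccati equation, solved
# here via duality with the control ARE. The certainty-equivalent design mentioned
# in the last related paper plugs identified estimates of A and C into the same
# formula. Matrices are the hypothetical ones used earlier.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = 0.05 * np.eye(1)

P = solve_discrete_are(A.T, C.T, Q, R)                  # predicted-error covariance
L_opt = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)    # steady-state predictor gain
print("optimal Kalman gain L*:\n", L_opt)
```

Comparing the learned gain from the first sketch against this $L^*$ gives a concrete reading of what "$\epsilon$-close to the optimal Kalman filter" means.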