Lyapunov-Based Reinforcement Learning State Estimator
- URL: http://arxiv.org/abs/2010.13529v2
- Date: Thu, 7 Jan 2021 16:28:14 GMT
- Title: Lyapunov-Based Reinforcement Learning State Estimator
- Authors: Liang Hu, Chengwei Wu, Wei Pan
- Abstract summary: We consider the state estimation problem for nonlinear discrete-time systems.
We combine Lyapunov's method in control theory and deep reinforcement learning to design the state estimator.
An actor-critic reinforcement learning algorithm is proposed to learn the state estimator approximated by a deep neural network.
- Score: 9.356469388299928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the state estimation problem for nonlinear
stochastic discrete-time systems. We combine Lyapunov's method in control
theory and deep reinforcement learning to design the state estimator. We
theoretically prove the convergence of the bounded estimate error solely using
the data simulated from the model. An actor-critic reinforcement learning
algorithm is proposed to learn the state estimator approximated by a deep
neural network. The convergence of the algorithm is analysed. The proposed
Lyapunov-based reinforcement learning state estimator is compared with a number
of existing nonlinear filtering methods through Monte Carlo simulations,
showing its advantage in terms of estimate convergence even under some system
uncertainties such as covariance shift in system noise and randomly missing
measurements. To the best of our knowledge, this is the first reinforcement
learning based nonlinear state estimator with bounded estimate error
performance guarantee.
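The training signal described in the abstract, simulated model data plus a Lyapunov decrease condition on the estimation error, can be illustrated with a deliberately simplified sketch. Here the deep actor network is replaced by a single scalar correction gain, the Lyapunov candidate is V(e) = e^2, and "learning" is reduced to a grid search; the system functions, noise levels, and the helper `rollout_loss` are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def f(x):
    """Illustrative nonlinear state transition (not from the paper)."""
    return 0.8 * x + 0.2 * np.sin(x)

def rollout_loss(k, seed=0, T=200, n_traj=20, q=0.05, r=0.1,
                 alpha=0.1, lam=1.0):
    """Average Lyapunov value V(e) = e^2 over simulated trajectories,
    plus a penalty whenever V fails to decay by a factor (1 - alpha),
    mimicking the Lyapunov decrease condition enforced during training."""
    rng = np.random.default_rng(seed)          # common random numbers
    x = rng.normal(0.0, 1.0, n_traj)           # true states
    xh = np.zeros(n_traj)                      # estimates start at zero
    v_prev = (x - xh) ** 2
    loss = 0.0
    for _ in range(T):
        x = f(x) + q * rng.normal(size=n_traj)
        y = x + r * rng.normal(size=n_traj)    # identity measurement map
        xh_pred = f(xh)                        # predictor step
        xh = xh_pred + k * (y - xh_pred)       # gain k plays the actor's role
        v = (x - xh) ** 2
        loss += v.mean() + lam * np.maximum(v - (1 - alpha) * v_prev, 0).mean()
        v_prev = v
    return loss / T

# "Learning" reduced to a coarse grid search over the scalar gain.
gains = np.linspace(0.0, 1.0, 11)
k_star = gains[np.argmin([rollout_loss(k) for k in gains])]
```

In the paper both the estimator and the Lyapunov critic are deep neural networks trained by actor-critic updates; the single gain and grid search above only make the shape of the objective concrete.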
Related papers
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Sample-efficient estimation of entanglement entropy through supervised learning [0.0]
We put a particular focus on estimating aleatoric and epistemic uncertainty of the network's estimate.
We observe convergence in a regime of sample sizes in which the baseline method fails to give correct estimates.
As a further application of our method, highly relevant for quantum simulation experiments, we estimate the quantum mutual information for non-unitary evolution.
arXiv Detail & Related papers (2023-09-14T09:38:14Z)
- Online machine-learning forecast uncertainty estimation for sequential data assimilation [0.0]
Quantifying forecast uncertainty is a key aspect of state-of-the-art numerical weather prediction and data assimilation systems.
This work presents a machine learning method based on convolutional neural networks that estimates state-dependent forecast uncertainty.
The hybrid data assimilation method performs similarly to the ensemble Kalman filter, outperforming it when the ensembles are relatively small.
arXiv Detail & Related papers (2023-05-12T19:23:21Z)
- Neural State-Space Models: Empirical Evaluation of Uncertainty Quantification [0.0]
This paper presents preliminary results on uncertainty quantification for system identification with neural state-space models.
We frame the learning problem in a Bayesian probabilistic setting and obtain posterior distributions for the neural network's weights and outputs.
Based on the posterior, we construct credible intervals on the outputs and define a surprise index which can effectively diagnose usage of the model in a potentially dangerous out-of-distribution regime.
arXiv Detail & Related papers (2023-04-13T08:57:33Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity [51.476337785345436]
We study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes.
A variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity.
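The pessimism principle behind this line of work, penalizing Q-values at poorly covered state-action pairs, can be made concrete with a tabular sketch. The toy MDP, the count-based penalty beta/sqrt(N), and all constants below are illustrative assumptions; this is not the variance-reduced algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy offline dataset from a 2-state, 2-action MDP; the behavior
# policy rarely logs action 1, so its Q-values are poorly covered.
dataset = []
for _ in range(500):
    s = int(rng.integers(2))
    a = int(rng.choice(2, p=[0.9, 0.1]))
    r = 1.0 if (s == 1 and a == 0) else 0.1
    s_next = 1 if a == 0 else 0
    dataset.append((s, a, r, s_next))

gamma, lr, beta = 0.9, 0.1, 1.0
Q = np.zeros((2, 2))
N = np.zeros((2, 2))
for s, a, _, _ in dataset:
    N[s, a] += 1

for _ in range(300):                       # sweeps over the fixed dataset
    for s, a, r, s_next in dataset:
        penalty = beta / np.sqrt(N[s, a])  # pessimism shrinks with coverage
        target = r - penalty + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
```

Because the penalty decays with the visitation count, the learned policy is steered toward actions the offline data actually supports.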
arXiv Detail & Related papers (2022-02-28T15:39:36Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the minimum variance unbiased estimator (MVUE) among linear estimators in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
- The Aleatoric Uncertainty Estimation Using a Separate Formulation with Virtual Residuals [51.71066839337174]
Existing methods can quantify the error in the target estimation, but they tend to underestimate it.
We propose a new separable formulation for the estimation of a signal and of its uncertainty, avoiding the effect of overfitting.
We demonstrate that the proposed method outperforms a state-of-the-art technique for signal and uncertainty estimation.
arXiv Detail & Related papers (2020-11-03T12:11:27Z)
- Data Assimilation Networks [1.5545257664210517]
Data assimilation aims at forecasting the state of a dynamical system by combining a mathematical representation of the system with noisy observations.
We propose a fully data driven deep learning architecture generalizing recurrent Elman networks and data assimilation algorithms.
Our architecture achieves comparable performance to EnKF on both the analysis and the propagation of probability density functions of the system state at a given time without using any explicit regularization technique.
arXiv Detail & Related papers (2020-10-19T17:35:36Z)
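The classical baseline this entry compares against, a stochastic ensemble Kalman filter, fits in a few lines for a scalar linear-Gaussian system; the model, noise variances, and ensemble size below are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
F, Q, R = 0.95, 0.05, 0.2         # scalar dynamics, process/obs variances
n_ens, T = 50, 200

x = rng.normal()                  # true state
ens = rng.normal(size=n_ens)      # analysis ensemble
sq_errs = []
for _ in range(T):
    x = F * x + np.sqrt(Q) * rng.normal()
    y = x + np.sqrt(R) * rng.normal()
    # forecast: propagate each member with independent model noise
    ens = F * ens + np.sqrt(Q) * rng.normal(size=n_ens)
    # analysis: Kalman gain from the ensemble variance, perturbed observations
    P = ens.var(ddof=1)
    K = P / (P + R)
    ens = ens + K * (y + np.sqrt(R) * rng.normal(size=n_ens) - ens)
    sq_errs.append((ens.mean() - x) ** 2)

rmse = float(np.sqrt(np.mean(sq_errs)))
```

The ensemble mean serves as the state estimate, and the observation perturbations in the analysis step keep the ensemble spread statistically consistent with the true posterior variance.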
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.