Temporal Difference Learning with Continuous Time and State in the
Stochastic Setting
- URL: http://arxiv.org/abs/2202.07960v3
- Date: Wed, 7 Jun 2023 12:18:33 GMT
- Title: Temporal Difference Learning with Continuous Time and State in the
Stochastic Setting
- Authors: Ziad Kobeissi (SIERRA), Francis Bach (SIERRA, DI-ENS, PSL)
- Abstract summary: We consider the problem of continuous-time policy evaluation.
This consists in learning, through observations, the value function associated with an uncontrolled continuous-time stochastic dynamics and a reward function.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of continuous-time policy evaluation. This consists
in learning through observations the value function associated with an
uncontrolled continuous-time stochastic dynamics and a reward function. We
propose two original variants of the well-known TD(0) method using vanishing
time steps. One is model-free and the other is model-based. For both methods,
we prove theoretical convergence rates that we subsequently verify through
numerical simulations. Alternatively, those methods can be interpreted as novel
reinforcement learning approaches for approximating solutions of linear PDEs
(partial differential equations) or linear BSDEs (backward stochastic
differential equations).
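As a rough illustration only, the sketch below shows a model-free TD(0)-style update with a small time step h, in the spirit of the policy-evaluation setting above. The dynamics (an Ornstein-Uhlenbeck diffusion), the quadratic running reward, the linear feature parameterization, and the step-size choice are all illustrative assumptions; the paper's exact update rules, discretization, and step-size schedules differ and are analysed there.

```python
import numpy as np

# Minimal sketch: model-free TD(0)-style policy evaluation with a small time step h.
# Everything below (dynamics, reward, features, step sizes) is an illustrative assumption.

rng = np.random.default_rng(0)

h = 1e-3          # (vanishing) time step
beta = 1.0        # continuous-time discount rate
alpha = 0.5       # learning-rate constant (scaling with h is an assumption)
n_steps = 200_000

def features(x):
    # simple polynomial features for a linear value-function approximation
    return np.array([1.0, x, x**2])

def reward(x):
    # illustrative running reward
    return -x**2

theta = np.zeros(3)   # V(x) ~ theta . features(x)
x = 1.0

for _ in range(n_steps):
    # one Euler-Maruyama step of an uncontrolled OU diffusion dX = -X dt + 0.5 dW
    x_next = x - x * h + 0.5 * np.sqrt(h) * rng.standard_normal()

    # one-step temporal-difference error with discount factor exp(-beta * h)
    v, v_next = theta @ features(x), theta @ features(x_next)
    delta = reward(x) * h + np.exp(-beta * h) * v_next - v

    # semi-gradient TD(0) update of the linear parameters
    theta += alpha * delta * features(x)

    x = x_next

print("learned value-function parameters:", theta)
```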
Related papers
- Learning Controlled Stochastic Differential Equations [61.82896036131116]
This work proposes a novel method for estimating both drift and diffusion coefficients of continuous, multidimensional, nonlinear controlled stochastic differential equations with non-uniform diffusion.
We provide strong theoretical guarantees, including finite-sample bounds for the $L^2$, $L^\infty$, and risk metrics, with learning rates adaptive to the coefficients' regularity.
Our method is available as an open-source Python library.
arXiv Detail & Related papers (2024-11-04T11:09:58Z) - A Training-Free Conditional Diffusion Model for Learning Stochastic Dynamical Systems [10.820654486318336]
This study introduces a training-free conditional diffusion model for learning unknown stochastic differential equations (SDEs) from data.
The proposed approach addresses key challenges in computational efficiency and accuracy for modeling SDEs.
The learned models exhibit significant improvements in predicting both short-term and long-term behaviors of unknown systems.
arXiv Detail & Related papers (2024-10-04T03:07:36Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Semi-supervised Learning of Partial Differential Operators and Dynamical
Flows [68.77595310155365]
We present a novel method that combines a hyper-network solver with a Fourier Neural Operator architecture.
We test our method on various time evolution PDEs, including nonlinear fluid flows in one, two, and three spatial dimensions.
The results show that the new method improves the learning accuracy at the supervised time points and is able to interpolate the solutions to any intermediate time.
arXiv Detail & Related papers (2022-07-28T19:59:14Z) - Continuous-Time Modeling of Counterfactual Outcomes Using Neural
Controlled Differential Equations [84.42837346400151]
Estimating counterfactual outcomes over time has the potential to unlock personalized healthcare.
Existing causal inference approaches consider regular, discrete-time intervals between observations and treatment decisions.
We propose a controllable simulation environment based on a model of tumor growth for a range of scenarios.
arXiv Detail & Related papers (2022-06-16T17:15:15Z) - The Connection between Discrete- and Continuous-Time Descriptions of
Gaussian Continuous Processes [60.35125735474386]
We show that discretizations yielding consistent estimators have the property of 'invariance under coarse-graining'.
This result explains why combining differencing schemes for derivative reconstruction with local-in-time inference approaches does not work for time-series analysis of second- or higher-order differential equations.
arXiv Detail & Related papers (2021-01-16T17:11:02Z) - Training Generative Adversarial Networks by Solving Ordinary
Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z) - Identifying Latent Stochastic Differential Equations [29.103393300261587]
We present a method for learning latent stochastic differential equations (SDEs) from high-dimensional time series data.
The proposed method learns the mapping from ambient to latent space, and the underlying SDE coefficients, through a self-supervised learning approach.
We validate the method through several simulated video processing tasks, where the underlying SDE is known, and through real world datasets.
arXiv Detail & Related papers (2020-07-12T19:46:31Z) - Stochastic Differential Equations with Variational Wishart Diffusions [18.590352916158093]
We present a non-parametric way of inferring stochastic differential equations for both regression tasks and continuous-time dynamical modelling.
The work places particular emphasis on the stochastic part of the differential equation, also known as the diffusion, and models it by means of Wishart processes.
arXiv Detail & Related papers (2020-06-26T10:21:35Z) - Learning continuous-time PDEs from sparse data with graph neural
networks [10.259254824702555]
We propose a continuous-time differential model for dynamical systems whose governing equations are parameterized by message passing graph neural networks.
We demonstrate the model's ability to work with unstructured grids, arbitrary time steps, and noisy observations.
We compare our method with existing approaches on several well-known physical systems that involve first- and higher-order PDEs, achieving state-of-the-art predictive performance.
arXiv Detail & Related papers (2020-06-16T07:15:40Z)