Model-based Reinforcement Learning for Semi-Markov Decision Processes
with Neural ODEs
- URL: http://arxiv.org/abs/2006.16210v2
- Date: Sun, 25 Oct 2020 05:55:05 GMT
- Title: Model-based Reinforcement Learning for Semi-Markov Decision Processes
with Neural ODEs
- Authors: Jianzhun Du, Joseph Futoma, Finale Doshi-Velez
- Abstract summary: We present two solutions for modeling continuous-time dynamics using neural ordinary differential equations (ODEs).
Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data.
We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
- Score: 30.36381338938319
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present two elegant solutions for modeling continuous-time dynamics, in a
novel model-based reinforcement learning (RL) framework for semi-Markov
decision processes (SMDPs), using neural ordinary differential equations
(ODEs). Our models accurately characterize continuous-time dynamics and enable
us to develop high-performing policies using a small amount of data. We also
develop a model-based approach for optimizing time schedules to reduce
interaction rates with the environment while maintaining near-optimal
performance, which is not possible for model-free methods. We experimentally
demonstrate the efficacy of our methods across various continuous-time domains.
Related papers
- Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
(2023-09-05T13:42:11Z)
- Learning Space-Time Continuous Neural PDEs from Partially Observed States [13.01244901400942]
We introduce a grid-independent model learning partial differential equations (PDEs) from noisy and partial observations on irregular grids.
We propose a space-time continuous latent neural PDE model with an efficient probabilistic framework and a novel design encoder for improved data efficiency and grid independence.
(2023-07-09T06:53:59Z)
- Learning PDE Solution Operator for Continuous Modeling of Time-Series [1.39661494747879]
This work presents a partial differential equation (PDE) based framework which improves the dynamics modeling capability.
We propose a neural operator that can handle time continuously without requiring iterative operations or specific grids of temporal discretization.
Our framework opens up a new way for a continuous representation of neural networks that can be readily adopted for real-world applications.
(2023-02-02T03:47:52Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
(2022-12-17T00:26:31Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
(2022-10-15T17:57:43Z)
- Homotopy-based training of NeuralODEs for accurate dynamics discovery [0.0]
We develop a new training method for NeuralODEs, based on synchronization and homotopy optimization.
We show that synchronizing the model dynamics and the training data tames the originally irregular loss landscape.
Our method achieves competitive or better training loss while often requiring less than half the number of training epochs.
(2022-10-04T06:32:45Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
(2021-06-25T22:08:51Z)
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
(2021-02-16T17:21:55Z)
- DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control [0.0]
We present a novel approach that captures the underlying dynamics of a system by incorporating control in a neural ordinary differential equation framework.
Results indicate that a simple DyNODE architecture when combined with an actor-critic reinforcement learning algorithm outperforms canonical neural networks.
(2020-09-09T12:56:58Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
(2020-07-11T19:44:09Z)
- Model-Augmented Actor-Critic: Backpropagating through Paths [81.86992776864729]
Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator.
We show how to make more effective use of the model by exploiting its differentiability.
(2020-05-16T19:18:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.