ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
- URL: http://arxiv.org/abs/2309.14078v2
- Date: Sun, 29 Oct 2023 12:30:40 GMT
- Title: ODE-based Recurrent Model-free Reinforcement Learning for POMDPs
- Authors: Xuanle Zhao, Duzhen Zhang, Liyuan Han, Tielin Zhang, Bo Xu
- Abstract summary: We present a novel ODE-based recurrent model combined with a model-free reinforcement learning framework to solve POMDPs.
We experimentally demonstrate the efficacy of our method across various partially observable (PO) continuous control and meta-RL tasks.
Our experiments illustrate that our method is robust against irregular observations, owing to the ability of ODEs to model irregularly-sampled time series.
- Score: 15.030970899252601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural ordinary differential equations (ODEs) are widely recognized as the
standard for modeling physical mechanisms, which help to perform approximate
inference in unknown physical or biological environments. In partially
observable (PO) environments, inferring unseen information from raw
observations is a key challenge for agents. By using a recurrent policy with a
compact context, context-based reinforcement learning provides a flexible way
to extract unobservable information from historical transitions. To help the
agent extract more dynamics-related information, we present a novel ODE-based
recurrent model combined with a model-free reinforcement learning (RL)
framework to solve partially observable Markov decision processes (POMDPs). We
experimentally demonstrate the efficacy of our methods across various PO
continuous control and meta-RL tasks. Furthermore, our experiments illustrate
that our method is robust against irregular observations, owing to the ability
of ODEs to model irregularly-sampled time series.
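For intuition, below is a minimal sketch (in PyTorch, not the authors' released code) of the kind of ODE-based recurrent encoder the abstract describes: a learned vector field evolves the hidden state across the possibly irregular gap between observations, and a GRU cell folds in each new observation; the resulting context can then condition a model-free policy. The class name `OdeRnnEncoder`, the fixed-step Euler integrator, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class OdeRnnEncoder(nn.Module):
    """Hypothetical ODE-RNN context encoder: integrate dh/dt = f_theta(h)
    between observations, then apply a GRU update at each observation."""

    def __init__(self, obs_dim: int, latent_dim: int, euler_steps: int = 4):
        super().__init__()
        self.dynamics = nn.Sequential(              # learned vector field f_theta
            nn.Linear(latent_dim, latent_dim), nn.Tanh(),
            nn.Linear(latent_dim, latent_dim),
        )
        self.update = nn.GRUCell(obs_dim, latent_dim)
        self.euler_steps = euler_steps

    def forward(self, observations: torch.Tensor, timestamps: torch.Tensor):
        # observations: (T, B, obs_dim); timestamps: (T,), possibly irregular
        T, B, _ = observations.shape
        h = observations.new_zeros(B, self.update.hidden_size)
        prev_t = timestamps[0]
        contexts = []
        for t in range(T):
            dt = (timestamps[t] - prev_t) / self.euler_steps
            for _ in range(self.euler_steps):       # fixed-step Euler over the time gap
                h = h + dt * self.dynamics(h)
            h = self.update(observations[t], h)     # condition on the new observation
            contexts.append(h)
            prev_t = timestamps[t]
        return torch.stack(contexts)                # (T, B, latent_dim) context for a policy

# Example usage with irregularly spaced timestamps:
encoder = OdeRnnEncoder(obs_dim=17, latent_dim=64)
obs = torch.randn(10, 4, 17)                        # 10 steps, batch of 4
ts = torch.cumsum(torch.rand(10), dim=0)            # irregular observation times
context = encoder(obs, ts)                          # feed to an actor-critic head
```

Because the hidden state is integrated over the actual elapsed time between observations, an encoder of this form degrades gracefully under irregular sampling, which is consistent with the robustness claim in the abstract.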
Related papers
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
- Free-Form Variational Inference for Gaussian Process State-Space Models [21.644570034208506]
We propose a new method for inference in Bayesian GPSSMs.
Our method is based on free-form variational inference via inducing Hamiltonian Monte Carlo.
We show that our approach can learn transition dynamics and latent states more accurately than competing methods.
arXiv Detail & Related papers (2023-02-20T11:34:16Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Using scientific machine learning for experimental bifurcation analysis of dynamic systems [2.204918347869259]
This study focuses on training universal differential equation (UDE) models for physical nonlinear dynamical systems with limit cycles.
We consider examples where training data is generated by numerical simulations, and we also apply the proposed modelling concept to physical experiments.
We use both neural networks and Gaussian processes as universal approximators alongside the mechanistic models to give a critical assessment of the accuracy and robustness of the UDE modelling approach.
arXiv Detail & Related papers (2021-10-22T15:43:03Z)
- Provable RL with Exogenous Distractors via Multistep Inverse Dynamics [85.52408288789164]
Real-world applications of reinforcement learning (RL) require the agent to deal with high-dimensional observations such as those generated from a megapixel camera.
Prior work has addressed such problems with representation learning, through which the agent can provably extract endogenous, latent state information from raw observations.
However, such approaches can fail in the presence of temporally correlated noise in the observations.
arXiv Detail & Related papers (2021-10-17T15:21:27Z)
- Meta-learning using privileged information for dynamics [66.32254395574994]
We extend the Neural ODE Process model to use additional information within the Learning Using Privileged Information setting.
We validate our extension with experiments showing improved accuracy and calibration on simulated dynamics tasks.
arXiv Detail & Related papers (2021-04-29T12:18:02Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- Continuous-Time Model-Based Reinforcement Learning [4.427447378048202]
We propose a continuous-time MBRL framework based on a novel actor-critic method.
We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems.
arXiv Detail & Related papers (2021-02-09T11:30:19Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs [30.36381338938319]
We present two solutions for modeling continuous-time dynamics using neural ordinary differential equations (ODEs).
Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data.
We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
arXiv Detail & Related papers (2020-06-29T17:21:43Z)