Efficient Exploration in Continuous-time Model-based Reinforcement
Learning
- URL: http://arxiv.org/abs/2310.19848v1
- Date: Mon, 30 Oct 2023 15:04:40 GMT
- Title: Efficient Exploration in Continuous-time Model-based Reinforcement
Learning
- Authors: Lenart Treven, Jonas Hübotter, Bhavya Sukhija, Florian Dörfler,
Andreas Krause
- Abstract summary: Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time.
We introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics.
- Score: 37.14026153342745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning algorithms typically consider discrete-time dynamics,
even though the underlying systems are often continuous in time. In this paper,
we introduce a model-based reinforcement learning algorithm that represents
continuous-time dynamics using nonlinear ordinary differential equations
(ODEs). We capture epistemic uncertainty using well-calibrated probabilistic
models, and use the optimistic principle for exploration. Our regret bounds
surface the importance of the measurement selection strategy (MSS), since in
continuous time we not only must decide how to explore, but also when to
observe the underlying system. Our analysis demonstrates that the regret is
sublinear when modeling ODEs with Gaussian Processes (GP) for common choices of
MSS, such as equidistant sampling. Additionally, we propose an adaptive,
data-dependent, practical MSS that, when combined with GP dynamics, also
achieves sublinear regret with significantly fewer samples. We showcase the
benefits of continuous-time modeling over its discrete-time counterpart, as
well as our proposed adaptive MSS over standard baselines, on several
applications.
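The distinction the abstract draws between equidistant and adaptive measurement selection can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic greedy uncertainty-sampling heuristic, assuming a one-dimensional state, a unit-variance RBF kernel, and a planned trajectory given on a time grid. All function names and parameters (`rbf_kernel`, `posterior_variance`, `equidistant_mss`, `greedy_uncertainty_mss`, `lengthscale`, `noise`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def posterior_variance(x_query, x_obs, noise=1e-2):
    """GP posterior variance at x_query given noisy observations at x_obs."""
    prior = np.ones(len(x_query))  # k(x, x) = 1 for this RBF kernel
    if len(x_obs) == 0:
        return prior
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    solved = np.linalg.solve(k_oo, k_qo.T)  # shape (n_obs, n_query)
    return prior - np.sum(k_qo * solved.T, axis=1)


def equidistant_mss(horizon, num_measurements):
    """Measurement times spaced evenly over [0, horizon]."""
    return np.linspace(0.0, horizon, num_measurements)


def greedy_uncertainty_mss(times, states, x_obs, num_measurements):
    """Greedily pick the times whose planned states are most uncertain.

    Assumption: GP variance does not depend on observed values, so the
    posterior can be updated before the measurements are actually taken.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    available = np.ones(len(times), dtype=bool)
    chosen = []
    for _ in range(num_measurements):
        var = posterior_variance(states, x_obs)
        var = np.where(available, var, -np.inf)  # never pick a time twice
        idx = int(np.argmax(var))
        available[idx] = False
        chosen.append(times[idx])
        x_obs = np.append(x_obs, states[idx])
    return np.sort(np.array(chosen))


# Hypothetical planned trajectory x(t) = t on [0, 1], with one prior
# observation at state 0.0.
times = np.linspace(0.0, 1.0, 101)
states = times.copy()
print(equidistant_mss(1.0, 3))
print(greedy_uncertainty_mss(times, states, [0.0], 3))
```

In this toy setting the greedy rule places its first measurement where the planned state is farthest from the existing data, whereas the equidistant rule ignores what is already known. The paper's adaptive MSS and its regret analysis are more involved than this heuristic.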
Related papers
- Recursive Learning of Asymptotic Variational Objectives [49.69399307452126]
General state-space models (SSMs) are widely used in statistical machine learning and are among the most classical generative models for sequential time-series data.
Online sequential IWAE (OSIWAE) allows for online learning of both model parameters and a Markovian recognition model for inferring latent states.
This approach is more theoretically well-founded than recently proposed online variational SMC methods.
arXiv Detail & Related papers (2024-11-04T16:12:37Z)
- A Poisson-Gamma Dynamic Factor Model with Time-Varying Transition Dynamics [51.147876395589925]
A non-stationary PGDS is proposed to allow the underlying transition matrices to evolve over time.
A fully-conjugate and efficient Gibbs sampler is developed to perform posterior simulation.
Experiments show that, in comparison with related models, the proposed non-stationary PGDS achieves improved predictive performance.
arXiv Detail & Related papers (2024-02-26T04:39:01Z)
- Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning [5.8184275610981615]
We use entropy to measure the cost of exploration to derive the optimal investment strategy.
We design the corresponding reinforcement learning algorithm.
Our model exhibits better applicability when analyzing real-world data than the continuous-time model.
arXiv Detail & Related papers (2023-12-24T02:08:49Z)
- Exact Inference for Continuous-Time Gaussian Process Dynamics [6.941863788146731]
In practice, the true system is often unknown and has to be learned from measurement data.
Most methods in Gaussian process (GP) dynamics model learning are trained on one-step ahead predictions.
We show how to derive flexible inference schemes for these types of evaluations.
arXiv Detail & Related papers (2023-09-05T16:07:00Z)
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectories, human motion, driving scenes, traffic flow, and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- Learning the Dynamics of Sparsely Observed Interacting Systems [0.6021787236982659]
We address the problem of learning the dynamics of an unknown non-parametric system linking a target and a feature time series.
By leveraging the rich theory of signatures, we are able to cast this non-linear problem as a high-dimensional linear regression.
arXiv Detail & Related papers (2023-01-27T10:48:28Z)
- Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems [26.744964200606784]
We propose a novel inference algorithm utilizing a Markov Chain Monte Carlo approach.
The presented Gibbs sampler efficiently obtains samples from the exact continuous-time posterior processes.
arXiv Detail & Related papers (2022-05-18T09:03:00Z)
- Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
arXiv Detail & Related papers (2022-02-17T07:56:46Z)
- Deep Efficient Continuous Manifold Learning for Time Series Modeling [11.876985348588477]
Symmetric positive definite matrices are studied in computer vision, signal processing, and medical image analysis.
In this paper, we propose a framework to exploit a diffeomorphism mapping between a Riemannian manifold and a Cholesky space.
For dynamic modeling of time-series data, we devise a continuous manifold learning method by systematically integrating a manifold ordinary differential equation and a gated recurrent neural network.
arXiv Detail & Related papers (2021-12-03T01:38:38Z)
- Consistency of mechanistic causal discovery in continuous-time using Neural ODEs [85.7910042199734]
We consider causal discovery in continuous-time for the study of dynamical systems.
We propose a causal discovery algorithm based on penalized Neural ODEs.
arXiv Detail & Related papers (2021-05-06T08:48:02Z)
- Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.