A Kernel-Based Approach to Non-Stationary Reinforcement Learning in
Metric Spaces
- URL: http://arxiv.org/abs/2007.05078v2
- Date: Wed, 23 Mar 2022 20:21:47 GMT
- Title: A Kernel-Based Approach to Non-Stationary Reinforcement Learning in
Metric Spaces
- Authors: Omar Darwiche Domingues, Pierre M\'enard, Matteo Pirotta, Emilie
Kaufmann, Michal Valko
- Abstract summary: KeRNS is an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes.
We prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time.
- Score: 53.47210316424326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose KeRNS: an algorithm for episodic reinforcement
learning in non-stationary Markov Decision Processes (MDPs) whose state-action
set is endowed with a metric. Using a non-parametric model of the MDP built
with time-dependent kernels, we prove a regret bound that scales with the
covering dimension of the state-action space and the total variation of the MDP
with time, which quantifies its level of non-stationarity. Our method
generalizes previous approaches based on sliding windows and exponential
discounting used to handle changing environments. We further propose a
practical implementation of KeRNS, we analyze its regret and validate it
experimentally.
Related papers
- Fast Value Tracking for Deep Reinforcement Learning [7.648784748888187]
Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment.
Existing algorithms often view these problem as static, focusing on point estimates for model parameters to maximize expected rewards.
Our research leverages the Kalman paradigm to introduce a novel quantification and sampling algorithm called Langevinized Kalman TemporalTD.
arXiv Detail & Related papers (2024-03-19T22:18:19Z) - Free-Form Variational Inference for Gaussian Process State-Space Models [21.644570034208506]
We propose a new method for inference in Bayesian GPSSMs.
Our method is based on freeform variational inference via inducing Hamiltonian Monte Carlo.
We show that our approach can learn transition dynamics and latent states more accurately than competing methods.
arXiv Detail & Related papers (2023-02-20T11:34:16Z) - STEERING: Stein Information Directed Exploration for Model-Based
Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal.
Based on KSD, we develop a novel algorithm algo: textbfSTEin information dirtextbfEcted exploration for model-based textbfReinforcement LearntextbfING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z) - FaDIn: Fast Discretized Inference for Hawkes Processes with General
Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG)
Results show that the proposed approach leads to an improved estimation of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z) - Posterior Coreset Construction with Kernelized Stein Discrepancy for
Model-Based Reinforcement Learning [78.30395044401321]
We develop a novel model-based approach to reinforcement learning (MBRL)
It relaxes the assumptions on the target transition model to belong to a generic family of mixture models.
It can achieve up-to 50 percent reduction in wall clock time in some continuous control environments.
arXiv Detail & Related papers (2022-06-02T17:27:49Z) - Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z) - Expert-Guided Symmetry Detection in Markov Decision Processes [0.0]
We propose a paradigm that aims to detect the presence of some transformations of the state-action space for which the MDP dynamics is invariant.
The results show that the model distributional shift is reduced when the dataset is augmented with the data obtained by using the detected symmetries.
arXiv Detail & Related papers (2021-11-19T16:12:30Z) - Variational Inference for Continuous-Time Switching Dynamical Systems [29.984955043675157]
We present a model based on an Markov jump process modulating a subordinated diffusion process.
We develop a new continuous-time variational inference algorithm.
We extensively evaluate our algorithm under the model assumption and for real-world examples.
arXiv Detail & Related papers (2021-09-29T15:19:51Z) - Continuous-Time Model-Based Reinforcement Learning [4.427447378048202]
We propose a continuous-time MBRL framework based on a novel actor-critic method.
We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems.
arXiv Detail & Related papers (2021-02-09T11:30:19Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.