Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks
- URL: http://arxiv.org/abs/2505.18591v1
- Date: Sat, 24 May 2025 08:38:10 GMT
- Title: Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks
- Authors: Joery A. de Vries, Jinke He, Mathijs M. de Weerdt, Matthijs T. J. Spaan
- Abstract summary: We show how one can augment a point estimate to give full distributions without modifying the base model architecture. Our method performs on par with variational baselines while using far fewer parameters.
- Score: 8.73717644648873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meta-reinforcement learning trains a single reinforcement learning agent on a distribution of tasks to quickly generalize to new tasks outside the training set at test time. From a Bayesian perspective, one can interpret this as performing amortized variational inference on the posterior distribution over training tasks. Among the various meta-reinforcement learning approaches, a common method is to represent this distribution with a point estimate using a recurrent neural network. We show how one can augment this point estimate to give full distributions through the Laplace approximation, either at the start of, during, or after learning, without modifying the base model architecture. With our approximation, we are able to estimate distribution statistics (e.g., the entropy) of non-Bayesian agents and observe that point-estimate based methods produce overconfident estimators that fail to satisfy consistency. Furthermore, when comparing our approach to full-distribution based learning of the task posterior, our method performs on par with variational baselines while using far fewer parameters.
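To make the core idea concrete, here is a minimal sketch of a diagonal Laplace approximation fitted around a trained point estimate, assuming a PyTorch model: squared gradients of the negative log-likelihood serve as a diagonal curvature proxy, giving a Gaussian posterior whose statistics (e.g., the entropy) can then be read off. The names `model`, `loader`, and `nll_loss` are placeholders, and this is not the authors' exact procedure.

```python
import torch

def diagonal_laplace(model, loader, nll_loss, prior_precision=1.0):
    """Fit a diagonal Laplace posterior around a trained model's parameters.

    Uses accumulated squared gradients of the NLL (an empirical-Fisher-style
    proxy) as the diagonal of the posterior precision.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    precision = [torch.full_like(p, prior_precision) for p in params]
    for inputs, targets in loader:
        loss = nll_loss(model(inputs), targets)
        grads = torch.autograd.grad(loss, params)
        for h, g in zip(precision, grads):
            h += g.detach() ** 2  # curvature proxy: squared gradients
    variances = [1.0 / h for h in precision]
    return [p.detach().clone() for p in params], variances

def sample_parameters(map_params, variances):
    """Draw one parameter sample from the Gaussian Laplace posterior."""
    return [m + v.sqrt() * torch.randn_like(m) for m, v in zip(map_params, variances)]

def posterior_entropy(variances):
    """Entropy (in nats) of the diagonal Gaussian posterior."""
    log_var = torch.cat([v.flatten().log() for v in variances])
    d = log_var.numel()
    return 0.5 * d * (1.0 + torch.log(torch.tensor(2.0 * torch.pi))) + 0.5 * log_var.sum()
```

Because the approximation is fitted post hoc around the existing parameters, the base recurrent architecture stays untouched, consistent with the paper's claim that the Laplace approximation can be applied at the start of, during, or after learning.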
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over meta-analysis-based methods as heterogeneity increases.
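For orientation, the basic inverse propensity score estimate that such estimators build on can be sketched in a few lines (illustrative only; the paper's collaborative, heterogeneity-aware estimator is more involved, and `propensity` would come from a fitted model):

```python
import numpy as np

def ipw_ate(y, t, propensity):
    """Inverse propensity weighted estimate of the average treatment effect.

    y: outcomes; t: binary treatment indicators; propensity: P(t=1 | x).
    """
    p = np.clip(propensity, 1e-3, 1 - 1e-3)  # clip to avoid extreme weights
    treated = np.mean(t * y / p)
    control = np.mean((1 - t) * y / (1 - p))
    return treated - control
```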
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
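A rough sketch of the local-linearisation idea, assuming a square (dimension-preserving) neural sampler `g` and a standard normal base variable `eps`; the paper's bounds handle the general case, which this toy version does not:

```python
import torch

def sampler_log_density(g, eps):
    """Approximate log-density of z = g(eps) by linearising g around eps
    and applying the change-of-variables formula."""
    d = eps.numel()
    base_logp = -0.5 * (eps @ eps + d * torch.log(torch.tensor(2.0 * torch.pi)))
    jac = torch.autograd.functional.jacobian(g, eps)  # (d, d) local linearisation
    _, logabsdet = torch.linalg.slogdet(jac)
    return base_logp - logabsdet
```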
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
- Variational Density Propagation Continual Learning [0.0]
Deep Neural Networks (DNNs) deployed to the real world are regularly subject to out-of-distribution (OoD) data.
This paper proposes a framework for adapting to data distribution drift modeled by benchmark Continual Learning datasets.
arXiv Detail & Related papers (2023-08-22T21:51:39Z)
- Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important for solving sequential decision-making tasks.
We study the problem from a model-based Bayesian reinforcement learning perspective.
We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
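EQR's quantile-regression component typically rests on the pinball loss; a minimal sketch, with hypothetical tensor shapes, is:

```python
import torch

def pinball_loss(pred_quantiles, target, taus):
    """Quantile regression (pinball) loss for learning return quantiles.

    pred_quantiles: (batch, n_quantiles); target: (batch, 1) Bellman targets;
    taus: (n_quantiles,) quantile levels in (0, 1).
    """
    diff = target - pred_quantiles
    return torch.mean(torch.maximum(taus * diff, (taus - 1.0) * diff))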
arXiv Detail & Related papers (2023-08-12T14:59:19Z)
- Normality-Guided Distributional Reinforcement Learning for Continuous Control [13.818149654692863]
Learning a predictive model of the mean return, or value function, plays a critical role in many reinforcement learning algorithms.
We study the value distribution in several continuous control tasks and find that the learned value distribution is empirically quite close to normal.
We propose a policy update strategy that weights updates by a correctness measure derived from structural characteristics of the value distribution that are absent from the standard value function.
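As an illustrative stand-in for that correctness measure (not the paper's exact construction), one could down-weight updates when sampled returns depart from normality:

```python
import torch

def normality_weight(value_samples):
    """Toy confidence weight from excess skewness/kurtosis of sampled returns;
    decays toward zero as the empirical distribution departs from normal."""
    x = (value_samples - value_samples.mean()) / (value_samples.std() + 1e-8)
    skew = (x ** 3).mean()
    excess_kurtosis = (x ** 4).mean() - 3.0
    return torch.exp(-(skew ** 2 + excess_kurtosis ** 2))
```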
arXiv Detail & Related papers (2022-08-28T02:52:10Z)
- Semantic Self-adaptation: Enhancing Generalization with a Single Sample [45.111358665370524]
We propose a self-adaptive approach for semantic segmentation.
It fine-tunes the parameters of convolutional layers to the input image using consistency regularization.
Our empirical study suggests that self-adaptation may complement the established practice of model regularization at training time.
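A minimal sketch of single-sample test-time adaptation with consistency regularization, assuming a PyTorch segmentation model and a stochastic `augment` function (both hypothetical; the paper restricts updates to the convolutional layers):

```python
import copy
import torch
import torch.nn.functional as F

def self_adapt(model, image, augment, steps=10, lr=1e-4):
    """Fine-tune a copy of the model on one image by enforcing consistency
    between predictions on two augmented views."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        logits_a = adapted(augment(image))
        logits_b = adapted(augment(image))
        # Consistency regularization: align the two predictive distributions.
        loss = F.kl_div(F.log_softmax(logits_a, dim=1),
                        F.softmax(logits_b, dim=1), reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return adapted
```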
arXiv Detail & Related papers (2022-08-10T12:29:01Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift, with accuracy boosts of up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer [55.20627066525205]
We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models.
Our proposed VB approach can obtain good improvements on target devices, and consistently outperforms 13 state-of-the-art knowledge transfer algorithms.
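The generic variational Bayes machinery for a Gaussian latent variable, the building block such an approach rests on, can be sketched as follows (illustrative; `decoder_nll` is a hypothetical likelihood term):

```python
import torch

def negative_elbo(mu, log_var, decoder_nll):
    """Reparameterized sample plus KL(q(z) || N(0, I)) for a Gaussian latent."""
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization
    kl = 0.5 * torch.sum(mu ** 2 + log_var.exp() - log_var - 1.0)
    return decoder_nll(z) + kl  # negative ELBO to minimize
```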
arXiv Detail & Related papers (2021-10-16T15:54:01Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
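The moment-matching objective is commonly realised as a kernel MMD between predicted return samples and their Bellman targets; a biased-estimator sketch with a Gaussian kernel (the bandwidth is a hypothetical hyperparameter):

```python
import torch

def mmd_loss(pred, target, bandwidth=1.0):
    """Squared MMD (biased estimator) between two 1-D sample sets; matching
    under a Gaussian kernel implicitly matches all orders of moments."""
    def k(a, b):
        return torch.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * bandwidth ** 2))
    return k(pred, pred).mean() + k(target, target).mean() - 2.0 * k(pred, target).mean()
```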
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
- Covariate Distribution Aware Meta-learning [3.494950334697974]
We propose a computationally feasible meta-learning algorithm by introducing meaningful relaxations.
We demonstrate the gains of our algorithm over bootstrap-based meta-learning baselines on popular classification benchmarks.
arXiv Detail & Related papers (2020-07-06T05:00:13Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
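One simple way to pressure ensemble members toward distinct solutions is a pairwise feature-similarity penalty; the sketch below is an illustrative guess at such an objective, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def diversity_penalty(features_list):
    """Penalize cosine similarity between the features of each pair of models.

    features_list: list of (batch, dim) tensors, one per ensemble member.
    """
    penalty = torch.tensor(0.0)
    for i in range(len(features_list)):
        for j in range(i + 1, len(features_list)):
            a = F.normalize(features_list[i], dim=1)
            b = F.normalize(features_list[j], dim=1)
            penalty = penalty + (a * b).sum(dim=1).mean() ** 2
    return penalty
```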
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- Meta Learning as Bayes Risk Minimization [18.76745359031975]
We use a probabilistic framework to formalize what it means for two tasks to be related.
In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of the task-specific latent variable conditioned on the contextual dataset.
We show that our approximation of the posterior distributions converges to the maximum likelihood estimate with the same rate as the true posterior distribution.
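In symbols, the BRM-optimal predictor described above marginalises the task-specific latent $z$ under its posterior given the contextual dataset $D_c$ (notation is ours, inferred from the summary):

```latex
p(y \mid x, D_c) = \int p(y \mid x, z)\, p(z \mid D_c)\, \mathrm{d}z
```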
arXiv Detail & Related papers (2020-06-02T09:38:00Z)
- A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms [67.67377846416106]
We present a distributional approach to the theoretical analysis of reinforcement learning algorithms with constant step-sizes.
We show that value-based methods such as TD($\lambda$) and $Q$-Learning have update rules that are contractive in the space of distributions of functions.
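For concreteness, the constant step-size $Q$-Learning update whose distributional behaviour is analysed has the standard form (standard notation, not the paper's operator-level formulation):

```latex
Q_{t+1}(s, a) = Q_t(s, a) + \alpha \left( r + \gamma \max_{a'} Q_t(s', a') - Q_t(s, a) \right)
```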
arXiv Detail & Related papers (2020-03-27T05:13:29Z)