Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning
- URL: http://arxiv.org/abs/2508.21488v1
- Date: Fri, 29 Aug 2025 10:12:42 GMT
- Title: Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning
- Authors: Pascal R. van der Vaart, Neil Yorke-Smith, Matthijs T. J. Spaan
- Abstract summary: We demonstrate that there is a cold posterior effect in Bayesian deep Q-learning. We show through statistical tests that the common Gaussian likelihood assumption is frequently violated. We argue that developing more suitable likelihoods and priors should be a key focus in future Bayesian reinforcement learning research.
- Score: 12.02900930453346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Uncertainty quantification in reinforcement learning can greatly improve exploration and robustness. Approximate Bayesian approaches have recently been popularized to quantify uncertainty in model-free algorithms. However, so far the focus has been on improving the accuracy of the posterior approximation, instead of studying the accuracy of the prior and likelihood assumptions underlying the posterior. In this work, we demonstrate that there is a cold posterior effect in Bayesian deep Q-learning, where contrary to theory, performance increases when reducing the temperature of the posterior. To identify and overcome likely causes, we challenge common assumptions made on the likelihood and priors in Bayesian model-free algorithms. We empirically study prior distributions and show through statistical tests that the common Gaussian likelihood assumption is frequently violated. We argue that developing more suitable likelihoods and priors should be a key focus in future Bayesian reinforcement learning research and we offer simple, implementable solutions for better priors in deep Q-learning that lead to more performant Bayesian algorithms.
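The two phenomena highlighted in the abstract can be made concrete. With a posterior temperature $T$, one common convention tempers the likelihood, $p_T(\theta \mid D) \propto p(D \mid \theta)^{1/T}\, p(\theta)$ (another tempers the full posterior), and the cold posterior effect is the empirical finding that $T < 1$ outperforms the theoretically correct $T = 1$. Below is a minimal sketch, not taken from the paper, of how one might test the Gaussian likelihood assumption by applying a standard normality test to TD errors; the data, threshold, and helper function are illustrative assumptions.

```python
# Minimal sketch (not from the paper): checking the Gaussian likelihood
# assumption in deep Q-learning by running a normality test on TD errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in for TD errors r + gamma * max_a' Q(s', a') - Q(s, a) collected
# from a replay buffer; a skewed sample is faked here to show what a
# violated Gaussian assumption looks like.
td_errors = rng.gamma(shape=2.0, scale=1.0, size=500) - 2.0

# Shapiro-Wilk tests H0: "the sample was drawn from a Gaussian".
stat, p_value = stats.shapiro(td_errors)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p_value:.2e}")
if p_value < 0.05:
    print("Gaussian likelihood assumption rejected at the 5% level.")

# Likelihood tempering: T < 1 (a "cold" posterior) upweights the
# likelihood relative to the prior in the log-posterior.
def tempered_log_posterior(log_lik, log_prior, T=0.5):
    return log_lik / T + log_prior
```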
Related papers
- In-Context Parametric Inference: Point or Distribution Estimators? [66.22308335324239]
Our experiments indicate that amortized point estimators generally outperform posterior inference, though the latter remains competitive in some low-dimensional problems.
arXiv Detail & Related papers (2025-02-17T10:00:24Z)
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
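For context, the generic AMP iteration for compressed sensing ($y = Ax + \varepsilon$, $A \in \mathbb{R}^{m \times n}$, sampling ratio $\delta = m/n$), in its standard textbook form rather than as stated in this paper:

```latex
x^{t+1} = \eta_t\!\left(A^{\top} z^{t} + x^{t}\right),
\qquad
z^{t} = y - A x^{t}
  + \frac{1}{\delta}\, z^{t-1}
    \left\langle \eta_{t-1}'\!\left(A^{\top} z^{t-1} + x^{t-1}\right) \right\rangle,
```

where $\eta_t$ is a coordinate-wise denoiser and $\langle\cdot\rangle$ averages over coordinates. In Bayes AMP, $\eta_t$ is the posterior-mean denoiser under the product prior, which is what the trained unrolled layers are shown to converge to.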
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
- Misclassification bounds for PAC-Bayesian sparse deep learning [0.0]
We present theoretical results on the prediction or misclassification error of a probabilistic approach utilizing Spike-and-Slab priors for sparse deep learning in classification.
We demonstrate that our results can achieve minimax optimal rates in both low and high-dimensional settings, up to a logarithmic factor.
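As background (standard notation, not copied from the paper), a spike-and-slab prior mixes a point mass at zero with a continuous slab for each weight $\theta_j$:

```latex
\theta_j \sim \gamma_j\, \mathcal{N}(0, \sigma^2) + (1 - \gamma_j)\, \delta_0,
\qquad
\gamma_j \sim \mathrm{Bernoulli}(\lambda),
```

so posterior mass concentrates on sparse subnetworks, which is the mechanism behind the minimax-optimal rates mentioned above.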
arXiv Detail & Related papers (2024-05-02T14:11:48Z)
- Time-Varying Gaussian Process Bandits with Unknown Prior [18.93478528448966]
PE-GP-UCB is capable of solving time-varying Bayesian optimisation problems without knowledge of the true prior. It relies on the fact that either the observed function values are consistent with some of the candidate priors, or priors that are inconsistent with the observations can be eliminated.
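For reference, the classical GP-UCB acquisition rule that such prior-elimination variants build on (standard form, not specific to this paper):

```latex
x_t = \operatorname*{arg\,max}_{x \in \mathcal{X}} \;
  \mu_{t-1}(x) + \beta_t^{1/2}\, \sigma_{t-1}(x),
```

where $\mu_{t-1}$ and $\sigma_{t-1}$ are the GP posterior mean and standard deviation computed under a given prior, and $\beta_t$ trades off exploration against exploitation.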
arXiv Detail & Related papers (2024-02-02T18:52:16Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error, we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines, allowing reliable black-box posterior inference.
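A minimal sketch of the general idea, assuming a sigmoid relaxation in place of the hard coverage indicator; the function and parameter names are illustrative, not the authors' code:

```python
# Illustrative sketch (not the paper's implementation): a differentiable
# calibration penalty that can be added to a neural posterior estimator's
# training loss.
import torch

def soft_coverage_penalty(ranks, alphas=(0.5, 0.8, 0.9), tau=0.05):
    """ranks: tensor of PIT values in [0, 1], i.e. posterior CDF values of
    the true parameter. Perfect calibration means P(rank <= alpha) = alpha
    for every credibility level alpha."""
    penalty = ranks.new_zeros(())
    for alpha in alphas:
        # Soft indicator 1[rank <= alpha] via a sigmoid relaxation, which
        # keeps the empirical coverage estimate differentiable.
        soft_coverage = torch.sigmoid((alpha - ranks) / tau).mean()
        penalty = penalty + (soft_coverage - alpha) ** 2
    return penalty / len(alphas)

# Usage: total_loss = nll_loss + lam * soft_coverage_penalty(ranks)
```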
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- Posterior temperature optimized Bayesian models for inverse problems in medical imaging [59.82184400837329]
We present an unsupervised Bayesian approach to inverse problems in medical imaging using mean-field variational inference with a fully tempered posterior.
We show that an optimized posterior temperature leads to improved accuracy and uncertainty estimation.
Our source code is publicly available at https://github.com/Cardio-AI/mfvi-dip-mia.
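One common way to write a tempered mean-field variational objective, stated here for context rather than quoted from the paper, scales the KL term by the posterior temperature $T$:

```latex
\mathcal{L}_T(q) = \mathbb{E}_{q(\theta)}\!\left[\log p(D \mid \theta)\right]
  - T \,\mathrm{KL}\!\left(q(\theta)\,\|\,p(\theta)\right),
```

with $T = 1$ recovering the standard ELBO; the approach above treats $T$ as a quantity to optimize rather than fix.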
arXiv Detail & Related papers (2022-02-02T12:16:33Z)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
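For binary classification, the well-known probit approximation that underlies such Laplace-based fixes damps the logit $\mu(x)$ by its predictive variance $v(x)$ (a standard identity, given here for context):

```latex
p(y = 1 \mid x, \mathcal{D}) \approx
  \sigma\!\left(\frac{\mu(x)}{\sqrt{1 + \tfrac{\pi}{8}\, v(x)}}\right),
```

so as $v(x)$ grows away from the training data, the predictive probability is pulled toward $1/2$, removing the asymptotic overconfidence of plain ReLU networks.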
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
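The marginalization in question is the Bayesian model average, which a deep ensemble of $M$ independently trained solutions $\theta_m$ approximates by a uniform average (standard formulation):

```latex
p(y \mid x, \mathcal{D}) = \int p(y \mid x, \theta)\, p(\theta \mid \mathcal{D})\, d\theta
  \;\approx\; \frac{1}{M} \sum_{m=1}^{M} p(y \mid x, \theta_m),
```

and marginalizing within basins of attraction additionally averages over a local approximate posterior around each $\theta_m$.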
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
- How Good is the Bayes Posterior in Deep Neural Networks Really? [46.66866466260469]
We cast doubt on the current understanding of Bayes posteriors in popular deep neural networks.
We demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods such as point estimates obtained from SGD.
We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments.
arXiv Detail & Related papers (2020-02-06T17:38:48Z)