Posterior Refinement Improves Sample Efficiency in Bayesian Neural
Networks
- URL: http://arxiv.org/abs/2205.10041v1
- Date: Fri, 20 May 2022 09:24:39 GMT
- Title: Posterior Refinement Improves Sample Efficiency in Bayesian Neural
Networks
- Authors: Agustinus Kristiadi and Runa Eschenhagen and Philipp Hennig
- Abstract summary: We experimentally show that the key to good MC-approximated predictive distributions is the quality of the approximate posterior itself.
We show that the resulting posterior approximation is competitive with even the gold-standard full-batch Hamiltonian Monte Carlo.
- Score: 27.11052209129402
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monte Carlo (MC) integration is the de facto method for approximating the
predictive distribution of Bayesian neural networks (BNNs). But, even with many
MC samples, Gaussian-based BNNs could still yield bad predictive performance
due to the posterior approximation's error. Meanwhile, alternatives to MC
integration tend to be more expensive or biased. In this work, we
experimentally show that the key to good MC-approximated predictive
distributions is the quality of the approximate posterior itself. However,
previous methods for obtaining accurate posterior approximations are expensive
and non-trivial to implement. We, therefore, propose to refine Gaussian
approximate posteriors with normalizing flows. When applied to last-layer BNNs,
it yields a simple \emph{post hoc} method for improving pre-existing parametric
approximations. We show that the resulting posterior approximation is
competitive with even the gold-standard full-batch Hamiltonian Monte Carlo.
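The core idea lends itself to a compact illustration: draw samples from a flow-refined Gaussian posterior over the last-layer weights and Monte Carlo-average the likelihood over them. Below is a minimal sketch of that idea, not the authors' code: the toy logistic last layer, the planar-flow refinement, the flow depth, and all variable names are assumptions made here for concreteness.

```python
import torch

torch.manual_seed(0)

# Toy "last layer": fixed features Phi and a logistic likelihood over weights w.
N, D = 200, 3
Phi = torch.randn(N, D)
y = ((Phi @ torch.tensor([1.0, -2.0, 0.5])) > 0).float()

def log_joint(w):
    """Unnormalized log posterior log p(y | w) + log p(w), batched over samples w: (S, D)."""
    logits = w @ Phi.T                                                    # (S, N)
    log_lik = -torch.nn.functional.binary_cross_entropy_with_logits(
        logits, y.expand_as(logits), reduction="none").sum(-1)
    log_prior = -0.5 * (w ** 2).sum(-1)                                   # N(0, I) prior
    return log_lik + log_prior

# Stand-in Gaussian approximation; in practice this would come from e.g. a Laplace fit.
mu, log_sigma = torch.zeros(D), torch.zeros(D)

class PlanarFlow(torch.nn.Module):
    """One planar-flow layer z -> z + u * tanh(w^T z + b) with its log |det Jacobian|."""
    def __init__(self, dim):
        super().__init__()
        self.u = torch.nn.Parameter(0.01 * torch.randn(dim))
        self.w = torch.nn.Parameter(0.01 * torch.randn(dim))
        self.b = torch.nn.Parameter(torch.zeros(1))

    def forward(self, z):
        a = torch.tanh(z @ self.w + self.b)                               # (S,)
        z_new = z + a.unsqueeze(-1) * self.u
        psi = (1.0 - a ** 2).unsqueeze(-1) * self.w                       # (S, D)
        log_det = torch.log((1.0 + psi @ self.u).abs() + 1e-8)
        return z_new, log_det

# Refine the Gaussian post hoc by maximizing the ELBO w.r.t. the flow parameters only.
flows = torch.nn.ModuleList([PlanarFlow(D) for _ in range(5)])
opt = torch.optim.Adam(flows.parameters(), lr=1e-2)
base = torch.distributions.Normal(mu, log_sigma.exp())
for step in range(500):
    z = base.rsample((64,))                                               # (64, D)
    log_q = base.log_prob(z).sum(-1)
    for f in flows:
        z, log_det = f(z)
        log_q = log_q - log_det                                           # change of variables
    loss = -(log_joint(z) - log_q).mean()                                 # negative ELBO
    opt.zero_grad(); loss.backward(); opt.step()

# MC-approximate the predictive at a test point using the refined posterior.
phi_star = torch.randn(D)
with torch.no_grad():
    z = base.sample((2000,))
    for f in flows:
        z, _ = f(z)
    p_star = torch.sigmoid(z @ phi_star).mean()
print(f"refined MC predictive p(y*=1 | x*) ~= {p_star.item():.3f}")
```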
Related papers
- Towards Practical Preferential Bayesian Optimization with Skew Gaussian
Processes [8.198195852439946]
We study preferential Bayesian optimization (BO) where reliable feedback is limited to pairwise comparisons called duels.
An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structure, is that the posterior distribution is a computationally intractable skew GP.
We develop a new method that achieves both high computational efficiency and low sample complexity, and then demonstrate its effectiveness through extensive numerical experiments.
arXiv Detail & Related papers (2023-02-03T03:02:38Z)
- Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
arXiv Detail & Related papers (2022-06-22T17:58:23Z)
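For context, a minimal sketch of the Langevin Monte Carlo Thompson Sampling idea above, assuming a linear reward model with unit observation noise; the step-size schedule, constants, and names are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 10, 500
theta_true = rng.normal(size=d)
contexts, rewards = [], []             # observed arm features and rewards
theta = np.zeros(d)                    # current Langevin iterate (warm-started each round)

def langevin_sample(theta, contexts, rewards, lam=1.0, n_steps=20):
    """A few unadjusted Langevin steps on the log posterior of a ridge-regression
    reward model; returns an approximate posterior sample."""
    if not contexts:
        return rng.normal(size=theta.shape) / np.sqrt(lam)   # prior sample
    X, y = np.array(contexts), np.array(rewards)
    eta = 0.5 / (lam + len(X))                               # crude step size; tuned in practice
    for _ in range(n_steps):
        grad = X.T @ (X @ theta - y) + lam * theta           # gradient of the negative log posterior
        theta = theta - eta * grad + np.sqrt(2.0 * eta) * rng.normal(size=theta.shape)
    return theta

regret = 0.0
for t in range(T):
    arms = rng.normal(size=(K, d))                 # candidate arm features this round
    theta = langevin_sample(theta, contexts, rewards)
    a = int(np.argmax(arms @ theta))               # act greedily w.r.t. the posterior sample
    reward = arms[a] @ theta_true + 0.1 * rng.normal()
    contexts.append(arms[a]); rewards.append(reward)
    regret += np.max(arms @ theta_true) - arms[a] @ theta_true
print(f"cumulative regret after {T} rounds: {regret:.1f}")
```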
- Dangers of Bayesian Model Averaging under Covariate Shift [45.20204749251884]
We show how a Bayesian model average can in fact be problematic under covariate shift.
We additionally show why the same issue does not affect many approximate inference procedures.
arXiv Detail & Related papers (2021-06-22T16:19:52Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Variational Laplace for Bayesian neural networks [25.055754094939527]
Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2021-02-27T14:06:29Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN), which turns the BNN into a generalized linear model (GLM).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
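A hedged sketch of the "GLM predictive" idea above: linearize the network output around the MAP weights and push a Gaussian weight posterior through the linearized model rather than the original one. The toy network, the diagonal posterior scales, and the function names are assumptions for illustration, not the paper's implementation.

```python
import torch

torch.manual_seed(0)
# Pretend `net` already holds MAP weights; the diagonal posterior scales below stand in
# for a Laplace covariance (their values here are arbitrary).
net = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))
params = list(net.parameters())
post_std = [0.1 * torch.ones_like(p) for p in params]

def glm_predictive(x, n_samples=500):
    """Predictive of the linearized ('GLM') model: f(x, w) ~= f_MAP(x) + J(x) (w - w_MAP)."""
    f_map = net(x).squeeze()                      # scalar output f(x, w_MAP)
    jac = torch.autograd.grad(f_map, params)      # Jacobian of the scalar output w.r.t. weights
    probs = []
    for _ in range(n_samples):
        # Sample a weight perturbation from the Gaussian posterior and apply the
        # first-order expansion instead of re-running the full network.
        delta_f = sum((j * s * torch.randn_like(j)).sum() for j, s in zip(jac, post_std))
        probs.append(torch.sigmoid(f_map + delta_f))
    return torch.stack(probs).mean()

x_test = torch.tensor([[0.5, -1.0]])
print(f"GLM predictive p(y=1 | x) ~= {glm_predictive(x_test).item():.3f}")
```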
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
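A small illustrative sketch of decoupled (pathwise) posterior sampling via Matheron's rule, assuming an RBF kernel with a random-Fourier-feature prior approximation; it is not the paper's code, and the dataset and constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=20)
Xs = np.linspace(-4, 4, 200)[:, None]            # grid on which we want the posterior sample
noise, n_feat = 0.1, 500

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

# One approximate prior function sample f(.) via random Fourier features of the RBF kernel.
W = rng.normal(size=(n_feat, 1))                 # spectral frequencies (lengthscale 1)
b = rng.uniform(0.0, 2.0 * np.pi, n_feat)
w = rng.normal(size=n_feat)                      # feature weights fix one prior function
phi = lambda Z: np.sqrt(2.0 / n_feat) * np.cos(Z @ W.T + b)
f_prior = lambda Z: phi(Z) @ w

# Matheron's rule: f | y = f_prior + K(., X) (K(X, X) + s^2 I)^{-1} (y - f_prior(X) - eps).
Kxx = rbf(X, X) + noise ** 2 * np.eye(len(X))
eps = noise * rng.normal(size=len(X))
v = np.linalg.solve(Kxx, y - f_prior(X) - eps)
f_post = f_prior(Xs) + rbf(Xs, X) @ v            # one posterior function sample on the grid
print(f_post.shape, f_post[:3])
```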
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
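To make the marginalization view above concrete, here is a toy sketch of a deep-ensemble predictive as a Monte Carlo average over independently trained members, i.e. p(y|x) ~= (1/M) sum_m p(y|x, theta_m); the data, architecture, and ensemble size are arbitrary illustrative choices.

```python
import torch

torch.manual_seed(0)
X = torch.randn(256, 2)
y = (X[:, 0] * X[:, 1] > 0).long()               # toy XOR-like classification data

def train_member(seed):
    """Train one ensemble member from an independent random initialization."""
    torch.manual_seed(seed)
    net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):
        loss = torch.nn.functional.cross_entropy(net(X), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return net

ensemble = [train_member(s) for s in range(5)]
x_test = torch.tensor([[1.0, 1.0]])
with torch.no_grad():
    # Each member stands in for one posterior "mode"; average their predictives.
    probs = torch.stack([torch.softmax(m(x_test), dim=-1) for m in ensemble]).mean(0)
print(probs)
```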
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.