Dangers of Bayesian Model Averaging under Covariate Shift
- URL: http://arxiv.org/abs/2106.11905v1
- Date: Tue, 22 Jun 2021 16:19:52 GMT
- Title: Dangers of Bayesian Model Averaging under Covariate Shift
- Authors: Pavel Izmailov, Patrick Nicholson, Sanae Lotfi, Andrew Gordon Wilson
- Abstract summary: We show how a Bayesian model average can in fact be problematic under covariate shift.
We additionally show why the same issue does not affect many approximate inference procedures.
- Score: 45.20204749251884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate Bayesian inference for neural networks is considered a robust
alternative to standard training, often providing good performance on
out-of-distribution data. However, Bayesian neural networks (BNNs) with
high-fidelity approximate inference via full-batch Hamiltonian Monte Carlo
achieve poor generalization under covariate shift, even underperforming
classical estimation. We explain this surprising result, showing how a Bayesian
model average can in fact be problematic under covariate shift, particularly in
cases where linear dependencies in the input features cause a lack of posterior
contraction. We additionally show why the same issue does not affect many
approximate inference procedures, or classical maximum a-posteriori (MAP)
training. Finally, we propose novel priors that improve the robustness of BNNs
to many sources of covariate shift.
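To make the failure mode concrete, the sketch below (an illustrative toy example, not code from the paper) uses conjugate Bayesian linear regression with a feature that is identically zero in training: the posterior over its weight never contracts away from the prior, so when covariate shift makes that feature nonzero at test time the Bayesian model average inherits prior-scale uncertainty, while the MAP solution, whose weight for that feature is zero, is unaffected.
```python
# A minimal sketch (not the paper's code) of the mechanism described above:
# a feature that is constant in training gives no posterior contraction for
# its weight, so the Bayesian model average becomes sensitive to that feature
# under covariate shift, while the MAP solution keeps the weight at zero.
import numpy as np

rng = np.random.default_rng(0)

n, d = 200, 2
X_train = np.zeros((n, d))
X_train[:, 0] = rng.normal(size=n)        # informative feature
X_train[:, 1] = 0.0                       # "dead" feature: constant in training
w_true = np.array([1.5, 0.0])
noise = 0.1
y = X_train @ w_true + noise * rng.normal(size=n)

# Conjugate Bayesian linear regression: prior w ~ N(0, alpha^{-1} I),
# likelihood y ~ N(Xw, noise^2 I).  Posterior is Gaussian with
#   Sigma = (alpha I + X^T X / noise^2)^{-1},  mu = Sigma X^T y / noise^2.
alpha = 1.0
Sigma = np.linalg.inv(alpha * np.eye(d) + X_train.T @ X_train / noise**2)
mu = Sigma @ X_train.T @ y / noise**2     # mu is also the MAP estimate here

print("posterior std of weights:", np.sqrt(np.diag(Sigma)))
# Feature 0: tiny std (posterior contracted).  Feature 1: std ~ prior (no contraction).

# Covariate shift: at test time the dead feature takes a large value.
x_shift = np.array([1.0, 5.0])

# MAP prediction ignores the dead feature because its MAP weight is ~0.
print("MAP prediction:", x_shift @ mu)

# Bayesian model average: predictions spread widely across posterior samples.
w_samples = rng.multivariate_normal(mu, Sigma, size=1000)
preds = w_samples @ x_shift
print("BMA prediction: mean %.2f, std %.2f" % (preds.mean(), preds.std()))
```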
Related papers
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
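A minimal sketch of training through a Bayesian last layer is given below. It uses the standard closed-form Gaussian marginal likelihood of a linear last layer over learned features, not the specific reformulation proposed in that paper; the architecture, prior variance, and data are illustrative assumptions.
```python
# A minimal sketch (standard textbook form, not the paper's reformulation) of
# maximising the Gaussian log-marginal likelihood of a Bayesian last layer by
# backpropagation through the feature network.
import torch

torch.manual_seed(0)

features = torch.nn.Sequential(           # deterministic feature extractor phi(x)
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 16), torch.nn.Tanh(),
)
log_noise = torch.nn.Parameter(torch.tensor(0.0))   # observation noise (log std)
prior_var = 1.0                                      # prior variance of last-layer weights

X = torch.linspace(-3, 3, 100).unsqueeze(-1)
y = torch.sin(X).squeeze(-1) + 0.1 * torch.randn(100)

opt = torch.optim.Adam(list(features.parameters()) + [log_noise], lr=1e-2)
for step in range(500):
    Phi = features(X)                                # (N, D) features
    # Marginal likelihood with the last-layer weights integrated out:
    # y ~ N(0, prior_var * Phi Phi^T + sigma^2 I).
    K = prior_var * Phi @ Phi.T + torch.exp(2 * log_noise) * torch.eye(len(X))
    dist = torch.distributions.MultivariateNormal(torch.zeros(len(X)), covariance_matrix=K)
    loss = -dist.log_prob(y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```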
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way to perform approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
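For context, the sketch below shows the standard Monte Carlo Dropout predictive that this line of work starts from: dropout is kept active at test time and predictions are averaged over sampled masks. GFlowOut's learned posterior over masks is not reproduced here, and the architecture is an illustrative assumption.
```python
# A minimal sketch of the standard Monte Carlo Dropout predictive (not the
# GFlowOut method itself): keep dropout stochastic at test time and average
# the predictive distribution over sampled masks.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(10, 64), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.2),
    torch.nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()                  # keep dropout layers stochastic at test time
    with torch.no_grad():
        probs = torch.stack([
            torch.softmax(model(x), dim=-1) for _ in range(n_samples)
        ])
    return probs.mean(0), probs.std(0)   # predictive mean and per-class spread

x = torch.randn(5, 10)
mean, spread = mc_dropout_predict(model, x)
```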
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
- Tackling covariate shift with node-based Bayesian neural networks [26.64657196802115]
Node-based BNNs induce uncertainty by multiplying each hidden node with latent random variables, while learning a point-estimate of the weights.
In this paper, we interpret these latent noise variables as implicit representations of simple and domain-agnostic data perturbations during training.
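A minimal sketch of the node-based construction described above, assuming per-node Gaussian multiplicative noise with learned mean and scale; the layer shapes and parameterisation are illustrative, not the paper's implementation.
```python
# A minimal sketch (an illustrative assumption, not the paper's code) of a
# node-based layer: deterministic (point-estimated) weights, with each hidden
# node multiplied by a latent Gaussian variable whose mean and scale are learned.
import torch

class NodeNoisyLinear(torch.nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)         # point-estimated weights
        self.z_mean = torch.nn.Parameter(torch.ones(d_out))
        self.z_logstd = torch.nn.Parameter(torch.full((d_out,), -2.0))

    def forward(self, x):
        h = torch.relu(self.linear(x))
        z = self.z_mean + torch.exp(self.z_logstd) * torch.randn_like(h)  # per-node latent
        return h * z

layer = NodeNoisyLinear(10, 32)
x = torch.randn(4, 10)
samples = torch.stack([layer(x) for _ in range(8)])   # outputs vary only through z
```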
arXiv Detail & Related papers (2022-06-06T08:56:19Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Bayesian neural networks with Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
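A minimal sketch of the GLM-predictive idea under the linearised-Laplace view: with a Gaussian posterior around the MAP estimate, predicting through the model linearised at the MAP gives a closed-form Gaussian predictive. The toy model, finite-difference Jacobian, and posterior covariance below are assumptions for illustration, not the paper's implementation.
```python
# A minimal sketch of a "GLM predictive": linearise the model around the MAP
# parameters, so a Gaussian (Laplace) posterior over the parameters induces a
# Gaussian predictive in closed form.  Observation noise is omitted here.
import numpy as np

def f(x, theta):
    # toy nonlinear "network": one tanh unit with two parameters
    return theta[1] * np.tanh(theta[0] * x)

theta_map = np.array([1.2, 0.7])        # MAP parameters (assumed given)
Sigma = np.diag([0.05, 0.05])           # Laplace posterior covariance (assumed given)

def jacobian(x, theta, eps=1e-6):
    # finite-difference Jacobian of f with respect to the parameters at theta
    J = np.zeros_like(theta)
    for i in range(len(theta)):
        d = np.zeros_like(theta); d[i] = eps
        J[i] = (f(x, theta + d) - f(x, theta - d)) / (2 * eps)
    return J

x = 0.8
J = jacobian(x, theta_map)
glm_mean = f(x, theta_map)              # linearised predictive mean = MAP prediction
glm_var = J @ Sigma @ J                 # predictive variance of the linearised model
print(glm_mean, glm_var)
```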
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Investigating maximum likelihood based training of infinite mixtures for uncertainty quantification [16.30200782698554]
We investigate the effect of training an infinite mixture distribution with the maximum likelihood method instead of variational inference.
We find that the proposed objective leads to adversarial networks with an increased predictive variance.
arXiv Detail & Related papers (2020-08-07T14:55:53Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
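A minimal sketch of the ensemble-as-approximate-marginalisation view: the predictive distribution is the average of the predictive distributions of several independently initialised (and, in practice, independently trained) networks. The architecture and ensemble size are illustrative assumptions.
```python
# A minimal sketch of deep ensembles as approximate Bayesian marginalisation:
# average the softmax outputs of independently initialised networks.  In
# practice each member would also be trained independently.
import torch

def make_net():
    return torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(),
                               torch.nn.Linear(64, 3))

ensemble = [make_net() for _ in range(5)]

def ensemble_predict(members, x):
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    return probs.mean(0)                 # approximate Bayesian model average

x = torch.randn(4, 10)
p = ensemble_predict(ensemble, x)
```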
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.