Cold Posteriors and Aleatoric Uncertainty
- URL: http://arxiv.org/abs/2008.00029v1
- Date: Fri, 31 Jul 2020 18:37:31 GMT
- Title: Cold Posteriors and Aleatoric Uncertainty
- Authors: Ben Adlam, Jasper Snoek, and Samuel L. Smith
- Abstract summary: Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set.
We argue that commonly used priors can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets.
- Score: 32.341379426923105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work has observed that one can outperform exact inference in Bayesian
neural networks by tuning the "temperature" of the posterior on a validation
set (the "cold posterior" effect). To help interpret this phenomenon, we argue
that commonly used priors in Bayesian neural networks can significantly
overestimate the aleatoric uncertainty in the labels on many classification
datasets. This problem is particularly pronounced in academic benchmarks like
MNIST or CIFAR, for which the quality of the labels is high. For the special
case of Gaussian process regression, any positive temperature corresponds to a
valid posterior under a modified prior, and tuning this temperature is directly
analogous to empirical Bayes. On classification tasks, there is no direct
equivalence between modifying the prior and tuning the temperature; however,
reducing the temperature can lead to models which better reflect our belief
that one gains little information by relabeling existing examples in the
training set. Therefore, although cold posteriors do not always correspond to
an exact inference procedure, we believe they may often better reflect our
true prior beliefs.
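As a concrete rendering of the Gaussian process claim above: tempering a Gaussian likelihood and prior only rescales their variances. The identity below is a standard computation consistent with the abstract (the notation is mine, not quoted from the paper):
$$
\big[\mathcal{N}(\mathbf{y} \mid \mathbf{f}, \sigma^2 I)\,\mathcal{N}(\mathbf{f} \mid \mathbf{0}, K)\big]^{1/T} \;\propto\; \mathcal{N}(\mathbf{y} \mid \mathbf{f}, T\sigma^2 I)\,\mathcal{N}(\mathbf{f} \mid \mathbf{0}, TK),
$$
so for any $T > 0$ the tempered posterior over $\mathbf{f}$ is the exact posterior under the rescaled prior covariance $TK$ and noise variance $T\sigma^2$, and selecting $T$ on held-out data is empirical Bayes over this common scale.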
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
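To ground what "unrolling AMP" means in the entry above, here is a minimal, non-learned AMP iteration for sparse recovery; an unrolled network replaces the fixed soft-threshold denoiser with a learned denoiser per layer. This is a sketch under my own assumptions (the threshold rule and roughly normalized columns of A), not code from the paper:
```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding denoiser."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def amp_sparse_recovery(A, y, n_iters=30, alpha=1.0):
    """Plain AMP for y = A x + noise with sparse x (no learned layers)."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iters):
        r = x + A.T @ z                       # effective observation of x
        tau = alpha * np.sqrt(np.mean(z**2))  # threshold ~ noise level of r
        x = soft_threshold(r, tau)
        # Onsager correction: (1/delta) * mean(denoiser derivative) * z,
        # which for soft thresholding is ||x||_0 / m.
        z = y - A @ x + z * (np.count_nonzero(x) / m)
    return x
```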
- Can a Confident Prior Replace a Cold Posterior? [20.018444020989712]
We introduce a "DirClip" prior that is practical to sample and nearly matches the performance of a cold posterior.
Second, we introduce a "confidence prior" that directly approximates a cold likelihood in the limit of decreasing temperature but cannot be easily sampled.
arXiv Detail & Related papers (2024-03-02T17:28:55Z)
- The fine print on tempered posteriors [4.503508912578133]
We conduct a detailed investigation of tempered posteriors and uncover a number of crucial and previously unspecified points.
Contrary to previous works, we show through a PAC-Bayesian analysis that the temperature $\lambda$ cannot be seen as simply fixing a misspecified prior or likelihood.
arXiv Detail & Related papers (2023-09-11T08:21:42Z)
- On the Limitations of Temperature Scaling for Distributions with Overlaps [8.486166869140929]
We show that for empirical risk minimizers for a general set of distributions, the performance of temperature scaling degrades with the amount of overlap between classes.
We prove that optimizing a modified form of the empirical risk induced by the Mixup data augmentation technique can in fact lead to reasonably good calibration performance.
arXiv Detail & Related papers (2023-06-01T14:35:28Z)
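For reference, the Mixup-induced risk mentioned in the entry above is built from convex combinations of example pairs. A minimal sketch of standard Mixup (function names are mine):
```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Standard Mixup: convex-combine random pairs of inputs and labels."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)      # mixing weight ~ Beta(alpha, alpha)
    perm = rng.permutation(len(x))    # random partner for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```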
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A popular post-hoc approach to compensating for neural networks' overconfidence in wrong predictions is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
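A minimal PyTorch sketch of the per-input temperature idea in the entry above; the module name, head architecture, and softplus parameterization are illustrative assumptions, not the paper's implementation:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveTemperature(nn.Module):
    """Predicts one positive temperature per input and rescales the logits."""

    def __init__(self, feature_dim: int):
        super().__init__()
        # Small head mapping backbone features to a scalar temperature.
        self.head = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, features: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        # softplus keeps T > 0; the small floor avoids division blow-ups.
        temperature = F.softplus(self.head(features)) + 1e-3
        return logits / temperature  # broadcasts over the class dimension

# Typical use: freeze the backbone and train only the head on validation NLL:
# scaler = AdaptiveTemperature(feature_dim=2048)
# calibrated_logits = scaler(features, logits)
```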
- On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification [47.13680267076843]
We show that explicitly accounting for aleatoric uncertainty significantly improves the performance of Bayesian neural networks.
We find that a cold posterior, tempered by a power greater than one, often more honestly reflects our beliefs about aleatoric uncertainty than no tempering.
arXiv Detail & Related papers (2022-03-30T17:17:50Z)
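The likelihood tempering referred to in the entry above has a simple replication reading; this identity is standard (my notation) and matches the main abstract's point about relabeling:
$$
p_\lambda(\theta \mid \mathcal{D}) \;\propto\; p(\theta) \prod_{i=1}^{n} p(y_i \mid x_i, \theta)^{\lambda}, \qquad \lambda = 1/T,
$$
and for integer $\lambda > 1$ this is exact Bayesian inference on a dataset in which every example appears $\lambda$ times with the same label, i.e. it encodes the belief that relabeling existing training points would add almost no information.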
- Posterior temperature optimized Bayesian models for inverse problems in medical imaging [59.82184400837329]
We present an unsupervised Bayesian approach to inverse problems in medical imaging using mean-field variational inference with a fully tempered posterior.
We show that an optimized posterior temperature leads to improved accuracy and uncertainty estimation.
Our source code is publicly available at github.com/Cardio-AI/mfvi-dip-mia.
arXiv Detail & Related papers (2022-02-02T12:16:33Z)
- Posterior Temperature Optimization in Variational Inference [69.50862982117127]
Cold posteriors have been reported to perform better in practice in the context of deep learning.
In this work, we first derive the ELBO for a fully tempered posterior in mean-field variational inference.
We then use Bayesian optimization to automatically find the optimal posterior temperature.
arXiv Detail & Related papers (2021-06-11T13:01:28Z)
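One standard way to derive the tempered-posterior ELBO the entry above mentions (my notation; the paper's exact parameterization may differ): for the fully tempered target $p_T(\theta \mid \mathcal{D}) \propto \big[p(\mathcal{D} \mid \theta)\, p(\theta)\big]^{1/T}$, minimizing $\mathrm{KL}(q \,\|\, p_T)$ over a mean-field family $q$ is equivalent to maximizing
$$
\mathcal{L}_T(q) \;=\; \frac{1}{T}\,\mathbb{E}_{q(\theta)}\big[\log p(\mathcal{D} \mid \theta) + \log p(\theta)\big] \;+\; \mathbb{H}[q],
$$
after which $T$ itself can be treated as a hyperparameter to tune, e.g. by Bayesian optimization.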
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process.
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
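The NNGP correspondence the entry above relies on can be stated compactly (the standard form from the infinite-width literature, not quoted from the paper): for a fully connected network with pointwise nonlinearity $\phi$, the function-space prior of the infinitely wide network is $\mathcal{GP}(0, K^{(L)})$ with
$$
K^{(0)}(x, x') = \sigma_b^2 + \sigma_w^2 \,\frac{x^\top x'}{d}, \qquad
K^{(l+1)}(x, x') = \sigma_b^2 + \sigma_w^2\, \mathbb{E}_{f \sim \mathcal{GP}(0, K^{(l)})}\big[\phi(f(x))\,\phi(f(x'))\big].
$$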
- A statistical theory of cold posteriors in deep neural networks [32.45282187405337]
We show that BNNs for image classification use the wrong likelihood.
In particular, standard image benchmark datasets such as CIFAR-10 are carefully curated.
arXiv Detail & Related papers (2020-08-13T13:46:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and accepts no responsibility for any consequences of its use.