Can a Confident Prior Replace a Cold Posterior?
- URL: http://arxiv.org/abs/2403.01272v1
- Date: Sat, 2 Mar 2024 17:28:55 GMT
- Title: Can a Confident Prior Replace a Cold Posterior?
- Authors: Martin Marek, Brooks Paige, Pavel Izmailov
- Abstract summary: First, we introduce a "DirClip" prior that is practical to sample and nearly matches the performance of a cold posterior.
Second, we introduce a "confidence prior" that directly approximates a cold likelihood in the limit of decreasing temperature but cannot be easily sampled.
- Score: 20.018444020989712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Benchmark datasets used for image classification tend to have very low levels
of label noise. When Bayesian neural networks are trained on these datasets,
they often underfit, misrepresenting the aleatoric uncertainty of the data. A
common solution is to cool the posterior, which improves fit to the training
data but is challenging to interpret from a Bayesian perspective. We explore
whether posterior tempering can be replaced by a confidence-inducing prior
distribution. First, we introduce a "DirClip" prior that is practical to sample
and nearly matches the performance of a cold posterior. Second, we introduce a
"confidence prior" that directly approximates a cold likelihood in the limit of
decreasing temperature but cannot be easily sampled. Lastly, we provide several
general insights into confidence-inducing priors, such as when they might
diverge and how fine-tuning can mitigate numerical instability.
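To make the contrast concrete, below is a minimal sketch of a confidence-inducing, Dirichlet-style prior over the network's softmax outputs with a clipped log-density, in the spirit of the DirClip prior. The function name, concentration value, and clipping threshold are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn.functional as F

def dirclip_log_prior(logits, alpha=0.95, max_log_density=10.0):
    """Illustrative sketch of a confidence-inducing, Dirichlet-style
    prior over softmax outputs. With concentration alpha < 1, the
    Dirichlet density diverges near the simplex boundary, so the
    log-density is clipped from above to keep it bounded (the paper's
    exact DirClip construction may differ)."""
    log_probs = F.log_softmax(logits, dim=-1)
    # Unnormalized Dirichlet log-density: sum_k (alpha - 1) * log p_k.
    # For alpha < 1 this rewards confident, low-entropy predictions.
    log_density = (alpha - 1.0) * log_probs.sum(dim=-1)
    return torch.clamp(log_density, max=max_log_density)
```

In a sampling or optimization loop, this term would simply be added to the data log-likelihood in place of posterior tempering.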
Related papers
- On Cold Posteriors of Probabilistic Neural Networks: Understanding the Cold Posterior Effect and A New Way to Learn Cold Posteriors with Tight Generalization Guarantees [4.532517021515833]
In Bayesian deep learning, neural network weights are treated as random variables with prior distributions.
PAC-Bayesian analysis offers a frequentist framework to derive generalization bounds for randomized predictors.
By balancing the influence of observed data and prior regularization, temperature adjustments can address underfitting or overfitting in Bayesian models; a minimal tempering sketch appears below.
arXiv Detail & Related papers (2024-10-20T06:40:35Z)
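For readers unfamiliar with tempering, here is a minimal sketch of the temperature adjustment referred to above; the function and argument names are chosen for exposition, not taken from the paper.

```python
def tempered_log_posterior(log_likelihood, log_prior, temperature=0.5):
    """Minimal sketch of a fully tempered (cold, T < 1) posterior:
    log p_T(theta | D) is proportional to
    (log p(D | theta) + log p(theta)) / T. T < 1 sharpens the
    posterior; T > 1 flattens it. Some works temper only the
    likelihood, which shifts the data/prior balance differently."""
    return (log_likelihood + log_prior) / temperature
```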
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines, allowing reliable black-box posterior inference; an illustrative sketch of a calibration-regularized objective follows below.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
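The paper's actual relaxation is coverage-based; the sketch below is a rough, hypothetical stand-in that shows how a differentiable calibration penalty can enter a training objective.

```python
import torch.nn.functional as F

def calibration_regularized_loss(logits, targets, weight=0.1):
    """Hypothetical sketch: cross-entropy plus a crude differentiable
    calibration proxy, the squared gap between the batch's mean
    confidence and a soft accuracy (mean probability on the true
    class). The cited paper uses a coverage-based relaxation instead."""
    nll = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    mean_confidence = probs.max(dim=-1).values.mean()
    soft_accuracy = probs.gather(1, targets.unsqueeze(1)).mean()
    return nll + weight * (mean_confidence - soft_accuracy) ** 2
```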
- The fine print on tempered posteriors [4.503508912578133]
We conduct a detailed investigation of tempered posteriors and uncover a number of crucial and previously unspecified points.
Contrary to previous works, we show through a PAC-Bayesian analysis that the temperature $\lambda$ cannot be seen as simply fixing a misspecified prior or likelihood.
arXiv Detail & Related papers (2023-09-11T08:21:42Z)
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A standard post-hoc approach to compensating for overconfident neural networks is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets; a sketch of the per-input scaling appears below.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
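A minimal sketch of per-input temperature scaling, assuming a hypothetical linear head over precomputed features; the paper's exact architecture may differ.

```python
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveTemperature(nn.Module):
    """Sketch of sample-dependent temperature scaling: a small head
    maps each input's features to its own positive temperature, which
    rescales that sample's logits. Layer shape and names are
    illustrative assumptions, not the paper's exact design."""

    def __init__(self, feature_dim):
        super().__init__()
        self.temp_head = nn.Linear(feature_dim, 1)

    def forward(self, logits, features):
        # Softplus keeps each per-sample temperature strictly positive.
        temperature = F.softplus(self.temp_head(features)) + 1e-3
        return logits / temperature
```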
- On Uncertainty, Tempering, and Data Augmentation in Bayesian Classification [47.13680267076843]
We show that explicitly accounting for aleatoric uncertainty significantly improves the performance of Bayesian neural networks.
We find that a cold posterior, tempered by a power greater than one, often more honestly reflects our beliefs about aleatoric uncertainty than no tempering.
arXiv Detail & Related papers (2022-03-30T17:17:50Z)
- Posterior temperature optimized Bayesian models for inverse problems in medical imaging [59.82184400837329]
We present an unsupervised Bayesian approach to inverse problems in medical imaging using mean-field variational inference with a fully tempered posterior.
We show that an optimized posterior temperature leads to improved accuracy and uncertainty estimation; see the sketch of a tempered variational objective below.
Our source code is publicly available at github.com/Cardio-AI/mfvi-dip-mia.
arXiv Detail & Related papers (2022-02-02T12:16:33Z)
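One common way to express posterior tempering in mean-field variational inference is to rescale the KL regularizer; the sketch below follows that convention (the cited work additionally optimizes the temperature itself and uses a fully tempered posterior).

```python
def tempered_elbo(expected_log_likelihood, kl_q_to_prior, temperature=0.5):
    """Sketch of a variational objective with a rescaled KL term:
    maximizing E_q[log p(D | theta)] - T * KL(q || p) targets
    q proportional to p(theta) * p(D | theta)^(1/T), a
    likelihood-tempered posterior. A fully tempered posterior, as in
    the cited work, also raises the prior to the power 1/T."""
    return expected_log_likelihood - temperature * kl_q_to_prior
```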
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage large-scale in-context learning to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems; a sketch of the training idea follows below.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
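A minimal sketch of the prior-data fitting idea, assuming a hypothetical model interface model(x_context, y_context, x_query) and a hypothetical dataset sampler; the actual PFN API differs.

```python
import torch.nn.functional as F

def pfn_training_step(model, sample_dataset_from_prior, optimizer):
    """Sketch of prior-data fitting: draw one synthetic dataset from
    the prior, split it into a context set and query points, and train
    the network to predict the held-out labels given the context. The
    model interface and sampler are assumed helpers, not the PFN
    paper's exact API."""
    x, y = sample_dataset_from_prior()       # one dataset from the prior
    cut = x.shape[0] // 2
    pred = model(x[:cut], y[:cut], x[cut:])  # condition on context, query rest
    loss = F.cross_entropy(pred, y[cut:])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```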
- Data augmentation in Bayesian neural networks and the cold posterior effect [28.10908356388375]
We show how to find a log-likelihood for augmented datasets.
Our approach prescribes augmenting the same underlying image multiple times, both at test and train-time, and averaging either the logits or the predictive probabilities.
While there are interactions with the cold posterior effect, neither averaging the logits nor averaging the probabilities eliminates it; a sketch of both schemes appears below.
arXiv Detail & Related papers (2021-06-10T08:39:10Z)
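A minimal sketch of the two averaging schemes mentioned above, assuming a stochastic augment callable; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def averaged_prediction(model, image, augment, n_aug=8, average="probs"):
    """Sketch of the two averaging schemes: apply the same stochastic
    augmentation n_aug times to one image, then average either the
    logits or the predictive probabilities. `augment` is an assumed
    callable, e.g. a torchvision transform pipeline."""
    views = torch.stack([augment(image) for _ in range(n_aug)])
    logits = model(views)                     # shape (n_aug, n_classes)
    if average == "logits":
        return F.softmax(logits.mean(dim=0), dim=-1)
    return F.softmax(logits, dim=-1).mean(dim=0)
```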
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels; a sketch of this penalty follows below.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
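A minimal sketch of the entropy-raising idea, assuming the overconfident regions have already been identified upstream; the penalty shown (KL from the label prior) is one plausible instantiation, not necessarily the paper's exact loss.

```python
import torch.nn.functional as F

def entropy_raising_penalty(logits, label_prior):
    """Sketch: on inputs flagged as unjustifiably overconfident
    (selection assumed to happen upstream), pull predictions toward
    the marginal label distribution by penalizing KL(prior || model).
    `label_prior` is a length-n_classes probability vector."""
    log_probs = F.log_softmax(logits, dim=-1)
    target = label_prior.expand_as(log_probs)
    # F.kl_div(input=log q, target=p) computes KL(p || q).
    return F.kl_div(log_probs, target, reduction="batchmean")
```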
- A statistical theory of cold posteriors in deep neural networks [32.45282187405337]
We show that BNNs for image classification use the wrong likelihood.
In particular, standard image benchmark datasets such as CIFAR-10 are carefully curated.
arXiv Detail & Related papers (2020-08-13T13:46:58Z)
- Cold Posteriors and Aleatoric Uncertainty [32.341379426923105]
Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set.
We argue that commonly used priors can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets.
arXiv Detail & Related papers (2020-07-31T18:37:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.