Why Calibration Error is Wrong Given Model Uncertainty: Using Posterior
Predictive Checks with Deep Learning
- URL: http://arxiv.org/abs/2112.01477v1
- Date: Thu, 2 Dec 2021 18:26:30 GMT
- Title: Why Calibration Error is Wrong Given Model Uncertainty: Using Posterior
Predictive Checks with Deep Learning
- Authors: Achintya Gopal
- Abstract summary: We show how calibration error and its variants are almost always incorrect to use given model uncertainty.
We show how this mistake can lead to trust in bad models and mistrust in good models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Within the last few years, there has been a move towards using statistical
models in conjunction with neural networks with the end goal of being able to
better answer the question, "what do our models know?". From this trend,
classical metrics such as Prediction Interval Coverage Probability (PICP) and
new metrics such as calibration error have entered the general repertoire of
model evaluation in order to gain better insight into how the uncertainty of
our model compares to reality. One important component of uncertainty modeling
is model uncertainty (epistemic uncertainty), a measurement of what the model
does and does not know. However, current evaluation techniques tend to
conflate model uncertainty with aleatoric uncertainty (irreducible error),
leading to incorrect conclusions. In this paper, using posterior predictive
checks, we show how calibration error and its variants are almost always
incorrect to use given model uncertainty, and further show how this mistake can
lead to trust in bad models and mistrust in good models. Though posterior
predictive checks have often been used for in-sample evaluation of Bayesian
models, we show they still have an important place in the modern deep learning
world.
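As a concrete illustration of the evaluation the abstract advocates, the sketch below runs a posterior predictive check for a regression model with a Gaussian likelihood: replicated datasets are drawn per posterior sample, a test statistic is computed on each replicate, and the observed statistic is located within that distribution. This is a minimal sketch under assumed names and likelihood, not the authors' implementation.

```python
# Minimal posterior predictive check (PPC) sketch. Assumes an approximate
# posterior (deep ensemble, MC dropout, etc.) has already produced S draws of
# per-sample predictive means and standard deviations; names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def ppc_pvalue(mu_samples, sigma_samples, y_obs, statistic=np.std):
    """mu_samples, sigma_samples: (S, N) arrays, one row per posterior draw.
    y_obs: (N,) observed targets. Returns the posterior predictive p-value."""
    t_rep = np.empty(len(mu_samples))
    for s, (mu, sigma) in enumerate(zip(mu_samples, sigma_samples)):
        y_rep = rng.normal(mu, sigma)   # replicate the data from this draw
        t_rep[s] = statistic(y_rep)
    # Fraction of replicates at least as extreme as the observed statistic.
    return float(np.mean(t_rep >= statistic(y_obs)))
```

A p-value near 0 or 1 flags an aspect of the data the model cannot reproduce. Because each replicate conditions on a single posterior draw, model (epistemic) uncertainty stays separate from aleatoric noise, which is exactly the distinction the abstract says calibration error conflates.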
Related papers
- Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between a model's predicted confidence and its actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z)
- Parameter uncertainties for imperfect surrogate models in the low-noise regime [0.3069335774032178]
We analyze the generalization error of misspecified, near-deterministic surrogate models.
We show posterior distributions must cover every training point to avoid a divergent generalization error.
This is demonstrated on model problems before application to thousand-dimensional datasets in atomistic machine learning.
arXiv Detail & Related papers (2024-02-02T11:41:21Z)
- Proximity-Informed Calibration for Deep Neural Networks [49.330703634912915]
ProCal is a plug-and-play algorithm with a theoretical guarantee to adjust sample confidence based on proximity.
We show that ProCal is effective in addressing proximity bias and improving calibration on balanced, long-tail, and distribution-shift settings.
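The summary does not spell out ProCal's algorithm, so the sketch below only illustrates the general idea of proximity-aware calibration: score each held-out sample by its average distance to its k nearest neighbours in feature space, then fit a separate temperature per proximity bin. The neighbourhood size, bin count, and use of temperature scaling are assumptions for illustration, not the published method.

```python
# Illustrative proximity-binned temperature scaling (not ProCal itself).
import numpy as np
from scipy.optimize import minimize_scalar

def proximity(features, k=10):
    """Average distance from each sample to its k nearest neighbours."""
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # ignore self-distance
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

def fit_temperature(logits, labels):
    """Fit one temperature by minimizing the NLL on held-out data."""
    def nll(t):
        z = logits / t
        z = z - z.max(axis=1, keepdims=True)            # stable log-softmax
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(labels)), labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x

def per_bin_temperatures(logits, labels, prox, n_bins=5):
    """Quantile-bin samples by proximity and fit a temperature per bin.
    Assumes every bin receives at least one sample."""
    edges = np.quantile(prox, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, prox, side="right") - 1, 0, n_bins - 1)
    temps = [fit_temperature(logits[bins == b], labels[bins == b])
             for b in range(n_bins)]
    return edges, np.array(temps)
```

At test time a sample's proximity selects its bin, and that bin's temperature rescales its logits before the softmax.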
arXiv Detail & Related papers (2023-06-07T16:40:51Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss in order to assess downstream uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
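UAL's retrieval-specific architecture is not described here, so the snippet below shows only the generic recipe those two terms usually denote: a heteroscedastic head predicts a per-sample variance (data uncertainty), while the spread across stochastic forward passes with dropout left active measures model uncertainty. Layer sizes, dropout rate, and sample count are placeholder choices.

```python
# Generic data-vs-model uncertainty split (a sketch, not the UAL architecture):
# heteroscedastic head for aleatoric noise, MC dropout for epistemic spread.
import torch
import torch.nn as nn

class HeteroscedasticNet(nn.Module):
    def __init__(self, d_in, d_hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(),
                                  nn.Dropout(p=0.1))
        self.mean_head = nn.Linear(d_hidden, 1)
        self.logvar_head = nn.Linear(d_hidden, 1)  # predicted data noise

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

@torch.no_grad()
def predict_with_uncertainty(model, x, n_samples=50):
    model.train()                        # keep dropout active for MC sampling
    means, logvars = zip(*(model(x) for _ in range(n_samples)))
    means, logvars = torch.stack(means), torch.stack(logvars)
    data_unc = logvars.exp().mean(0)     # aleatoric: mean predicted variance
    model_unc = means.var(0)             # epistemic: variance of the means
    return means.mean(0), data_unc, model_unc
```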
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Data Uncertainty without Prediction Models [0.8223798883838329]
We propose an uncertainty estimation method, Distance-weighted Class Impurity, that makes no explicit use of prediction models.
We verified that Distance-weighted Class Impurity works effectively regardless of the prediction model used.
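The summary gives the method's name but not its formula. A plausible reading, sketched below with hypothetical details, is to weight the class labels of a query's k nearest training neighbours by inverse distance and report the Gini impurity of the resulting distribution as the uncertainty score; the neighbourhood size, weighting scheme, and impurity measure are illustrative assumptions.

```python
# Hypothetical distance-weighted class-impurity score: no prediction model is
# needed, only labelled training data. Details are illustrative assumptions.
import numpy as np

def dwci(x_query, X_train, y_train, k=15, eps=1e-8):
    d = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(d)[:k]                       # k nearest neighbours
    w = 1.0 / (d[idx] + eps)                      # closer neighbours weigh more
    p = np.array([w[y_train[idx] == c].sum() for c in np.unique(y_train)])
    p = p / p.sum()
    return 1.0 - np.sum(p ** 2)                   # weighted Gini impurity
```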
arXiv Detail & Related papers (2022-04-25T13:26:06Z)
- Uncertainty estimation under model misspecification in neural network regression [3.2622301272834524]
We study the effect of the model choice on uncertainty estimation.
We highlight that under model misspecification, aleatoric uncertainty is not properly captured.
arXiv Detail & Related papers (2021-11-23T10:18:41Z)
- Dense Uncertainty Estimation via an Ensemble-based Conditional Latent Variable Model [68.34559610536614]
We argue that the aleatoric uncertainty is an inherent attribute of the data and can only be correctly estimated with an unbiased oracle model.
We propose a new sampling and selection strategy at train time to approximate the oracle model for aleatoric uncertainty estimation.
Our results show that our solution achieves both accurate deterministic results and reliable uncertainty estimation.
arXiv Detail & Related papers (2021-11-22T08:54:10Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when used in fully-, semi-, and weakly-supervised frameworks.
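Of the two families, the ensemble-based one reduces to a few lines: disagreement among independently trained members serves as the uncertainty signal. A minimal regression-style sketch (member training elided):

```python
# Deep-ensemble uncertainty sketch: the ensemble mean is the prediction and
# the variance across members is the (epistemic) uncertainty estimate.
import torch

@torch.no_grad()
def ensemble_predict(members, x):
    preds = torch.stack([m(x) for m in members])  # (M, N, ...) member outputs
    return preds.mean(0), preds.var(0)
```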
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- Revisiting the Calibration of Modern Neural Networks [44.26439222399464]
Many instances of miscalibration in modern neural networks have been reported, suggesting a trend that newer, more accurate models produce poorly calibrated predictions.
We systematically relate model calibration and accuracy, and find that the most recent models, notably those not using convolutions, are among the best calibrated.
We also show that model size and amount of pretraining do not fully explain these differences, suggesting that architecture is a major determinant of calibration properties.
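Since the headline paper's complaint concerns how this metric is used, it helps to pin down what expected calibration error actually computes. Below is the standard equal-width binned estimator; the 15-bin default is the common convention, not something this summary specifies.

```python
# Standard expected calibration error (ECE): bin predictions by confidence and
# average the |accuracy - confidence| gap, weighted by the bin's sample share.
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    conf = probs.max(axis=1)            # top-1 confidence per sample
    pred = probs.argmax(axis=1)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (pred[mask] == labels[mask]).mean()
            ece += mask.mean() * abs(acc - conf[mask].mean())
    return ece
```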
arXiv Detail & Related papers (2021-06-15T09:24:43Z)