Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models
- URL: http://arxiv.org/abs/2508.11460v1
- Date: Fri, 15 Aug 2025 13:17:32 GMT
- Title: Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models
- Authors: Aurora Grefsrud, Nello Blaser, Trygve Buanes,
- Abstract summary: Rigorous statistical methods underpin the validity of scientific discovery, especially in the natural sciences.<n>We use the unifying framework of approximate Bayesian inference combined with empirical tests on carefully created synthetic classification datasets.<n>We check if the algorithms produce uncertainty estimates which reflect commonly desired properties, such as being well calibrated and exhibiting an increase in uncertainty for out-of-distribution data points.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rigorous statistical methods, including parameter estimation with accompanying uncertainties, underpin the validity of scientific discovery, especially in the natural sciences. With increasingly complex data models such as deep learning techniques, uncertainty quantification has become exceedingly difficult and a plethora of techniques have been proposed. In this case study, we use the unifying framework of approximate Bayesian inference combined with empirical tests on carefully created synthetic classification datasets to investigate qualitative properties of six different probabilistic machine learning algorithms for class probability and uncertainty estimation: (i) a neural network ensemble, (ii) neural network ensemble with conflictual loss, (iii) evidential deep learning, (iv) a single neural network with Monte Carlo Dropout, (v) Gaussian process classification and (vi) a Dirichlet process mixture model. We check if the algorithms produce uncertainty estimates which reflect commonly desired properties, such as being well calibrated and exhibiting an increase in uncertainty for out-of-distribution data points. Our results indicate that all algorithms are well calibrated, but none of the deep learning based algorithms provide uncertainties that consistently reflect lack of experimental evidence for out-of-distribution data points. We hope our study may serve as a clarifying example for researchers developing new methods of uncertainty estimation for scientific data-driven modeling.
Related papers
- Cooperative Bayesian and variance networks disentangle aleatoric and epistemic uncertainties [0.0]
Real-world data contains aleatoric uncertainty - irreducible noise arising from imperfect measurements or from incomplete knowledge about the data generation process.<n>Mean variance estimation (MVE) networks can learn this type of uncertainty but require ad-hoc regularization strategies to avoid overfitting.<n>We propose to train a variance network with a Bayesian neural network and demonstrate that the resulting model disentangles aleatoric and epistemic uncertainties while improving the mean estimation.
arXiv Detail & Related papers (2025-05-05T15:50:52Z) - Achieving Well-Informed Decision-Making in Drug Discovery: A Comprehensive Calibration Study using Neural Network-Based Structure-Activity Models [4.619907534483781]
computational models that predict drug-target interactions are valuable tools to accelerate the development of new therapeutic agents.
However, such models can be poorly calibrated, which results in unreliable uncertainty estimates.
We show that combining post hoc calibration method with well-performing uncertainty quantification approaches can boost model accuracy and calibration.
arXiv Detail & Related papers (2024-07-19T10:29:00Z) - Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation
for Earth System Science Applications [0.32302664881848275]
Evidential deep learning is a technique that extends parametric deep learning to higher-order distributions.
This study compares the uncertainty derived from evidential neural networks to those obtained from ensembles.
We show evidential deep learning models attaining predictive accuracy rivaling standard methods, while robustly quantifying both sources of uncertainty.
arXiv Detail & Related papers (2023-09-22T23:04:51Z) - Toward Robust Uncertainty Estimation with Random Activation Functions [3.0586855806896045]
We propose a novel approach for uncertainty quantification via ensembles, called Random Activation Functions (RAFs) Ensemble.
RAFs Ensemble outperforms state-of-the-art ensemble uncertainty quantification methods on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-02-28T13:17:56Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Investigating the Impact of Model Misspecification in Neural
Simulation-based Inference [1.933681537640272]
We study the behaviour of neural SBI algorithms in the presence of various forms of model misspecification.
We find that misspecification can have a profoundly deleterious effect on performance.
We conclude that new approaches are required to address model misspecification if neural SBI algorithms are to be relied upon to derive accurate conclusions.
arXiv Detail & Related papers (2022-09-05T09:08:16Z) - The Unreasonable Effectiveness of Deep Evidential Regression [72.30888739450343]
A new approach with uncertainty-aware regression-based neural networks (NNs) shows promise over traditional deterministic methods and typical Bayesian NNs.
We detail the theoretical shortcomings and analyze the performance on synthetic and real-world data sets, showing that Deep Evidential Regression is a quantification rather than an exact uncertainty.
arXiv Detail & Related papers (2022-05-20T10:10:32Z) - Quantifying Epistemic Uncertainty in Deep Learning [15.494774321257939]
Uncertainty quantification is at the core of the reliability and robustness of machine learning.
We provide a theoretical framework to dissect the uncertainty, especially the textitepistemic component, in deep learning.
We propose two approaches to estimate these uncertainties, one based on influence function and one on variability.
arXiv Detail & Related papers (2021-10-23T03:21:10Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimations solutions, namely ensemble based methods and generative model based methods, and explain their pros and cons while using them in fully/semi/weakly-supervised framework.
arXiv Detail & Related papers (2021-10-13T01:23:48Z) - Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z) - The Hidden Uncertainty in a Neural Networks Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
arXiv Detail & Related papers (2020-12-05T17:30:35Z) - Learning Mixtures of Low-Rank Models [89.39877968115833]
We study the problem of learning computational mixtures of low-rank models.
We develop an algorithm that is guaranteed to recover the unknown matrices with near-optimal sample.
In addition, the proposed algorithm is provably stable against random noise.
arXiv Detail & Related papers (2020-09-23T17:53:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.