Uncertainty Quantification for Machine Learning: One Size Does Not Fit All
- URL: http://arxiv.org/abs/2512.12341v1
- Date: Sat, 13 Dec 2025 14:15:04 GMT
- Title: Uncertainty Quantification for Machine Learning: One Size Does Not Fit All
- Authors: Paul Hofman, Yusuf Sale, Eyke Hüllermeier
- Abstract summary: We argue that uncertainty quantification should be tailored to the specific application. In particular, we show that, for the task of selective prediction, the scoring rule should ideally match the task loss. In an active learning setting, epistemic uncertainty based on zero-one loss is shown to consistently outperform other uncertainty measures.
- Score: 27.02918627964384
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proper quantification of predictive uncertainty is essential for the use of machine learning in safety-critical applications. Various uncertainty measures have been proposed for this purpose, typically claiming superiority over other measures. In this paper, we argue that there is no single best measure. Instead, uncertainty quantification should be tailored to the specific application. To this end, we use a flexible family of uncertainty measures that distinguishes between total, aleatoric, and epistemic uncertainty of second-order distributions. These measures can be instantiated with specific loss functions, so-called proper scoring rules, to control their characteristics, and we show that different characteristics are useful for different tasks. In particular, we show that, for the task of selective prediction, the scoring rule should ideally match the task loss. On the other hand, for out-of-distribution detection, our results confirm that mutual information, a widely used measure of epistemic uncertainty, performs best. Furthermore, in an active learning setting, epistemic uncertainty based on zero-one loss is shown to consistently outperform other uncertainty measures.
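The total/aleatoric/epistemic decomposition described in the abstract can be illustrated with a minimal sketch (not the paper's code; the ensemble and numbers are hypothetical). With the log score as the proper scoring rule, a sample-based second-order distribution (e.g. ensemble members) yields the classic decomposition: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the expected entropy, and epistemic uncertainty is their difference, i.e. mutual information.

```python
import math

def entropy(p):
    """Shannon entropy of a categorical distribution (log-loss instantiation)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def decompose(ensemble):
    """Total / aleatoric / epistemic uncertainty from an ensemble of
    predictive distributions approximating a second-order distribution.
    Under the log score, epistemic uncertainty equals mutual information."""
    m = len(ensemble)
    k = len(ensemble[0])
    mean_pred = [sum(p[c] for p in ensemble) / m for c in range(k)]
    total = entropy(mean_pred)                         # entropy of averaged prediction
    aleatoric = sum(entropy(p) for p in ensemble) / m  # expected entropy
    epistemic = total - aleatoric                      # mutual information
    return total, aleatoric, epistemic

# Two members that disagree strongly: low aleatoric, high epistemic uncertainty.
total, ale, epi = decompose([[0.99, 0.01], [0.01, 0.99]])
```

Swapping the log score for another proper scoring rule (e.g. zero-one or Brier) changes the entropy function and hence the character of the resulting measures, which is the paper's central point.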
Related papers
- Uncertainty Quantification with Proper Scoring Rules: Adjusting Measures to Prediction Tasks [19.221081896134567]
We propose measures of uncertainty based on a known decomposition of (strictly) proper scoring rules, a specific type of loss function, into a divergence and an entropy component. This leads to a flexible framework for uncertainty quantification that can be instantiated with different losses (scoring rules). We show that this flexibility is indeed advantageous. In particular, we analyze the task of selective prediction and show that the scoring rule should ideally match the task loss.
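The divergence-plus-entropy decomposition this abstract refers to can be checked numerically for the log score, where it is exact: the expected log loss of a prediction p under a true distribution q equals KL(q‖p) plus the Shannon entropy H(q). A small sketch with hypothetical distributions:

```python
import math

def expected_log_score(p, q):
    """Expected log loss of predicting p when outcomes follow q."""
    return sum(q[y] * -math.log(p[y]) for y in range(len(q)))

def kl(q, p):
    """Kullback-Leibler divergence KL(q || p)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def shannon_entropy(q):
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

# Decomposition: E_q[log loss of p] = KL(q || p) + H(q)
p = [0.7, 0.3]   # prediction
q = [0.5, 0.5]   # hypothetical true distribution
lhs = expected_log_score(p, q)
rhs = kl(q, p) + shannon_entropy(q)
```

The divergence term vanishes when p = q (by strict propriety), leaving the entropy component as the irreducible, aleatoric part.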
arXiv Detail & Related papers (2025-05-28T16:22:53Z)
- SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU). We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level. Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
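The conformal p-values mentioned here build on the standard split-conformal construction; as a rough sketch (the SConU-specific p-values differ in detail, and the scores below are made up), a p-value measures how extreme a test sample's nonconformity score is relative to a calibration set:

```python
def conformal_p_value(test_score, calibration_scores):
    """Split-conformal p-value: fraction of calibration nonconformity
    scores at least as large as the test score, with the +1 correction.
    A small p-value flags a sample as deviating from the calibration
    distribution, so it can be rejected at a chosen risk level."""
    n = len(calibration_scores)
    ge = sum(1 for s in calibration_scores if s >= test_score)
    return (ge + 1) / (n + 1)

cal = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
p = conformal_p_value(0.85, cal)  # only 0.9 is >= 0.85, so p = 2/10
```

Under exchangeability this p-value is super-uniform, which is what makes the risk-level guarantees rigorous.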
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
- Probabilistic Modeling of Disparity Uncertainty for Robust and Efficient Stereo Matching [61.73532883992135]
We propose a new uncertainty-aware stereo matching framework. We adopt Bayes risk as the measurement of uncertainty and use it to separately estimate data and model uncertainty.
arXiv Detail & Related papers (2024-12-24T23:28:20Z)
- On Information-Theoretic Measures of Predictive Uncertainty [5.8034373350518775]
Despite its significance, there is no universal agreement on how to best quantify predictive uncertainty. Our proposed framework categorizes predictive uncertainty measures according to two factors: (I) the predicting model, and (II) the approximation of the true predictive distribution. We extensively evaluate these measures across a broad set of tasks, identifying conditions under which certain measures excel.
arXiv Detail & Related papers (2024-10-14T17:52:18Z)
- Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks [17.00971204252757]
We reimplement and evaluate a comprehensive range of uncertainty estimators on ImageNet. We find that, despite recent theoretical endeavors, no existing approach provides pairs of disentangled uncertainty estimators in practice. Our results provide both practical advice for which uncertainty estimators to use for which specific task, and reveal opportunities for future research toward task-centric and disentangled uncertainties.
arXiv Detail & Related papers (2024-02-29T18:52:56Z)
- From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation [5.355925496689674]
We build a framework that allows one to generate different predictive uncertainty measures. We validate our method on image datasets by evaluating its performance in detecting out-of-distribution and misclassified instances.
arXiv Detail & Related papers (2024-02-16T14:40:22Z)
- Second-Order Uncertainty Quantification: Variance-Based Measures [2.3999111269325266]
This paper proposes a novel way to use variance-based measures to quantify uncertainty on the basis of second-order distributions in classification problems.
A distinctive feature of the measures is the ability to reason about uncertainties on a class-based level, which is useful in situations where nuanced decision-making is required.
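Class-based reasoning of the kind described above can be sketched as the per-class variance of class probabilities under a sample-based second-order distribution (this is an illustrative reading, not the paper's exact measure; the ensemble values are hypothetical):

```python
def classwise_variance(samples):
    """Per-class variance of class probabilities across samples from a
    second-order distribution (e.g. ensemble members or Dirichlet draws).
    A high variance for one class signals class-specific epistemic
    uncertainty, supporting nuanced, class-level decisions."""
    m = len(samples)
    k = len(samples[0])
    means = [sum(s[c] for s in samples) / m for c in range(k)]
    return [sum((s[c] - means[c]) ** 2 for s in samples) / m for c in range(k)]

# Members agree on class 0 but disagree on how mass splits between 1 and 2.
vars_ = classwise_variance([[0.6, 0.4, 0.0], [0.6, 0.0, 0.4]])
```

Here the model is certain about class 0's probability while classes 1 and 2 carry all the disagreement, which a single scalar uncertainty would hide.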
arXiv Detail & Related papers (2023-12-30T16:30:52Z)
- Bayesian autoencoders with uncertainty quantification: Towards trustworthy anomaly detection [78.24964622317634]
In this work, the formulation of Bayesian autoencoders (BAEs) is adopted to quantify the total anomaly uncertainty.
To evaluate the quality of uncertainty, we consider the task of classifying anomalies with the additional option of rejecting predictions of high uncertainty.
Our experiments demonstrate the effectiveness of the BAE and total anomaly uncertainty on a set of benchmark datasets and two real datasets for manufacturing.
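The reject option used in this evaluation amounts to selective prediction: abstain whenever uncertainty exceeds a threshold, otherwise classify. A minimal sketch (the decision rule and numbers are hypothetical, not the BAE pipeline):

```python
def predict_with_rejection(prob_anomaly, uncertainty, threshold):
    """Selective prediction: abstain when uncertainty exceeds the
    threshold, otherwise classify by the anomaly probability."""
    if uncertainty > threshold:
        return "reject"
    return "anomaly" if prob_anomaly >= 0.5 else "normal"

# (probability, uncertainty) pairs; the high-uncertainty case is rejected.
decisions = [predict_with_rejection(p, u, threshold=0.5)
             for p, u in [(0.9, 0.1), (0.55, 0.8), (0.2, 0.2)]]
```

Sweeping the threshold traces out an accuracy-rejection curve, which is the standard way to evaluate uncertainty quality in this setting.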
arXiv Detail & Related papers (2022-02-25T12:20:04Z)
- A Cautionary Tale of Decorrelating Theory Uncertainties [0.5076419064097732]
We will discuss techniques to train machine learning classifiers that are independent of a given feature.
We will examine theory uncertainties, which typically do not have a statistical origin.
We will provide explicit examples of two-point (fragmentation modeling) and continuous (higher-order corrections) uncertainties where decorrelating significantly reduces the apparent uncertainty.
arXiv Detail & Related papers (2021-09-16T18:00:01Z)
- Learning Uncertainty For Safety-Oriented Semantic Segmentation In Autonomous Driving [77.39239190539871]
We show how uncertainty estimation can be leveraged to enable safety critical image segmentation in autonomous driving.
We introduce a new uncertainty measure based on disagreeing predictions as measured by a dissimilarity function.
We show experimentally that our proposed approach is much less computationally intensive at inference time than competing methods.
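A disagreement-based measure of the kind described here can be sketched as the mean pairwise dissimilarity between predictions, with the dissimilarity function left pluggable (this is a generic reading; the paper's specific dissimilarity function and inputs are not reproduced here):

```python
def disagreement(predictions, dissimilarity):
    """Uncertainty as the mean pairwise dissimilarity between
    predictions (e.g. from different model heads or ensemble members)."""
    pairs = [(a, b) for i, a in enumerate(predictions)
                    for b in predictions[i + 1:]]
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)

# L1 distance as one possible dissimilarity function.
l1 = lambda a, b: sum(abs(x - y) for x, y in zip(a, b))
u = disagreement([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]], l1)
```

Because only pairwise comparisons of already-computed predictions are needed, such a measure is cheap at inference time relative to full Bayesian approximations.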
arXiv Detail & Related papers (2021-05-28T09:23:05Z)
- DEUP: Direct Epistemic Uncertainty Prediction [56.087230230128185]
Epistemic uncertainty is the part of the out-of-sample prediction error that is due to the learner's lack of knowledge.
We propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty.
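The subtraction described above reduces to a simple identity once the two quantities are estimated; in DEUP the error predictor is itself a learned model, so the sketch below only shows the final combination step (the input numbers are hypothetical):

```python
def deup_epistemic(predicted_error, aleatoric_estimate):
    """DEUP-style estimate: epistemic uncertainty as the predicted
    out-of-sample (generalization) error minus an aleatoric estimate,
    clipped at zero since uncertainty cannot be negative."""
    return max(predicted_error - aleatoric_estimate, 0.0)

epi = deup_epistemic(predicted_error=0.30, aleatoric_estimate=0.12)
```

The clipping guards against the error predictor undershooting the aleatoric estimate on easy inputs.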
arXiv Detail & Related papers (2021-02-16T23:50:35Z)
- Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that predictive uncertainty estimated by the current methods does not highly correlate with prediction error.
We propose a novel method that estimates the target labels and magnitude of the prediction error in two steps.
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.