Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics
- URL: http://arxiv.org/abs/2510.17915v1
- Date: Sun, 19 Oct 2025 23:55:36 GMT
- Title: Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics
- Authors: Hassan Gharoun, Mohammad Sadegh Khorshidi, Kasra Ranjbarigderi, Fang Chen, Amir H. Gandomi
- Abstract summary: This paper presents a post-hoc calibration framework to enhance calibration quality and uncertainty-aware decision-making. A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage. Experiments show that the proposed method achieves fewer confidently incorrect predictions and competitive Expected Calibration Error compared with isotonic and focal-loss baselines.
- Score: 6.9681910774977815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite extensive research on neural network calibration, existing methods typically apply global transformations that treat all predictions uniformly, overlooking the heterogeneous reliability of individual predictions. Furthermore, the relationship between improved calibration and effective uncertainty-aware decision-making remains largely unexplored. This paper presents a post-hoc calibration framework that leverages prediction reliability assessment to jointly enhance calibration quality and uncertainty-aware decision-making. The framework employs proximity-based conformal prediction to stratify calibration samples into putatively correct and putatively incorrect groups based on semantic similarity in feature space. A dual calibration strategy is then applied: standard isotonic regression calibrates confidence for putatively correct predictions, while underconfidence-regularized isotonic regression reduces confidence toward uniform distributions for putatively incorrect predictions, facilitating their identification for further investigation. A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage. Experiments on CIFAR-10 and CIFAR-100 with BiT and CoAtNet backbones show that the proposed method achieves fewer confidently incorrect predictions and competitive Expected Calibration Error compared with isotonic and focal-loss baselines. This work bridges calibration and uncertainty quantification through instance-level adaptivity, offering a practical post-hoc solution that requires no model retraining while improving both probability alignment and uncertainty-aware decision-making.
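The dual calibration strategy lends itself to a compact illustration. The sketch below is not the authors' implementation: the proximity-based conformal stratification is simplified to a k-nearest-neighbour label-agreement rule, and the function names (`stratify_by_proximity`, `fit_dual_calibrators`), the shrinkage weight `lam`, and the agreement threshold `alpha` are illustrative assumptions rather than values from the paper.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.neighbors import NearestNeighbors

def stratify_by_proximity(feats_cal, feats_train, labels_train, preds_cal,
                          k=10, alpha=0.1):
    """Split calibration samples into putatively correct / incorrect groups
    using a kNN label-agreement score in feature space (a simplified stand-in
    for the paper's proximity-based conformal prediction)."""
    nn = NearestNeighbors(n_neighbors=k).fit(feats_train)
    _, idx = nn.kneighbors(feats_cal)
    # fraction of nearest training neighbours whose label matches the prediction
    agree = (labels_train[idx] == preds_cal[:, None]).mean(axis=1)
    # neighbourhoods that strongly support the prediction -> putatively correct
    return agree >= 1.0 - alpha

def fit_dual_calibrators(conf_cal, correct_cal, putatively_correct,
                         lam=0.5, n_classes=10):
    """Standard isotonic regression on the putatively-correct group; a
    regularized mapping that shrinks confidence toward uniform (1/n_classes)
    on the putatively-incorrect group."""
    iso_ok = IsotonicRegression(out_of_bounds="clip").fit(
        conf_cal[putatively_correct], correct_cal[putatively_correct])
    iso_bad = IsotonicRegression(out_of_bounds="clip").fit(
        conf_cal[~putatively_correct], correct_cal[~putatively_correct])
    uniform = 1.0 / n_classes

    def calibrate(conf, is_putatively_correct):
        if is_putatively_correct:
            return float(iso_ok.predict([conf])[0])
        # underconfidence regularization: pull the mapped confidence toward uniform
        return float((1.0 - lam) * iso_bad.predict([conf])[0] + lam * uniform)

    return calibrate
```

A new test sample would be assigned to a group with the same proximity rule and its softmax confidence passed through the corresponding mapping; putatively incorrect predictions then carry deliberately deflated confidence, which makes them easier to flag for further review.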
Related papers
- Calibrating Uncertainty for Zero-Shot Adversarial CLIP [33.707647228637114]
We propose a novel adversarial fine-tuning objective for CLIP considering both prediction accuracy and uncertainty alignments. Our objective aligns these distributions holistically under perturbations, moving beyond single-logit anchoring and restoring uncertainty.
arXiv Detail & Related papers (2025-12-15T05:41:08Z) - Geometric Calibration and Neutral Zones for Uncertainty-Aware Multi-Class Classification [0.0]
This work bridges information geometry and statistical learning, offering formal guarantees for uncertainty-aware classification in applications requiring rigorous validation. Empirical validation on Adeno-Associated Virus classification demonstrates that the two-stage framework captures 72.5% of errors while deferring 34.5% of samples, reducing automated decision error rates from 16.8% to 6.9%.
arXiv Detail & Related papers (2025-11-26T01:29:49Z) - CLUE: Neural Networks Calibration via Learning Uncertainty-Error alignment [7.702016079410588]
We introduce CLUE (Calibration via Learning Uncertainty-Error Alignment), a novel approach that aligns predicted uncertainty with observed error during training. We show that CLUE achieves superior calibration quality and competitive predictive performance with respect to state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-28T19:23:47Z) - Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification. Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data. We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z) - Calibrated Probabilistic Forecasts for Arbitrary Sequences [58.54729945445505]
Real-world data streams can change unpredictably due to distribution shifts, feedback loops and adversarial actors. We present a forecasting framework ensuring valid uncertainty estimates regardless of how data evolves.
arXiv Detail & Related papers (2024-09-27T21:46:42Z) - A Confidence Interval for the $\ell_2$ Expected Calibration Error [35.88784957918326]
We develop confidence intervals for the $\ell_2$ Expected Calibration Error (ECE). We consider top-1-to-$k$ calibration, which includes both the popular notion of confidence calibration as well as full calibration. For a debiased estimator of the ECE, we show asymptotic normality, but with different convergence rates and variances for calibrated and miscalibrated models. (A minimal plug-in estimator of the binned $\ell_2$ ECE is sketched after this list.)
arXiv Detail & Related papers (2024-08-16T20:00:08Z) - Towards Certification of Uncertainty Calibration under Adversarial Attacks [96.48317453951418]
We show that attacks can significantly harm calibration, and thus propose certified calibration as worst-case bounds on calibration under adversarial perturbations. We propose novel calibration attacks and demonstrate how they can improve model calibration through adversarial calibration training.
arXiv Detail & Related papers (2024-05-22T18:52:09Z) - Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z) - Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z) - Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z) - CRUDE: Calibrating Regression Uncertainty Distributions Empirically [4.552831400384914]
Calibrated uncertainty estimates in machine learning are crucial to many fields such as autonomous vehicles, medicine, and weather and climate forecasting.
We present a calibration method for regression settings that does not assume a particular uncertainty distribution over the error: Calibrating Regression Uncertainty Distributions Empirically (CRUDE).
CRUDE demonstrates consistently sharper, better calibrated, and more accurate uncertainty estimates than state-of-the-art techniques.
arXiv Detail & Related papers (2020-05-26T03:08:43Z)
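For reference, the entry above on "A Confidence Interval for the $\ell_2$ Expected Calibration Error" concerns inference for the binned $\ell_2$ ECE; the minimal plug-in estimator below shows the quantity being estimated. It is a generic sketch, not that paper's estimator, and does not attempt the debiasing or the confidence-interval construction.

```python
import numpy as np

def l2_ece(confidences, correct, n_bins=15):
    """Plug-in binned estimate of the ell_2 Expected Calibration Error:
    sqrt of the bin-weighted squared gaps between accuracy and mean confidence."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece_sq = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        weight = in_bin.mean()                       # fraction of samples in the bin
        gap = correct[in_bin].mean() - confidences[in_bin].mean()
        ece_sq += weight * gap ** 2
    return float(np.sqrt(ece_sq))

# toy usage: top-1 confidences and whether the top-1 prediction was correct
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=1000)
acc = rng.random(1000) < conf   # a roughly calibrated synthetic model
print(l2_ece(conf, acc))
```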