Post-hoc Uncertainty Calibration for Domain Drift Scenarios
- URL: http://arxiv.org/abs/2012.10988v1
- Date: Sun, 20 Dec 2020 18:21:13 GMT
- Title: Post-hoc Uncertainty Calibration for Domain Drift Scenarios
- Authors: Christian Tomani, Sebastian Gruber, Muhammed Ebrar Erdem, Daniel
Cremers, Florian Buettner
- Abstract summary: We show that existing post-hoc calibration methods yield highly over-confident predictions under domain shift.
We introduce a simple strategy where perturbations are applied to samples in the validation set before performing the post-hoc calibration step.
- Score: 46.88826364244423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of uncertainty calibration. While standard deep neural
networks typically yield uncalibrated predictions, calibrated confidence scores
that are representative of the true likelihood of a prediction can be achieved
using post-hoc calibration methods. However, to date the focus of these
approaches has been on in-domain calibration. Our contribution is two-fold.
First, we show that existing post-hoc calibration methods yield highly
over-confident predictions under domain shift. Second, we introduce a simple
strategy where perturbations are applied to samples in the validation set
before performing the post-hoc calibration step. In extensive experiments, we
demonstrate that this perturbation step results in substantially better
calibration under domain shift on a wide range of architectures and modelling
tasks.
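The abstract describes the perturbation strategy only at a high level. Below is a minimal sketch of the idea, assuming Gaussian input noise as the perturbation and temperature scaling as the post-hoc calibrator; the noise scale, the PyTorch interface, and the choice of calibrator are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def calibrate_on_perturbed_validation(model, val_loader, sigma=0.1, device="cpu"):
    """Fit a temperature on a perturbed copy of the validation set.

    Sketch only: Gaussian noise with std `sigma` stands in for the perturbation,
    and temperature scaling stands in for the post-hoc calibration step.
    """
    model.eval()
    all_logits, all_labels = [], []
    with torch.no_grad():
        for x, y in val_loader:
            x = x.to(device)
            x = x + sigma * torch.randn_like(x)   # perturb validation samples first
            all_logits.append(model(x))
            all_labels.append(y.to(device))
    logits = torch.cat(all_logits)
    labels = torch.cat(all_labels)

    # Standard temperature scaling: minimise the NLL of logits / T on the perturbed set.
    log_t = torch.zeros(1, device=device, requires_grad=True)   # T = exp(log_t) > 0
    optimizer = torch.optim.LBFGS([log_t], lr=0.1, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()   # at test time, use softmax(logits / T)
```

The returned temperature is applied to test-time logits exactly as in standard temperature scaling; the only change is that it was fitted on perturbed rather than clean validation data, which is the step the paper reports as improving calibration under domain shift.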
Related papers
- Feature Clipping for Uncertainty Calibration [24.465567005078135]
Modern deep neural networks (DNNs) often suffer from overconfidence, leading to miscalibration.
We propose a novel post-hoc calibration method called feature clipping (FC) to address this issue.
FC involves clipping feature values to a specified threshold, effectively increasing entropy in samples with high calibration error (see the sketch after this list).
arXiv Detail & Related papers (2024-10-16T06:44:35Z)
- Towards Certification of Uncertainty Calibration under Adversarial Attacks [96.48317453951418]
We show that attacks can significantly harm calibration, and thus propose certified calibration as worst-case bounds on calibration under adversarial perturbations.
We propose novel calibration attacks and demonstrate how they can improve model calibration through adversarial calibration training.
arXiv Detail & Related papers (2024-05-22T18:52:09Z)
- Multiclass Confidence and Localization Calibration for Object Detection [4.119048608751183]
Deep neural networks (DNNs) tend to make overconfident predictions, rendering them poorly calibrated.
We propose a new train-time technique for calibrating modern object detection methods.
arXiv Detail & Related papers (2023-06-14T06:14:16Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all affect calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- On Calibration of Scene-Text Recognition Models [16.181357648680365]
We analyze several recent STR methods and show that they are consistently overconfident.
We demonstrate that for attention-based decoders, calibration of individual character predictions increases word-level calibration error.
arXiv Detail & Related papers (2020-12-23T13:25:25Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
- Calibration of Pre-trained Transformers [55.57083429195445]
We focus on BERT and RoBERTa in this work, and analyze their calibration across three tasks: natural language inference, paraphrase detection, and commonsense reasoning.
We show that: (1) when used out-of-the-box, pre-trained models are calibrated in-domain, and compared to baselines, their calibration error out-of-domain can be as much as 3.5x lower; (2) temperature scaling is effective at further reducing calibration error in-domain, and using label smoothing to deliberately increase empirical uncertainty helps calibrate posteriors out-of-domain.
arXiv Detail & Related papers (2020-03-17T18:58:44Z)
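As referenced in the feature clipping entry above, that mechanism is concrete enough to sketch. The snippet below is a minimal illustration under assumed structure: the network splits into a feature extractor and a linear classification head, and features are clipped symmetrically at a fixed threshold; the split, the threshold value, and the symmetric range are illustrative assumptions, not details taken from that paper.

```python
import torch

@torch.no_grad()
def feature_clipped_logits(feature_extractor, classifier_head, x, clip=1.0):
    """Post-hoc feature clipping sketch: cap feature magnitudes before the final layer.

    Capping large activations flattens the resulting logits and therefore raises
    predictive entropy, which is the effect the feature clipping summary describes.
    The two-stage model split and the symmetric threshold `clip` are assumptions.
    """
    feats = feature_extractor(x)                        # penultimate-layer features
    feats = torch.clamp(feats, min=-clip, max=clip)     # clip to the chosen threshold
    return classifier_head(feats)                       # logits from clipped features
```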