On Temperature Scaling and Conformal Prediction of Deep Classifiers
- URL: http://arxiv.org/abs/2402.05806v2
- Date: Thu, 4 Jul 2024 08:59:21 GMT
- Title: On Temperature Scaling and Conformal Prediction of Deep Classifiers
- Authors: Lahav Dabah, Tom Tirer,
- Abstract summary: Two popular approaches for that aim are: 1): modifies the classifier's softmax values such that the maximal value better estimates the correctness probability; and 2) Conformal Prediction (CP): produces a prediction set of candidate labels that contains the true label with a user-specified probability.
In practice, both types of indications are desirable, yet, so far the interplay between them has not been investigated.
- Score: 9.975341265604577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many classification applications, the prediction of a deep neural network (DNN) based classifier needs to be accompanied by some confidence indication. Two popular approaches for that aim are: 1) Calibration: modifies the classifier's softmax values such that the maximal value better estimates the correctness probability; and 2) Conformal Prediction (CP): produces a prediction set of candidate labels that contains the true label with a user-specified probability, guaranteeing marginal coverage, rather than, e.g., per class coverage. In practice, both types of indications are desirable, yet, so far the interplay between them has not been investigated. We start this paper with an extensive empirical study of the effect of the popular Temperature Scaling (TS) calibration on prominent CP methods and reveal that while it improves the class-conditional coverage of adaptive CP methods, surprisingly, it negatively affects their prediction set sizes. Subsequently, we explore the effect of TS beyond its calibration application and offer simple guidelines for practitioners to trade prediction set size and conditional coverage of adaptive CP methods while effectively combining them with calibration. Finally, we present a theoretical analysis of the effect of TS on the prediction set sizes, revealing several mathematical properties of the procedure, according to which we provide reasoning for this unintuitive phenomenon.
Related papers
- Domain-adaptive and Subgroup-specific Cascaded Temperature Regression
for Out-of-distribution Calibration [16.930766717110053]
We propose a novel meta-set-based cascaded temperature regression method for post-hoc calibration.
We partition each meta-set into subgroups based on predicted category and confidence level, capturing diverse uncertainties.
A regression network is then trained to derive category-specific and confidence-level-specific scaling, achieving calibration across meta-sets.
arXiv Detail & Related papers (2024-02-14T14:35:57Z) - Delving into temperature scaling for adaptive conformal prediction [10.340903334800787]
Conformal prediction, as an emerging uncertainty qualification technique, constructs prediction sets that are guaranteed to contain the true label with pre-defined probability.
We show that current confidence calibration methods (e.g., temperature scaling) normally lead to larger prediction sets in adaptive conformal prediction.
We propose $Conformal$ $Temperature$ $Scaling$ (ConfTS), a variant of temperature scaling that aims to improve the efficiency of adaptive conformal prediction.
arXiv Detail & Related papers (2024-02-06T19:27:48Z) - RR-CP: Reliable-Region-Based Conformal Prediction for Trustworthy
Medical Image Classification [24.52922162675259]
Conformal prediction (CP) generates a set of predictions for a given test sample.
The size of the set indicates how certain the predictions are.
We propose a new method called Reliable-Region-Based Conformal Prediction (RR-CP)
arXiv Detail & Related papers (2023-09-09T11:14:04Z) - Model Calibration in Dense Classification with Adaptive Label
Perturbation [44.62722402349157]
Existing dense binary classification models are prone to being over-confident.
We propose Adaptive Label Perturbation (ASLP) which learns a unique label perturbation level for each training image.
ASLP can significantly improve calibration degrees of dense binary classification models on both in-distribution and out-of-distribution data.
arXiv Detail & Related papers (2023-07-25T14:40:11Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Improving Uncertainty Quantification of Deep Classifiers via
Neighborhood Conformal Prediction: Novel Algorithm and Theoretical Analysis [30.0231328500976]
Conformal prediction (CP) is a principled framework for uncertainty quantification of deep models.
This paper proposes a novel algorithm referred to as Neighborhood Conformal Prediction (NCP) to improve the efficiency of uncertainty quantification.
We show that NCP leads to significant reduction in prediction set size over prior CP methods.
arXiv Detail & Related papers (2023-03-19T15:56:50Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
Post-hoc approach to compensate for neural networks being wrong is to perform temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z) - T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $ell$-Expected Error (ECE)
arXiv Detail & Related papers (2022-03-03T16:58:54Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Taming Overconfident Prediction on Unlabeled Data from Hindsight [50.9088560433925]
Minimizing prediction uncertainty on unlabeled data is a key factor to achieve good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
ADS significantly improves the state-of-the-art SSL methods by making it a plug-in.
arXiv Detail & Related papers (2021-12-15T15:17:02Z) - Distribution-free uncertainty quantification for classification under
label shift [105.27463615756733]
We focus on uncertainty quantification (UQ) for classification problems via two avenues.
We first argue that label shift hurts UQ, by showing degradation in coverage and calibration.
We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
arXiv Detail & Related papers (2021-03-04T20:51:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.