Multi-Class Uncertainty Calibration via Mutual Information
Maximization-based Binning
- URL: http://arxiv.org/abs/2006.13092v6
- Date: Mon, 8 Mar 2021 08:39:50 GMT
- Title: Multi-Class Uncertainty Calibration via Mutual Information
Maximization-based Binning
- Authors: Kanil Patel, William Beluch, Bin Yang, Michael Pfeiffer and Dan Zhang
- Abstract summary: Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
- Score: 8.780958735684958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc multi-class calibration is a common approach for providing
high-quality confidence estimates of deep neural network predictions. Recent
work has shown that widely used scaling methods underestimate their calibration
error, while alternative Histogram Binning (HB) methods often fail to preserve
classification accuracy. When classes have small prior probabilities, HB also
faces the issue of severe sample-inefficiency after the conversion into K
one-vs-rest class-wise calibration problems. The goal of this paper is to
resolve the identified issues of HB in order to provide calibrated confidence
estimates using only a small holdout calibration dataset for bin optimization
while preserving multi-class ranking accuracy. From an information-theoretic
perspective, we derive the I-Max concept for binning, which maximizes the
mutual information between labels and quantized logits. This concept mitigates
potential loss in ranking performance due to lossy quantization, and by
disentangling the optimization of bin edges and representatives allows
simultaneous improvement of ranking and calibration performance. To improve the
sample efficiency and estimates from a small calibration set, we propose a
shared class-wise (sCW) calibration strategy, sharing one calibrator among
similar classes (e.g., with similar class priors) so that the training sets of
their class-wise calibration problems can be merged to train the single
calibrator. The combination of sCW and I-Max binning outperforms state-of-the-art
calibration methods on various evaluation metrics across different
benchmark datasets and models, using a small calibration set (e.g., 1k samples
for ImageNet).
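For intuition, the sketch below shows a one-vs-rest histogram-binning calibrator and a shared class-wise (sCW) variant that merges all K one-vs-rest problems into a single training set. It is a minimal illustration, not the authors' implementation: it uses equal-mass quantile bin edges for simplicity, whereas I-Max instead optimizes the edges to maximize the mutual information I(y; Q(z)) between labels and the quantized logits. All function and variable names are illustrative.

```python
import numpy as np

def fit_binning_calibrator(scores, labels, n_bins=15):
    """Fit quantile bin edges and per-bin representatives (empirical accuracy).
    Stand-in for I-Max: edges here are equal-mass, not mutual-information-optimal."""
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf              # cover the whole real line
    bin_idx = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, n_bins - 1)
    reps = np.array([labels[bin_idx == b].mean() if np.any(bin_idx == b) else 0.0
                     for b in range(n_bins)])
    return edges, reps

def apply_calibrator(scores, edges, reps):
    """Map raw scores to the calibrated probability of their bin."""
    bin_idx = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, len(reps) - 1)
    return reps[bin_idx]

def fit_shared_classwise(logits, labels, n_bins=15):
    """sCW idea: merge the K one-vs-rest calibration sets and fit ONE calibrator,
    instead of K sample-starved per-class calibrators."""
    n, k = logits.shape
    scores = logits.reshape(-1)                                    # all class logits
    onehot = (labels[:, None] == np.arange(k)).astype(float).reshape(-1)
    return fit_binning_calibrator(scores, onehot, n_bins)
```

The shared calibrator is what restores sample efficiency: with a 1k-sample calibration set and 1000 ImageNet classes, each per-class one-vs-rest problem would otherwise see only about one positive example on average.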
Related papers
- Confidence Calibration of Classifiers with Many Classes [5.018156030818883]
For classification models based on neural networks, the maximum predicted class probability is often used as a confidence score.
This score often poorly reflects the probability of making a correct prediction and therefore requires a post-processing calibration step.
arXiv Detail & Related papers (2024-11-05T10:51:01Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
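For intuition only, here is a minimal kernel-style calibration discrepancy for binary classification: calibration residuals (y - p) weighted by a kernel over the predicted probabilities. This is a generic squared-kernel calibration error in the spirit of such metrics, not necessarily the exact estimator proposed in the paper; its sample estimate is a smooth function of the predictions, which is what makes it usable inside empirical risk minimization.

```python
import numpy as np

def rbf_kernel(a, b, gamma=10.0):
    """RBF kernel over scalar predicted probabilities."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def squared_kernel_calibration_error(probs, labels, gamma=10.0):
    """Plug-in (V-statistic) estimate of a squared kernel calibration error:
    residuals (y - p) weighted by a kernel over the predictions."""
    residual = labels.astype(float) - probs
    K = rbf_kernel(probs, probs, gamma)
    n = len(probs)
    return residual @ K @ residual / (n * n)
```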
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze the problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Class-wise and reduced calibration methods [0.0]
We show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, building on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration [29.780204566046503]
We propose a feature-aware binning framework called Multiple Boosting Calibration Trees (MBCT).
MBCT is non-monotonic and has the potential to improve order accuracy, owing to its learnable binning scheme and individual calibration.
Results show that our method outperforms all competing models in terms of both calibration error and order accuracy.
arXiv Detail & Related papers (2022-02-09T08:59:16Z)
- Top-label calibration [3.3504365823045044]
We study the problem of post-hoc calibration for multiclass classification, with an emphasis on histogram binning.
We find that the popular notion of confidence calibration is not sufficiently strong -- there exist predictors that are not calibrated in any meaningful way but are perfectly confidence calibrated.
We propose a closely related (but subtly different) notion, top-label calibration, that accurately captures the intuition and simplicity of confidence calibration, but addresses its drawbacks.
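One way to see the distinction, paraphrasing the standard definitions rather than quoting the paper: confidence calibration conditions only on the reported confidence, while top-label calibration also conditions on which class was predicted.

```latex
% Confidence calibration: condition only on the reported confidence c(X).
\[ \Pr\big[\, Y = \hat{y}(X) \;\big|\; c(X) \,\big] = c(X) \]
% Top-label calibration: additionally condition on the predicted class \hat{y}(X).
\[ \Pr\big[\, Y = \hat{y}(X) \;\big|\; \hat{y}(X),\, c(X) \,\big] = c(X) \]
```

A predictor can satisfy the first condition while being badly miscalibrated for individual predicted classes, which is the failure mode the paper points out.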
arXiv Detail & Related papers (2021-07-18T03:27:50Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the local calibration error (LCE) more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
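To make the importance-sampling idea concrete, a hedged sketch: reweight the labeled source-domain calibration samples by an estimated density ratio w(x) ≈ p_target(x)/p_source(x) when measuring calibration. The density-ratio estimate (e.g., from a domain classifier) is assumed given and is not part of this sketch; this illustrates the general approach, not the paper's exact algorithm.

```python
import numpy as np

def weighted_ece(confidences, correct, density_ratio, n_bins=15):
    """Expected calibration error with importance weights w(x) ~ p_target(x)/p_source(x)."""
    bin_idx = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    total_weight = density_ratio.sum()
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue
        w = density_ratio[mask]
        acc = np.average(correct[mask].astype(float), weights=w)   # weighted accuracy in bin
        conf = np.average(confidences[mask], weights=w)            # weighted mean confidence
        ece += (w.sum() / total_weight) * abs(acc - conf)
    return ece
```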
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
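As a rough sketch of such a binning-free, KS-style calibration measure (names illustrative): sort predictions by confidence and compare the normalized cumulative sums of confidence and of correctness; the maximum absolute gap plays the role of the KS statistic.

```python
import numpy as np

def ks_calibration_error(confidences, correct):
    """Binning-free, KS-style calibration error: max gap between the
    normalized cumulative sums of confidence and of correctness,
    with samples sorted by confidence."""
    order = np.argsort(confidences)
    n = len(confidences)
    cum_conf = np.cumsum(confidences[order]) / n
    cum_acc = np.cumsum(correct[order].astype(float)) / n
    return float(np.max(np.abs(cum_conf - cum_acc)))
```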
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration and evaluation tasks.
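One way such a "mix" strategy can look (a hedged sketch, not necessarily the paper's exact parametrization): a convex combination of a temperature-scaled softmax, the unscaled softmax, and a uniform distribution, with the temperature and mixture weights fit on a holdout set.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_calibrate(logits, temperature, weights):
    """Convex combination of temperature-scaled softmax, raw softmax,
    and a uniform distribution; `weights` are non-negative and sum to 1."""
    k = logits.shape[-1]
    w1, w2, w3 = weights
    return (w1 * softmax(logits, temperature)
            + w2 * softmax(logits, 1.0)
            + w3 * np.ones_like(logits) / k)
```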
arXiv Detail & Related papers (2020-03-16T17:00:35Z)