Multi-Class Uncertainty Calibration via Mutual Information
Maximization-based Binning
- URL: http://arxiv.org/abs/2006.13092v6
- Date: Mon, 8 Mar 2021 08:39:50 GMT
- Title: Multi-Class Uncertainty Calibration via Mutual Information
Maximization-based Binning
- Authors: Kanil Patel, William Beluch, Bin Yang, Michael Pfeiffer and Dan Zhang
- Abstract summary: Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
- Score: 8.780958735684958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-hoc multi-class calibration is a common approach for providing
high-quality confidence estimates of deep neural network predictions. Recent
work has shown that widely used scaling methods underestimate their calibration
error, while alternative Histogram Binning (HB) methods often fail to preserve
classification accuracy. When classes have small prior probabilities, HB also
faces the issue of severe sample-inefficiency after the conversion into K
one-vs-rest class-wise calibration problems. The goal of this paper is to
resolve the identified issues of HB in order to provide calibrated confidence
estimates using only a small holdout calibration dataset for bin optimization
while preserving multi-class ranking accuracy. From an information-theoretic
perspective, we derive the I-Max concept for binning, which maximizes the
mutual information between labels and quantized logits. This concept mitigates
potential loss in ranking performance due to lossy quantization, and by
disentangling the optimization of bin edges and representatives allows
simultaneous improvement of ranking and calibration performance. To improve the
sample efficiency and estimates from a small calibration set, we propose a
shared class-wise (sCW) calibration strategy, sharing one calibrator among
similar classes (e.g., with similar class priors) so that the training sets of
their class-wise calibration problems can be merged to train the single
calibrator. The combination of sCW and I-Max binning outperforms state-of-the-art
calibration methods on various evaluation metrics across different
benchmark datasets and models, using a small calibration set (e.g., 1k samples
for ImageNet).
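For intuition, the sketch below shows a one-vs-rest histogram-binning calibrator and a shared class-wise (sCW) variant that merges all K one-vs-rest problems into a single training set. It is a minimal illustration, not the authors' implementation: it uses equal-mass quantile bin edges for simplicity, whereas I-Max instead optimizes the edges to maximize the mutual information I(y; Q(z)) between labels and the quantized logits. All function and variable names are illustrative.

```python
import numpy as np

def fit_binning_calibrator(scores, labels, n_bins=15):
    """Fit quantile bin edges and per-bin representatives (empirical accuracy).
    Stand-in for I-Max: edges here are equal-mass, not mutual-information-optimal."""
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf              # cover the whole real line
    bin_idx = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, n_bins - 1)
    reps = np.array([labels[bin_idx == b].mean() if np.any(bin_idx == b) else 0.0
                     for b in range(n_bins)])
    return edges, reps

def apply_calibrator(scores, edges, reps):
    """Map raw scores to the calibrated probability of their bin."""
    bin_idx = np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, len(reps) - 1)
    return reps[bin_idx]

def fit_shared_classwise(logits, labels, n_bins=15):
    """sCW idea: merge the K one-vs-rest calibration sets and fit ONE calibrator,
    instead of K sample-starved per-class calibrators."""
    n, k = logits.shape
    scores = logits.reshape(-1)                                    # all class logits
    onehot = (labels[:, None] == np.arange(k)).astype(float).reshape(-1)
    return fit_binning_calibrator(scores, onehot, n_bins)
```

The shared calibrator is what restores sample efficiency: with a 1k-sample calibration set and 1000 ImageNet classes, each per-class one-vs-rest problem would otherwise see only about one positive example on average.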
Related papers
- Confidence Calibration of Classifiers with Many Classes [5.018156030818883]
For classification models based on neural networks, the maximum predicted class probability is often used as a confidence score.
This score often poorly reflects the probability of making a correct prediction and therefore requires a post-processing calibration step.
arXiv Detail & Related papers (2024-11-05T10:51:01Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
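For intuition only, here is a minimal kernel-style calibration discrepancy for binary classification: calibration residuals (y - p) weighted by a kernel over the predicted probabilities. This is a generic squared-kernel calibration error in the spirit of such metrics, not necessarily the exact estimator proposed in the paper; its sample estimate is a smooth function of the predictions, which is what makes it usable inside empirical risk minimization.

```python
import numpy as np

def rbf_kernel(a, b, gamma=10.0):
    """RBF kernel over scalar predicted probabilities."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

def squared_kernel_calibration_error(probs, labels, gamma=10.0):
    """Plug-in (V-statistic) estimate of a squared kernel calibration error:
    residuals (y - p) weighted by a kernel over the predictions."""
    residual = labels.astype(float) - probs
    K = rbf_kernel(probs, probs, gamma)
    n = len(probs)
    return residual @ K @ residual / (n * n)
```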
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze the problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Class-wise and reduced calibration methods [0.0]
We show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, building on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration [29.780204566046503]
We propose a feature-aware binning framework called Multiple Boosting Calibration Trees (MBCT).
MBCT is non-monotonic and has the potential to improve order accuracy, owing to its learnable binning scheme and individual calibration.
Results show that our method outperforms all competing models in terms of both calibration error and order accuracy.
arXiv Detail & Related papers (2022-02-09T08:59:16Z)
- Top-label calibration [3.3504365823045044]
We study the problem of post-hoc calibration for multiclass classification, with an emphasis on histogram binning.
We find that the popular notion of confidence calibration is not sufficiently strong -- there exist predictors that are not calibrated in any meaningful way but are perfectly confidence calibrated.
We propose a closely related (but subtly different) notion, top-label calibration, that accurately captures the intuition and simplicity of confidence calibration, but addresses its drawbacks.
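One way to see the distinction, paraphrasing the standard definitions rather than quoting the paper: confidence calibration conditions only on the reported confidence, while top-label calibration also conditions on which class was predicted.

```latex
% Confidence calibration: condition only on the reported confidence c(X).
\[ \Pr\big[\, Y = \hat{y}(X) \;\big|\; c(X) \,\big] = c(X) \]
% Top-label calibration: additionally condition on the predicted class \hat{y}(X).
\[ \Pr\big[\, Y = \hat{y}(X) \;\big|\; \hat{y}(X),\, c(X) \,\big] = c(X) \]
```

A predictor can satisfy the first condition while being badly miscalibrated for individual predicted classes, which is the failure mode the paper points out.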
arXiv Detail & Related papers (2021-07-18T03:27:50Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the local calibration error (LCE) more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
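To make the importance-sampling idea concrete, a hedged sketch: reweight the labeled source-domain calibration samples by an estimated density ratio w(x) ≈ p_target(x)/p_source(x) when measuring calibration. The density-ratio estimate (e.g., from a domain classifier) is assumed given and is not part of this sketch; this illustrates the general approach, not the paper's exact algorithm.

```python
import numpy as np

def weighted_ece(confidences, correct, density_ratio, n_bins=15):
    """Expected calibration error with importance weights w(x) ~ p_target(x)/p_source(x)."""
    bin_idx = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    total_weight = density_ratio.sum()
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue
        w = density_ratio[mask]
        acc = np.average(correct[mask].astype(float), weights=w)   # weighted accuracy in bin
        conf = np.average(confidences[mask], weights=w)            # weighted mean confidence
        ece += (w.sum() / total_weight) * abs(acc - conf)
    return ece
```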
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
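As a rough sketch of such a binning-free, KS-style calibration measure (names illustrative): sort predictions by confidence and compare the normalized cumulative sums of confidence and of correctness; the maximum absolute gap plays the role of the KS statistic.

```python
import numpy as np

def ks_calibration_error(confidences, correct):
    """Binning-free, KS-style calibration error: max gap between the
    normalized cumulative sums of confidence and of correctness,
    with samples sorted by confidence."""
    order = np.argsort(confidences)
    n = len(confidences)
    cum_conf = np.cumsum(confidences[order]) / n
    cum_acc = np.cumsum(correct[order].astype(float)) / n
    return float(np.max(np.abs(cum_conf - cum_acc)))
```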
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration and evaluation tasks.
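One way such a "mix" strategy can look (a hedged sketch, not necessarily the paper's exact parametrization): a convex combination of a temperature-scaled softmax, the unscaled softmax, and a uniform distribution, with the temperature and mixture weights fit on a holdout set.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_calibrate(logits, temperature, weights):
    """Convex combination of temperature-scaled softmax, raw softmax,
    and a uniform distribution; `weights` are non-negative and sum to 1."""
    k = logits.shape[-1]
    w1, w2, w3 = weights
    return (w1 * softmax(logits, temperature)
            + w2 * softmax(logits, 1.0)
            + w3 * np.ones_like(logits) / k)
```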
arXiv Detail & Related papers (2020-03-16T17:00:35Z)