Class-wise and reduced calibration methods
- URL: http://arxiv.org/abs/2210.03702v1
- Date: Fri, 7 Oct 2022 17:13:17 GMT
- Title: Class-wise and reduced calibration methods
- Authors: Michael Panchenko, Anes Benmerzoug, Miguel de Benito Delgado
- Abstract summary: We show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, building on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: For many applications of probabilistic classifiers it is important that the
predicted confidence vectors reflect true probabilities (one says that the
classifier is calibrated). It has been shown that common models fail to satisfy
this property, making reliable methods for measuring and improving calibration
important tools. Unfortunately, obtaining these is far from trivial for
problems with many classes. We propose two techniques that can be used in
tandem. First, a reduced calibration method transforms the original problem
into a simpler one. We prove for several notions of calibration that solving
the reduced problem minimizes the corresponding notion of miscalibration in the
full problem, allowing the use of non-parametric recalibration methods that
fail in higher dimensions. Second, we propose class-wise calibration methods,
based on intuition building on a phenomenon called neural collapse and the
observation that most of the accurate classifiers found in practice can be
thought of as a union of K different functions which can be recalibrated
separately, one for each class. These typically outperform their non-class-wise counterparts, especially for classifiers trained on imbalanced data
sets. Applying the two methods together results in class-wise reduced
calibration algorithms, which are powerful tools for reducing the prediction
and per-class calibration errors. We demonstrate our methods on real and
synthetic datasets and release all code as open source at
https://github.com/appliedAI-Initiative
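As a concrete illustration of the two techniques described in the abstract, below is a minimal sketch, not the implementation released in the linked repository: one natural reduction keeps only the top-label confidence, turning the K-class problem into a one-dimensional one, and a separate non-parametric recalibrator (scikit-learn's isotonic regression, chosen here purely as an example) is then fitted for each predicted class on a held-out calibration set.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression


def fit_classwise_reduced_calibrators(probs, labels):
    """Fit one 1-D recalibrator per predicted class on a held-out calibration set.

    probs:  (n, K) predicted probability vectors
    labels: (n,)   true class indices
    Returns a dict mapping predicted class -> fitted IsotonicRegression.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    preds = probs.argmax(axis=1)              # predicted class per sample
    conf = probs.max(axis=1)                  # reduction: keep only the top-label confidence
    correct = (preds == labels).astype(float)
    calibrators = {}
    for k in np.unique(preds):
        mask = preds == k                     # recalibrate each predicted class separately
        ir = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        ir.fit(conf[mask], correct[mask])
        calibrators[k] = ir
    return calibrators


def calibrated_top_label_confidence(calibrators, probs):
    """Map each sample's top-label confidence through the calibrator of its predicted class."""
    probs = np.asarray(probs, dtype=float)
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    out = conf.copy()                         # classes unseen during calibration keep raw confidence
    for k, ir in calibrators.items():
        mask = preds == k
        out[mask] = ir.predict(conf[mask])
    return out
```

On a balanced problem this behaves much like ordinary confidence recalibration; the per-class split is what targets the per-class errors the abstract mentions, particularly for classifiers trained on imbalanced data sets.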
Related papers
- Confidence Calibration of Classifiers with Many Classes [5.018156030818883]
For classification models based on neural networks, the maximum predicted class probability is often used as a confidence score.
This score is rarely a good estimate of the probability of a correct prediction and therefore requires a post-processing calibration step; a generic sketch of one such step (temperature scaling) appears after this list.
arXiv Detail & Related papers (2024-11-05T10:51:01Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted cross-entropy loss, with different weights across the branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z)
- Top-label calibration [3.3504365823045044]
We study the problem of post-hoc calibration for multiclass classification, with an emphasis on histogram binning.
We find that the popular notion of confidence calibration is not sufficiently strong -- there exist predictors that are not calibrated in any meaningful way but are perfectly confidence calibrated.
We propose a closely related (but subtly different) notion, top-label calibration, that accurately captures the intuition and simplicity of confidence calibration, but addresses its drawbacks.
arXiv Detail & Related papers (2021-07-18T03:27:50Z)
- Uncertainty Quantification and Deep Ensembles [79.4957965474334]
We show that deep ensembles do not necessarily lead to improved calibration properties.
We show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models.
This text examines the interplay between three of the simplest and most commonly used approaches for leveraging deep learning when data is scarce.
arXiv Detail & Related papers (2020-07-17T07:32:24Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test; a generic sketch of such a KS-style measure appears after this list.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
- Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning [21.08664370117846]
We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration and the evaluation tasks.
arXiv Detail & Related papers (2020-03-16T17:00:35Z)
- Better Classifier Calibration for Small Data Sets [0.0]
We show how generating more data for calibration can improve the performance of calibration algorithms.
The proposed approach adds computational cost, but since the main use case involves small data sets, this extra cost remains insignificant.
arXiv Detail & Related papers (2020-02-24T12:27:21Z)
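Regarding the post-processing step mentioned in the "Confidence Calibration of Classifiers with Many Classes" entry above: the simplest common baseline for such a step is temperature scaling. The sketch below shows that generic baseline only (it is not the cited paper's method), picking the temperature that minimizes the negative log-likelihood on a held-out set with SciPy.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def fit_temperature(logits, labels):
    """Find the temperature T minimizing negative log-likelihood on a held-out set.

    logits: (n, K) pre-softmax scores, labels: (n,) true class indices.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)

    def nll(t):
        z = logits / t
        z -= z.max(axis=1, keepdims=True)                      # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()

    res = minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded")
    return res.x


def temperature_scaled_probs(logits, t):
    """Recompute the softmax with the fitted temperature."""
    z = np.asarray(logits, dtype=float) / t
    z -= z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)
```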
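Relating to the "Calibration of Neural Networks using Splines" entry above: a binning-free, KS-style calibration error can be estimated by comparing the cumulative predicted confidence with the cumulative observed accuracy after sorting samples by confidence. The sketch below is a generic version of that idea, not the cited paper's spline-based estimator.

```python
import numpy as np


def ks_calibration_error(confidences, correct):
    """Binning-free KS-style calibration error: the largest gap between the
    cumulative predicted confidence and the cumulative observed accuracy,
    with samples sorted by confidence."""
    conf = np.asarray(confidences, dtype=float)
    acc = np.asarray(correct, dtype=float)
    order = np.argsort(conf)
    cum_conf = np.cumsum(conf[order]) / len(conf)
    cum_acc = np.cumsum(acc[order]) / len(conf)
    return float(np.max(np.abs(cum_conf - cum_acc)))
```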