A Consistent and Differentiable Lp Canonical Calibration Error Estimator
- URL: http://arxiv.org/abs/2210.07810v1
- Date: Thu, 13 Oct 2022 15:11:11 GMT
- Title: A Consistent and Differentiable Lp Canonical Calibration Error Estimator
- Authors: Teodora Popordanoska, Raphael Sayer, Matthew B. Blaschko
- Abstract summary: Deep neural networks are poorly calibrated and tend to output overconfident predictions.
We propose a low-bias, trainable calibration error estimator based on Dirichlet kernel density estimates.
Our method has a natural choice of kernel, and can be used to generate consistent estimates of other quantities.
- Score: 21.67616079217758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Calibrated probabilistic classifiers are models whose predicted probabilities
can directly be interpreted as uncertainty estimates. It has been shown
recently that deep neural networks are poorly calibrated and tend to output
overconfident predictions. As a remedy, we propose a low-bias, trainable
calibration error estimator based on Dirichlet kernel density estimates, which
asymptotically converges to the true $L_p$ calibration error. This novel
estimator enables us to tackle the strongest notion of multiclass calibration,
called canonical (or distribution) calibration, while other common calibration
methods are tractable only for top-label and marginal calibration. The
computational complexity of our estimator is $\mathcal{O}(n^2)$, the
convergence rate is $\mathcal{O}(n^{-1/2})$, and it is unbiased up to
$\mathcal{O}(n^{-2})$, achieved by a geometric series debiasing scheme. In
practice, this means that the estimator can be applied to small subsets of
data, enabling efficient estimation and mini-batch updates. The proposed method
has a natural choice of kernel, and can be used to generate consistent
estimates of other quantities based on conditional expectation, such as the
sharpness of a probabilistic classifier. Empirical results validate the
correctness of our estimator, and demonstrate its utility in canonical
calibration error estimation and calibration error regularized risk
minimization.
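The estimator described in the abstract can be illustrated with a minimal sketch. It is a simplified plug-in version under stated assumptions, not the authors' implementation: it estimates $\mathbb{E}[y \mid f(x)]$ at each prediction by a leave-one-out ratio of Dirichlet kernel sums, and averages the $L_p^p$ distance to the prediction itself. The kernel parameterisation `alpha = center / bandwidth + 1`, the default `bandwidth=0.1`, and the function names are illustrative assumptions. The geometric series debiasing scheme mentioned in the abstract is omitted.

```python
import math
import random


def canonical_ce(preds, labels, bandwidth=0.1, p=2):
    """Plug-in estimate of E[ ||E[y | f(x)] - f(x)||_p^p ] (no debiasing).

    preds  : list of probability vectors (each sums to 1), at least 2 of them
    labels : list of integer class labels
    """
    n, num_classes = len(preds), len(preds[0])

    # One Dirichlet kernel per sample, centred at its prediction
    # (assumed parameterisation: alpha = prediction / bandwidth + 1).
    alphas = [[q / bandwidth + 1.0 for q in pred] for pred in preds]
    log_norms = [
        math.lgamma(sum(a)) - sum(math.lgamma(a_k) for a_k in a) for a in alphas
    ]
    # Clamp to avoid log(0) for predictions on the simplex boundary.
    log_preds = [[math.log(max(q, 1e-12)) for q in pred] for pred in preds]

    total = 0.0
    for j in range(n):
        # Leave-one-out log kernel values: O(n) per point, O(n^2) overall.
        log_w = [
            -math.inf
            if i == j
            else log_norms[i]
            + sum((a_k - 1.0) * lp for a_k, lp in zip(alphas[i], log_preds[j]))
            for i in range(n)
        ]
        shift = max(log_w)  # subtract the max for numerical stability
        w = [math.exp(lw - shift) for lw in log_w]
        norm = sum(w)

        # Kernel-weighted average of one-hot labels approximates E[y | f(x_j)].
        cond = [0.0] * num_classes
        for i in range(n):
            cond[labels[i]] += w[i] / norm

        total += sum(abs(c - q) ** p for c, q in zip(cond, preds[j]))
    return total / n


# Synthetic check: labels drawn from the predictions are calibrated by
# construction; rotating the class columns breaks that calibration.
rng = random.Random(0)


def sample_simplex(k):
    g = [rng.gammavariate(1.0, 1.0) for _ in range(k)]
    s = sum(g)
    return [x / s for x in g]


preds = [sample_simplex(3) for _ in range(200)]
labels = [rng.choices(range(3), weights=q)[0] for q in preds]
shifted = [q[1:] + q[:1] for q in preds]

ce_calibrated = canonical_ce(preds, labels)
ce_shifted = canonical_ce(shifted, labels)
```

On this synthetic data the estimate for the rotated predictions should come out clearly larger than the estimate for the calibrated ones. The double loop makes the quadratic cost stated in the abstract explicit; a vectorised or automatic-differentiation version would be needed to use the quantity as a training regulariser.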
Related papers
- Orthogonal Causal Calibration [55.28164682911196]
We prove generic upper bounds on the calibration error of any causal parameter estimate $\theta$ with respect to any loss $\ell$.
We use our bound to analyze the convergence of two sample splitting algorithms for causal calibration.
arXiv Detail & Related papers (2024-06-04T03:35:25Z)
- Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors [23.819464242327257]
We propose a method that allows consistent estimation of all proper calibration errors and refinement terms.
We prove the relation between refinement and f-divergences, which implies information monotonicity in neural networks.
Our experiments validate the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.
arXiv Detail & Related papers (2023-12-14T01:20:08Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Class-wise and reduced calibration methods [0.0]
First, we show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, based on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection [4.630093015127541]
We show that common detection models overestimate the spatial uncertainty in comparison to the observed error.
Our experiments show that the simple Isotonic Regression recalibration method is sufficient to achieve well-calibrated uncertainty.
In contrast, if normal distributions are required for subsequent processes, our GP-Normal recalibration method yields the best results.
arXiv Detail & Related papers (2022-07-04T08:00:20Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.