QuantProb: Generalizing Probabilities along with Predictions for a Pre-trained Classifier
- URL: http://arxiv.org/abs/2304.12766v2
- Date: Sun, 5 May 2024 04:38:35 GMT
- Title: QuantProb: Generalizing Probabilities along with Predictions for a Pre-trained Classifier
- Authors: Aditya Challa, Snehanshu Saha, Soma Dhavala
- Abstract summary: We argue that deep networks are unreliable because, as neural networks are currently trained, their probabilities do not generalize across small distortions.
We propose an innovative approach to decouple the construction of quantile representations from the loss function, allowing us to compute quantile-based probabilities without disturbing the original network.
- Score: 1.8488661947561271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantification of uncertainty in predictions is a challenging problem. In classification settings, although deep-learning-based models generalize well, class probabilities often lack reliability. Calibration errors are used to quantify uncertainty, and several methods exist to minimize calibration error. We argue that, given the choice between a minimum calibration error on the original distribution that increases across distortions and a (possibly slightly higher) calibration error that is constant across distortions, we prefer the latter. We hypothesize that deep networks are unreliable because, as neural networks are currently trained, their probabilities do not generalize across small distortions. We observe that quantile-based approaches can potentially solve this problem. We propose an innovative approach to decouple the construction of quantile representations from the loss function, allowing us to compute quantile-based probabilities without disturbing the original network. We achieve this by establishing a novel duality property between quantiles and probabilities, and an ability to obtain quantile probabilities from any pre-trained classifier. While post-hoc calibration techniques successfully minimize calibration errors, they do not preserve robustness to distortions. We show that quantile probabilities (QuantProb), obtained from quantile representations, preserve the calibration errors across distortions, since quantile probabilities generalize better than the naive Softmax probabilities.
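The abstract refers to calibration error without defining it. As a rough illustration only, the sketch below computes one common variant, the binned expected calibration error (ECE): predictions are grouped by confidence, and the gap between average confidence and accuracy in each bin is averaged, weighted by bin size. The function name, the equal-width binning, and the default of 10 bins are assumptions made for this example; the paper may use a different estimator.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: sum over bins of (bin size / n) * |accuracy - mean confidence|.

    Assumption for this sketch: equal-width bins [k/n_bins, (k+1)/n_bins),
    with a confidence of exactly 1.0 placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    ece = 0.0
    for s_conf, s_hit, k in zip(conf_sum, hit_sum, count):
        if k == 0:
            continue
        # (k / n) * |s_hit / k - s_conf / k| simplifies to |s_hit - s_conf| / n
        ece += abs(s_hit - s_conf) / n
    return ece
```

In the setting the abstract describes, one would evaluate such a metric on the original test set and again on distorted copies of it, then compare how much the value changes between the two.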
Related papers
- Orthogonal Causal Calibration [55.28164682911196]
We prove generic upper bounds on the calibration error of any causal parameter estimate $\theta$ with respect to any loss $\ell$.
We use our bound to analyze the convergence of two sample splitting algorithms for causal calibration.
arXiv Detail & Related papers (2024-06-04T03:35:25Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy in two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Beyond calibration: estimating the grouping loss of modern neural networks [68.8204255655161]
Proper scoring rule theory shows that given the calibration loss, the missing piece to characterize individual errors is the grouping loss.
We show that modern neural network architectures in vision and NLP exhibit grouping loss, notably in distribution shifts settings.
arXiv Detail & Related papers (2022-10-28T07:04:20Z)
- A Consistent and Differentiable Lp Canonical Calibration Error Estimator [21.67616079217758]
Deep neural networks are poorly calibrated and tend to output overconfident predictions.
We propose a low-bias, trainable calibration error estimator based on Dirichlet kernel density estimates.
Our method has a natural choice of kernel, and can be used to generate consistent estimates of other quantities.
arXiv Detail & Related papers (2022-10-13T15:11:11Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- $f$-Cal: Calibrated aleatoric uncertainty estimation from neural networks for robot perception [9.425514903472545]
Existing approaches estimate uncertainty from neural network perception stacks by modifying network architectures, inference procedure, or loss functions.
Our key insight is that calibration is only achieved by imposing constraints across multiple examples, such as those in a mini-batch.
By enforcing the distribution of outputs of a neural network to resemble a target distribution by minimizing an $f$-divergence, we obtain significantly better-calibrated models compared to prior approaches.
arXiv Detail & Related papers (2021-09-28T17:57:58Z)
- Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification [15.94100899123465]
A model that predicts the true conditional quantiles for each input, at all quantile levels, presents a correct and efficient representation of the underlying uncertainty.
Current quantile-based methods focus on optimizing the so-called pinball loss.
We develop new quantile methods that address these shortcomings.
arXiv Detail & Related papers (2020-11-18T23:51:23Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)