Minimum-Risk Recalibration of Classifiers
- URL: http://arxiv.org/abs/2305.10886v1
- Date: Thu, 18 May 2023 11:27:02 GMT
- Title: Minimum-Risk Recalibration of Classifiers
- Authors: Zeyu Sun, Dogyoon Song and Alfred Hero
- Abstract summary: We introduce the concept of minimum-risk recalibration within the framework of mean-squared-error decomposition.
We show that transferring a calibrated classifier requires significantly fewer target samples compared to recalibrating from scratch.
- Score: 9.31067660373791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recalibrating probabilistic classifiers is vital for enhancing the
reliability and accuracy of predictive models. Despite the development of
numerous recalibration algorithms, there is still a lack of a comprehensive
theory that integrates calibration and sharpness (which is essential for
maintaining predictive power). In this paper, we introduce the concept of
minimum-risk recalibration within the framework of mean-squared-error (MSE)
decomposition, offering a principled approach for evaluating and recalibrating
probabilistic classifiers. Using this framework, we analyze the uniform-mass
binning (UMB) recalibration method and establish a finite-sample risk upper
bound of order $\tilde{O}(B/n + 1/B^2)$ where $B$ is the number of bins and $n$
is the sample size. By balancing calibration and sharpness, we further
determine that the optimal number of bins for UMB scales with $n^{1/3}$,
resulting in a risk bound of approximately $O(n^{-2/3})$. Additionally, we
tackle the challenge of label shift by proposing a two-stage approach that
adjusts the recalibration function using limited labeled data from the target
domain. Our results show that transferring a calibrated classifier requires
significantly fewer target samples compared to recalibrating from scratch. We
validate our theoretical findings through numerical simulations, which confirm
the tightness of the proposed bounds, the optimal number of bins, and the
effectiveness of label shift adaptation.
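A minimal sketch of the uniform-mass binning (UMB) recipe described above, assuming only NumPy. The number of bins is set on the order of $n^{1/3}$, as the risk analysis suggests, and the `adjust_for_label_shift` helper is an illustrative class-prior reweighting in the spirit of the two-stage approach, not necessarily the paper's exact procedure.

```python
import numpy as np

def fit_umb_recalibrator(scores, labels, n_bins=None):
    """Fit a uniform-mass-binning (UMB) recalibration map on held-out (score, label) pairs."""
    n = len(scores)
    if n_bins is None:
        # The analysis suggests choosing B on the order of n^{1/3}.
        n_bins = max(1, int(round(n ** (1.0 / 3.0))))
    # Equal-frequency (uniform-mass) bins: interior quantiles of the observed scores.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1))[1:-1]
    bin_ids = np.digitize(scores, edges, right=True)
    # Calibrated output per bin = empirical frequency of the positive label in that bin.
    bin_means = np.array([
        labels[bin_ids == b].mean() if np.any(bin_ids == b) else 0.5
        for b in range(n_bins)
    ])
    return lambda s: bin_means[np.digitize(s, edges, right=True)]

def adjust_for_label_shift(q, source_prior, target_prior):
    """Illustrative label-shift correction of a calibrated positive-class probability q,
    reweighting by class-prior ratios estimated from a small labeled target sample.
    This mimics the spirit of the two-stage adaptation, not necessarily its exact form."""
    w1 = target_prior / source_prior
    w0 = (1.0 - target_prior) / (1.0 - source_prior)
    return w1 * q / (w1 * q + w0 * (1.0 - q))

# Example: recalibrate deliberately miscalibrated scores on synthetic data.
rng = np.random.default_rng(0)
p_true = rng.uniform(size=5000)
labels = (rng.uniform(size=5000) < p_true).astype(float)
scores = p_true ** 2                      # miscalibrated confidence scores
recalibrate = fit_umb_recalibrator(scores, labels)
print(recalibrate(np.array([0.1, 0.5, 0.9])))
print(adjust_for_label_shift(recalibrate(np.array([0.5])), 0.5, 0.7))
```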
Related papers
- Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning [2.3072402651280517]
We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness.
With infinitely many labeled samples, we provide a necessary and sufficient condition for perfect linear approximation.
Motivated by the condition, we propose to approximate the redundant component by a low-rank factorization.
arXiv Detail & Related papers (2024-02-10T04:45:27Z)
- Classifier Calibration with ROC-Regularized Isotonic Regression [0.0]
We use isotonic regression to minimize the cross entropy on a calibration set via monotone transformations.
IR acts as an adaptive binning procedure that can achieve zero calibration error, but leaves open the question of its effect on predictive performance.
We show empirically that this general monotonicity criterion strikes an effective balance between reducing cross-entropy loss and avoiding overfitting of the calibration set.
arXiv Detail & Related papers (2023-11-21T08:45:09Z)
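For comparison with binning approaches, generic isotonic-regression recalibration of the kind described in the entry above can be prototyped with scikit-learn's IsotonicRegression; this is a sketch of plain isotonic calibration, not the ROC-regularized variant proposed in that paper.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
p_true = rng.uniform(size=4000)
labels = (rng.uniform(size=4000) < p_true).astype(float)
scores = p_true ** 2                      # miscalibrated confidence scores

# Fit a monotone map from raw scores to calibrated probabilities on a held-out split.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(scores[:2000], labels[:2000])
calibrated = iso.predict(scores[2000:])   # piecewise-constant, monotone recalibration
```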
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing [85.85160896547698]
Real-life applications of deep neural networks are hindered by their unstable predictions when faced with noisy inputs and adversarial attacks.
We show how to design an efficient classifier with a certified radius by relying on noise injection into the inputs.
Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
arXiv Detail & Related papers (2023-09-28T22:41:47Z)
- A Consistent and Differentiable Lp Canonical Calibration Error Estimator [21.67616079217758]
Deep neural networks are poorly calibrated and tend to output overconfident predictions.
We propose a low-bias, trainable calibration error estimator based on Dirichlet kernel density estimates.
Our method has a natural choice of kernel, and can be used to generate consistent estimates of other quantities.
arXiv Detail & Related papers (2022-10-13T15:11:11Z)
- MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration [29.780204566046503]
We propose a feature-aware binning framework called Multiple Boosting Calibration Trees (MBCT).
MBCT is non-monotonic and, owing to its learnable binning scheme and individual-level calibration, has the potential to improve order accuracy.
Results show that our method outperforms all competing models in terms of both calibration error and order accuracy.
arXiv Detail & Related papers (2022-02-09T08:59:16Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and the number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
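A minimal sketch of a binning-free, KS-style calibration error in the spirit of the entry above: it compares the cumulative predicted probability with the cumulative empirical label frequency after sorting examples by score. The function name and details are illustrative, not the paper's exact estimator.

```python
import numpy as np

def ks_calibration_error(probs, labels):
    """KS-style calibration error: the largest gap between the cumulative predicted
    probability and the cumulative empirical label frequency, with examples sorted
    by predicted probability (no binning required)."""
    order = np.argsort(probs)
    p, y = probs[order], labels[order]
    return np.max(np.abs(np.cumsum(p - y) / len(p)))

rng = np.random.default_rng(2)
p = rng.uniform(size=10000)
y = (rng.uniform(size=10000) < p).astype(float)
print(ks_calibration_error(p, y))                       # near zero: well calibrated
print(ks_calibration_error(np.clip(1.3 * p, 0, 1), y))  # larger: overconfident scores
```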
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
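For context, a standard result in this line of work (stated here as background, not taken from the entry above) is that the minimax sample complexity of finding an $\epsilon$-optimal policy in a $\gamma$-discounted MDP with a generative model scales as $\tilde{O}\big(|\mathcal{S}||\mathcal{A}| / ((1-\gamma)^3 \epsilon^2)\big)$; the entry reports that a plain model-based planning algorithm attains this minimax-optimal rate for any target accuracy level.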
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.