When Does Optimizing a Proper Loss Yield Calibration?
- URL: http://arxiv.org/abs/2305.18764v2
- Date: Fri, 8 Dec 2023 05:05:06 GMT
- Title: When Does Optimizing a Proper Loss Yield Calibration?
- Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
- Abstract summary: We show that any predictor satisfying a local optimality condition also satisfies smooth calibration.
We also show that the connection between local optimality and calibration error goes both ways.
- Score: 12.684962113589515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimizing proper loss functions is popularly believed to yield predictors
with good calibration properties; the intuition being that for such losses, the
global optimum is to predict the ground-truth probabilities, which is indeed
calibrated. However, typical machine learning models are trained to
approximately minimize loss over restricted families of predictors that are
unlikely to contain the ground truth. Under what circumstances does optimizing
proper loss over a restricted family yield calibrated models? What precise
calibration guarantees does it give? In this work, we provide a rigorous answer
to these questions. We replace global optimality with a local optimality
condition stipulating that the (proper) loss of the predictor cannot be reduced
much by post-processing its predictions with a certain family of Lipschitz
functions. We show that any predictor with this local optimality satisfies
smooth calibration as defined in Kakade and Foster (2008) and Błasiok et al.
(2023). Local optimality is plausibly satisfied by well-trained DNNs, which
suggests an explanation for why they are calibrated from proper loss
minimization alone. Finally, we show that the connection between local
optimality and calibration error goes both ways: nearly calibrated predictors
are also nearly locally optimal.
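To make these quantities concrete, here is a minimal NumPy sketch (an illustration under stated assumptions, not the paper's construction or constants). It lower-bounds the smooth calibration error, the supremum of E[w(f(x))(y - f(x))] over 1-Lipschitz witnesses w: [0,1] -> [-1,1], by searching a small family of tent-shaped witnesses, and then measures how much the best witness reduces the squared loss when used as a Lipschitz post-processing of the predictions.

```python
import numpy as np

def tent_witness(preds, center, width):
    """1-Lipschitz witness max(0, width - |v - center|); values lie in [0, width], a subset of [0, 1]."""
    return np.maximum(0.0, width - np.abs(preds - center))

def smooth_calibration_lower_bound(preds, labels, n_anchors=21):
    """Search signed tent witnesses for the one maximizing E[w(f(x)) * (y - f(x))].
    The supremum over all 1-Lipschitz, [-1, 1]-bounded witnesses is the smooth
    calibration error, so the value returned here is a lower bound on it."""
    residual = labels - preds
    best_val, best_w = 0.0, np.zeros_like(preds)
    for center in np.linspace(0.0, 1.0, n_anchors):
        for width in (0.1, 0.25, 0.5, 1.0):
            for sign in (1.0, -1.0):
                w = sign * tent_witness(preds, center, width)
                val = float(np.mean(w * residual))
                if val > best_val:
                    best_val, best_w = val, w
    return best_val, best_w

def post_processing_gain(preds, labels, w):
    """Squared-loss reduction from the update f -> clip(f + eta * w(f), 0, 1),
    with eta chosen to maximize the pre-clipping gain 2*eta*E[w*(y-f)] - eta^2*E[w^2]."""
    residual = labels - preds
    eta = np.mean(w * residual) / (np.mean(w ** 2) + 1e-12)
    updated = np.clip(preds + eta * w, 0.0, 1.0)
    return float(np.mean(residual ** 2) - np.mean((labels - updated) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    preds = rng.uniform(size=20_000)                                # reported probabilities
    labels = (rng.uniform(size=20_000) < preds ** 2).astype(float)  # true rate is preds**2, so f is miscalibrated
    smce_lb, w = smooth_calibration_lower_bound(preds, labels)
    print("smooth calibration lower bound:", smce_lb)
    print("squared-loss drop from Lipschitz post-processing:", post_processing_gain(preds, labels, w))
```

On this deliberately miscalibrated example both printed quantities should be clearly positive, while for a calibrated predictor both should stay near zero, mirroring the two-way connection stated in the abstract.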
Related papers
- Calibrating Deep Neural Network using Euclidean Distance [5.675312975435121]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples.
High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability.
This research introduces a novel loss function, the Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z)
- Orthogonal Causal Calibration [55.28164682911196]
We prove generic upper bounds on the calibration error of any causal parameter estimate $\theta$ with respect to any loss $\ell$.
We use our bound to analyze the convergence of two sample splitting algorithms for causal calibration.
arXiv Detail & Related papers (2024-06-04T03:35:25Z)
- Optimizing Calibration by Gaining Aware of Prediction Correctness [30.619608580138802]
Cross-Entropy (CE) loss is widely used for calibrator training; it pushes the model to increase its confidence on the ground-truth class.
We propose a new post-hoc calibration objective derived from the aim of calibration.
arXiv Detail & Related papers (2024-04-19T17:25:43Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and to enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Better Uncertainty Calibration via Proper Scores for Classification and Beyond [15.981380319863527]
We introduce the framework of proper calibration errors, which relates every calibration error to a proper score.
This relationship can be used to reliably quantify the model calibration improvement.
arXiv Detail & Related papers (2022-03-15T12:46:08Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Online Calibrated and Conformal Prediction Improves Bayesian Optimization [10.470326550507117]
This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization.
Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions.
We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data.
arXiv Detail & Related papers (2021-12-08T23:26:23Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained metric, the local calibration error (LCE), that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
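For the covariate-shift entry above, here is a minimal sketch of one way importance sampling can enter recalibration; it is an assumption-laden illustration, not necessarily the cited paper's estimator. Labeled source-domain validation data are reweighted by a density ratio w(x) = p_target(x) / p_source(x) while fitting a temperature for a binary classifier; the density-ratio values in the demo are hypothetical stand-ins, since in practice the ratio itself must be estimated (e.g., with a domain classifier).

```python
import numpy as np

def weighted_nll(temperature, logits, labels, weights):
    """Importance-weighted negative log-likelihood of temperature-scaled logits (binary case)."""
    probs = 1.0 / (1.0 + np.exp(-logits / temperature))
    probs = np.clip(probs, 1e-7, 1.0 - 1e-7)
    nll = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
    return np.sum(weights * nll) / np.sum(weights)

def fit_temperature(logits, labels, weights, grid=np.linspace(0.25, 5.0, 200)):
    """Pick the temperature minimizing the weighted NLL via a simple grid search."""
    losses = [weighted_nll(t, logits, labels, weights) for t in grid]
    return float(grid[int(np.argmin(losses))])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(scale=3.0, size=5_000)                                      # overconfident source-domain logits
    labels = (rng.uniform(size=5_000) < 1.0 / (1.0 + np.exp(-logits / 2.0))).astype(float)
    weights = np.exp(-0.1 * logits)                                                 # hypothetical density ratios p_target/p_source
    print("fitted temperature:", fit_temperature(logits, labels, weights))
```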
This list is automatically generated from the titles and abstracts of the papers in this site.