Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics
- URL: http://arxiv.org/abs/2210.07152v1
- Date: Thu, 13 Oct 2022 16:34:55 GMT
- Title: Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics
- Authors: Dean P. Foster and Sergiu Hart
- Abstract summary: We propose to smooth out the calibration score, which measures how good a forecaster is, by combining nearby forecasts.
We show that smooth calibration can be guaranteed by deterministic procedures.
- Score: 8.858351266850544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose to smooth out the calibration score, which measures how good a
forecaster is, by combining nearby forecasts. While regular calibration can be
guaranteed only by randomized forecasting procedures, we show that smooth
calibration can be guaranteed by deterministic procedures. As a consequence, it
does not matter if the forecasts are leaked, i.e., made known in advance:
smooth calibration can nevertheless be guaranteed (while regular calibration
cannot). Moreover, our procedure has finite recall, is stationary, and all
forecasts lie on a finite grid. To construct the procedure, we deal also with
the related setups of online linear regression and weak calibration. Finally,
we show that smooth calibration yields uncoupled finite-memory dynamics in
n-person games, called "smooth calibrated learning", in which the players play
approximate Nash equilibria in almost all periods (by contrast, calibrated
learning, which uses regular calibration, yields only that the time-averages of
play are approximate correlated equilibria).
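The idea of smoothing the calibration score by combining nearby forecasts can be illustrated with a small sketch. It is not taken from the paper: the function names, the tent-shaped weight, the `width` parameter, and the use of a plain average of absolute gaps are all assumptions made for illustration, and the paper's exact definition may differ in its details.

```python
import numpy as np


def binned_calibration_score(forecasts, outcomes):
    """Regular calibration: group periods by exact forecast value and
    compare the average outcome in each group with that forecast."""
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    total = 0.0
    for c in np.unique(forecasts):
        mask = forecasts == c
        # Weight each group by the number of periods it covers.
        total += mask.sum() * abs(outcomes[mask].mean() - c)
    return total / len(forecasts)


def smooth_calibration_score(forecasts, outcomes, width=0.1):
    """Smoothed variant (illustrative assumption): each period borrows
    from periods with nearby forecasts via a tent-shaped weight
    w[s, r] = max(0, 1 - |c_r - c_s| / width)."""
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    dist = np.abs(forecasts[:, None] - forecasts[None, :])
    weights = np.clip(1.0 - dist / width, 0.0, None)
    norm = weights.sum(axis=1)  # at least 1, because w[s, s] = 1
    smoothed_outcomes = weights @ outcomes / norm
    smoothed_forecasts = weights @ forecasts / norm
    return float(np.mean(np.abs(smoothed_outcomes - smoothed_forecasts)))


# Forecasts alternate between 0.49 and 0.51, and the outcome is always
# on the "wrong" side of the forecast.
forecasts = np.array([0.49, 0.51] * 50)
outcomes = np.array([1.0, 0.0] * 50)
print(binned_calibration_score(forecasts, outcomes))
print(smooth_calibration_score(forecasts, outcomes))
```

In this toy sequence the binned score is large because each exact forecast value is badly off on its own, while the smoothed score is much smaller because the two neighbouring forecasts are pooled and their outcomes average out to about one half.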
Related papers
- Truthfulness of Calibration Measures [18.21682539787221]
A calibration measure is said to be truthful if the forecaster minimizes expected penalty by predicting the conditional expectation of the next outcome.
This makes it an essential desideratum for calibration measures, alongside typical requirements, such as soundness and completeness.
We introduce a new calibration measure termed the Subsampled Smooth Calibration Error (SSCE) under which truthful prediction is optimal up to a constant multiplicative factor.
arXiv Detail & Related papers (2024-07-19T02:07:55Z)
- Towards Certification of Uncertainty Calibration under Adversarial Attacks [96.48317453951418]
We show that attacks can significantly harm calibration, and thus propose certified calibration as worst-case bounds on calibration under adversarial perturbations.
We propose novel calibration attacks and demonstrate how they can improve model calibration through adversarial calibration training.
arXiv Detail & Related papers (2024-05-22T18:52:09Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Causal isotonic calibration for heterogeneous treatment effects [0.5249805590164901]
We propose causal isotonic calibration, a novel nonparametric method for calibrating predictors of heterogeneous treatment effects.
We also introduce cross-calibration, a data-efficient variant of calibration that eliminates the need for hold-out calibration sets.
arXiv Detail & Related papers (2023-02-27T18:07:49Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- Forecast Hedging and Calibration [8.858351266850544]
We develop the concept of forecast hedging, which consists of choosing the forecasts so as to guarantee the expected track record can only improve.
This yields all the calibration results by the same simple argument while differentiating between them by the forecast-hedging tools used.
Additional contributions are an improved definition of continuous calibration, ensuing game dynamics that yield Nash equilibria in the long run, and a new forecasting procedure for binary events that is simpler than all known such procedures.
arXiv Detail & Related papers (2022-10-13T16:48:25Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the LCE better than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.