Consistency Conditions for Differentiable Surrogate Losses
- URL: http://arxiv.org/abs/2505.13760v1
- Date: Mon, 19 May 2025 22:17:36 GMT
- Title: Consistency Conditions for Differentiable Surrogate Losses
- Authors: Drona Khurana, Anish Thilagar, Dhamma Kimpara, Rafael Frongillo
- Abstract summary: We study when indirect elicitation (IE) remains equivalent to calibration for non-polyhedral surrogates. We first prove that under mild conditions, IE and calibration are equivalent for one-dimensional losses in this class. We apply these results to a range of problems to demonstrate the power of IE and strong IE for designing and analyzing consistent differentiable surrogates.
- Score: 0.9374652839580183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The statistical consistency of surrogate losses for discrete prediction tasks is often checked via the condition of calibration. However, directly verifying calibration can be arduous. Recent work shows that for polyhedral surrogates, a less arduous condition, indirect elicitation (IE), is still equivalent to calibration. We give the first results of this type for non-polyhedral surrogates, specifically the class of convex differentiable losses. We first prove that under mild conditions, IE and calibration are equivalent for one-dimensional losses in this class. We construct a counter-example that shows that this equivalence fails in higher dimensions. This motivates the introduction of strong IE, a strengthened form of IE that is equally easy to verify. We establish that strong IE implies calibration for differentiable surrogates and is both necessary and sufficient for strongly convex, differentiable surrogates. Finally, we apply these results to a range of problems to demonstrate the power of IE and strong IE for designing and analyzing consistent differentiable surrogates.
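For readers comparing the two conditions named in the abstract, the following is a sketch of the standard definitions from the property-elicitation literature (notation ours; the paper's formal statements may differ). A discrete loss ℓ is defined on reports r ∈ R, the surrogate L on u ∈ R^d, and the two are connected by a link ψ : R^d → R:

```latex
% Optimal report sets for a label distribution p \in \Delta_{\mathcal{Y}}:
%   \Gamma_L(p) = \arg\min_u \mathbb{E}_{Y \sim p} L(u, Y),
%   \gamma(p)   = \arg\min_r \mathbb{E}_{Y \sim p} \ell(r, Y).

% Calibration: surrogate reports linking to a suboptimal discrete report
% have risk bounded away from the optimum.
\forall p \in \Delta_{\mathcal{Y}}:\quad
  \inf_{u \,:\, \psi(u) \notin \gamma(p)} \mathbb{E}_{Y \sim p} L(u, Y)
  \;>\; \inf_{u} \mathbb{E}_{Y \sim p} L(u, Y)

% Indirect elicitation (IE): exact surrogate minimizers link to optimal reports.
\forall p \in \Delta_{\mathcal{Y}}:\quad
  \psi\bigl(\Gamma_L(p)\bigr) \subseteq \gamma(p)
```

IE constrains only exact minimizers, which is what makes it easier to verify than calibration's uniform gap condition.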
Related papers
- All Models Are Miscalibrated, But Some Less So: Comparing Calibration with Conditional Mean Operators [12.103487148356747]
We propose a kernel calibration error (CKCE) based on the Hilbert-Schmidt norm of the difference between conditional mean operators. Our experiments show that CKCE provides a more consistent ranking of models by their calibration error and is more robust against distribution shift.
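As a rough schematic of the construction described above (our notation, not necessarily the paper's): calibration asks that the conditional law of the label given the model's prediction agree with the prediction itself, and the proposed measure compares RKHS embeddings of the two sides via conditional mean operators:

```latex
% Calibration of a probabilistic model f: law(Y \mid f(X)) = f(X) a.s.
% Schematically, with C_{Y|f(X)} the conditional mean operator of the true
% conditional law and C^{cal} the operator a calibrated model would induce:
\mathrm{CKCE}(f) \;=\; \bigl\| C_{Y \mid f(X)} - C^{\mathrm{cal}} \bigr\|_{\mathrm{HS}}
```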
arXiv Detail & Related papers (2025-02-17T05:52:09Z) - Orthogonal Causal Calibration [55.28164682911196]
We develop general algorithms for reducing the task of causal calibration to that of calibrating a standard (non-causal) predictive model. Our results are exceedingly general, showing that essentially any existing calibration algorithm can be used in causal settings.
arXiv Detail & Related papers (2024-06-04T03:35:25Z) - Optimizing Calibration by Gaining Aware of Prediction Correctness [30.619608580138802]
Cross-Entropy (CE) loss is widely used for calibrator training; it pushes the model to increase confidence on the ground-truth class. We propose a new post-hoc calibration objective derived from the aim of calibration.
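To make the first point concrete: a calibrator trained with CE minimizes the negative log-likelihood of the ground-truth class, so its gradient raises confidence on that class whether or not the underlying prediction is correct. A minimal statement of the standard objective (not the paper's proposed one):

```latex
\mathcal{L}_{\mathrm{CE}}(\theta)
  \;=\; -\frac{1}{n} \sum_{i=1}^{n} \log p_{\theta}(y_i \mid x_i)
```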
arXiv Detail & Related papers (2024-04-19T17:25:43Z) - Fairness under Covariate Shift: Improving Fairness-Accuracy tradeoff
with few Unlabeled Test Samples [21.144077993862652]
We operate in the unsupervised regime where only a small set of unlabeled test samples along with a labeled training set is available.
We experimentally verify that optimizing with our loss formulation significantly outperforms a number of state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-11T14:39:51Z) - Monotonicity and Double Descent in Uncertainty Estimation with Gaussian
Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
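For reference, the marginal likelihood in question is the standard GP evidence, maximized over kernel hyperparameters θ and noise level σ² (standard formula; notation ours):

```latex
\log p(\mathbf{y} \mid X, \theta)
  = -\tfrac{1}{2}\, \mathbf{y}^{\top} (K_{\theta} + \sigma^{2} I)^{-1} \mathbf{y}
    \;-\; \tfrac{1}{2}\, \log\det(K_{\theta} + \sigma^{2} I)
    \;-\; \tfrac{n}{2}\, \log 2\pi
```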
arXiv Detail & Related papers (2022-10-14T08:09:33Z) - An Embedding Framework for the Design and Analysis of Consistent
Polyhedral Surrogates [17.596501992526477]
We study the design of convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction.
An embedding gives rise to a consistent surrogate loss as well as a consistent link function.
Our results are constructive, as we illustrate with several examples.
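The embedding notion can be sketched as follows (our paraphrase; see the paper for the precise definition): a polyhedral surrogate L embeds a discrete loss ℓ when each discrete report r has a representative point φ(r) at which the losses agree and optimality is preserved:

```latex
% L embeds \ell via an injection \varphi : \mathcal{R} \to \mathbb{R}^d if
\text{(i)}\;\; L(\varphi(r), y) = \ell(r, y)
  \quad \forall\, r \in \mathcal{R},\ y \in \mathcal{Y};
\qquad
\text{(ii)}\;\; r \text{ minimizes } \mathbb{E}_{p}\,\ell
  \iff \varphi(r) \text{ minimizes } \mathbb{E}_{p}\,L,
  \quad \forall\, p \in \Delta_{\mathcal{Y}}
```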
arXiv Detail & Related papers (2022-06-29T15:16:51Z) - Calibration and Consistency of Adversarial Surrogate Losses [46.04004505351902]
Adversarial robustness is an increasingly critical property of classifiers in applications.
But which surrogate losses should be used and when do they benefit from theoretical guarantees?
We present an extensive study of this question, including a detailed analysis of the H-calibration and H-consistency of adversarial surrogate losses.
arXiv Detail & Related papers (2021-04-19T21:58:52Z) - Transferable Calibration with Lower Bias and Variance in Domain
Adaptation [139.4332115349543]
Domain Adaptation (DA) enables transferring a learning machine from a labeled source domain to an unlabeled target one.
How to estimate the predictive uncertainty of DA models is vital for decision-making in safety-critical scenarios.
TransCal can be easily applied to recalibrate existing DA methods.
arXiv Detail & Related papers (2020-07-16T11:09:36Z) - Calibrated Surrogate Losses for Adversarially Robust Classification [92.37268323142307]
We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that if the underlying distribution satisfies Massart's noise condition, convex losses can be calibrated in the adversarial setting.
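Massart's noise condition, stated for binary labels with regression function η(x) = Pr(Y = 1 | X = x), requires the posterior to stay boundedly away from 1/2 (standard form; notation ours):

```latex
\exists\, \beta \in (0, \tfrac{1}{2}] :\quad
  \Pr\!\bigl( \lvert \eta(X) - \tfrac{1}{2} \rvert \,\geq\, \beta \bigr) \;=\; 1
```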
arXiv Detail & Related papers (2020-05-28T02:40:42Z) - Calibration of Pre-trained Transformers [55.57083429195445]
We focus on BERT and RoBERTa in this work, and analyze their calibration across three tasks: natural language inference, paraphrase detection, and commonsense reasoning.
We show that: (1) when used out-of-the-box, pre-trained models are calibrated in-domain, and compared to baselines, their calibration error out-of-domain can be as much as 3.5x lower; (2) temperature scaling is effective at further reducing calibration error in-domain, and using label smoothing to deliberately increase empirical uncertainty helps calibrate posteriors out-of-domain.
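Temperature scaling, as used above, divides the logits by a single scalar T fit on held-out data by minimizing NLL. A minimal NumPy sketch (illustrative; not the paper's code, and the function names are ours):

```python
import numpy as np

def nll(logits, labels, T):
    """Average negative log-likelihood under temperature-scaled logits."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Grid-search the scalar temperature that minimizes held-out NLL."""
    return min(grid, key=lambda T: nll(logits, labels, T))

# Usage: T = fit_temperature(val_logits, val_labels); test-time probabilities
# are then softmax(test_logits / T).
```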
arXiv Detail & Related papers (2020-03-17T18:58:44Z)