An Embedding Framework for the Design and Analysis of Consistent
Polyhedral Surrogates
- URL: http://arxiv.org/abs/2206.14707v1
- Date: Wed, 29 Jun 2022 15:16:51 GMT
- Title: An Embedding Framework for the Design and Analysis of Consistent
Polyhedral Surrogates
- Authors: Jessie Finocchiaro, Rafael M. Frongillo, Bo Waggoner
- Abstract summary: We study the design of convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction.
An embedding gives rise to a consistent link function as well as linear surrogate regret bounds.
Our results are constructive, as we illustrate with several examples.
- Score: 17.596501992526477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formalize and study the natural approach of designing convex surrogate
loss functions via embeddings, for problems such as classification, ranking, or
structured prediction. In this approach, one embeds each of the finitely many
predictions (e.g. rankings) as a point in $\mathbb{R}^d$, assigns the original loss
values to these points, and "convexifies" the loss in some way to obtain a
surrogate. We establish a strong connection between this approach and
polyhedral (piecewise-linear convex) surrogate losses: every discrete loss is
embedded by some polyhedral loss, and every polyhedral loss embeds some
discrete loss. Moreover, an embedding gives rise to a consistent link function
as well as linear surrogate regret bounds. Our results are constructive, as we
illustrate with several examples. In particular, our framework gives succinct
proofs of consistency or inconsistency for various polyhedral surrogates in the
literature, and for inconsistent surrogates, it further reveals the discrete
losses for which these surrogates are consistent. We go on to show additional
structure of embeddings, such as the equivalence of embedding and matching
Bayes risks, and the equivalence of various notions of non-redundancy. Using
these results, we establish that indirect elicitation, a necessary condition
for consistency, is also sufficient when working with polyhedral surrogates.
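As a minimal numerical illustration of the embedding pipeline (a sketch of the standard binary example, not code from the paper): the two predictions are embedded at the points -1 and +1 in $\mathbb{R}$, the hinge loss plays the role of the polyhedral surrogate, and the link is a sign threshold. The check below verifies, over a grid of label distributions, that minimizing the expected surrogate and applying the link always returns a Bayes-optimal discrete prediction.

```python
# Minimal sketch: hinge loss as a polyhedral surrogate embedding the binary 0-1 loss.
# Predictions are embedded at u = -1, +1; the link maps a surrogate report back to
# a discrete prediction. We numerically check consistency over label distributions.
import numpy as np

labels = np.array([-1, +1])
preds = np.array([-1, +1])          # discrete predictions, embedded at u = -1, +1

def zero_one(pred, y):
    return float(pred != y)

def hinge(u, y):
    return max(0.0, 1.0 - y * u)    # polyhedral (piecewise-linear convex) in u

def link(u):
    return 1 if u >= 0 else -1      # candidate consistent link for this embedding

u_grid = np.linspace(-2, 2, 401)    # crude search over surrogate reports

for p in np.linspace(0.0, 1.0, 11):                 # p = P(y = +1)
    probs = np.array([1 - p, p])
    # minimize the expected surrogate over u, then link back to a discrete prediction
    exp_hinge = [sum(q * hinge(u, y) for q, y in zip(probs, labels)) for u in u_grid]
    u_star = u_grid[int(np.argmin(exp_hinge))]
    linked = link(u_star)
    # Bayes-optimal discrete prediction(s) under the original 0-1 loss
    exp_01 = [sum(q * zero_one(r, y) for q, y in zip(probs, labels)) for r in preds]
    bayes = preds[np.isclose(exp_01, min(exp_01))]
    assert linked in bayes, (p, u_star, linked, bayes)

print("link(argmin expected hinge) is Bayes-optimal for every p on the grid")
```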
Related papers
- The Implicit Bias of Gradient Descent on Separable Multiclass Data [38.05903703331163]
We employ the framework of Permutation Equivariant and Relative Margin-based (PERM) losses to introduce a multiclass extension of the exponential tail property.
Our proof techniques closely mirror those of the binary case, thus illustrating the power of the PERM framework for bridging the binary-multiclass gap.
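As a small numerical aside (illustrating the binary notion that this work extends, not its multiclass definition): a margin loss has an exponential tail when it behaves like $e^{-z}$ for large margins $z$, a property the logistic loss satisfies.

```python
# A small check of the binary exponential tail property: the logistic loss
# log(1 + e^{-z}) behaves like e^{-z} for large margins z (ratio -> 1), while
# the squared hinge, shown for contrast, has a tail that is exactly zero.
import numpy as np

z = np.array([2.0, 5.0, 10.0, 20.0])
logistic = np.log1p(np.exp(-z))            # log(1 + e^{-z})
sq_hinge = np.maximum(0.0, 1.0 - z) ** 2   # zero for z >= 1
exp_tail = np.exp(-z)

for zi, lo, sq, ex in zip(z, logistic, sq_hinge, exp_tail):
    print(f"z={zi:5.1f}  logistic/e^-z={lo/ex:7.4f}  sq_hinge/e^-z={sq/ex:7.4f}")
```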
arXiv Detail & Related papers (2024-11-02T19:39:21Z)
- Inference with non-differentiable surrogate loss in a general high-dimensional classification framework [4.792322531593389]
We propose a kernel-smoothed decorrelated score to construct hypothesis tests and interval estimates.
Specifically, we adopt kernel approximations to smooth the discontinuous gradient near discontinuity points.
We establish the limiting distribution of the kernel-smoothed decorrelated score and its cross-fitted version in a high-dimensional setup.
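A rough sketch of the smoothing idea (an illustration under simple assumptions, not the paper's estimator): the subgradient of a non-differentiable surrogate such as the hinge contains an indicator at the kink, which can be replaced by a Gaussian-CDF approximation with bandwidth h; as h shrinks, the smoothed gradient approaches the original subgradient.

```python
# Kernel smoothing of a non-differentiable surrogate's gradient (illustrative).
import numpy as np
from scipy.stats import norm

def hinge_subgradient(w, X, y):
    # Subgradient of the average hinge loss; contains the indicator 1{margin < 1}.
    margin = y * (X @ w)
    return -(X * (y * (margin < 1.0))[:, None]).mean(axis=0)

def smoothed_gradient(w, X, y, h=0.1):
    # Kernel-smoothed version: the indicator is replaced by a Gaussian CDF of bandwidth h.
    margin = y * (X @ w)
    weight = norm.cdf((1.0 - margin) / h)
    return -(X * (y * weight)[:, None]).mean(axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))
w = rng.normal(size=5)

for h in (1.0, 0.1, 0.01):
    diff = np.linalg.norm(smoothed_gradient(w, X, y, h) - hinge_subgradient(w, X, y))
    print(f"h={h:5.2f}  ||smoothed - subgradient|| = {diff:.4f}")
```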
arXiv Detail & Related papers (2024-05-20T01:50:35Z)
- Expressive Losses for Verified Robustness via Convex Combinations [67.54357965665676]
We study the relationship between the over-approximation coefficient and performance profiles across different expressive losses.
We show that, while expressivity is essential, better approximations of the worst-case loss are not necessarily linked to superior robustness-accuracy trade-offs.
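A toy sketch of the convex-combination construction (the tiny fixed ReLU network, the random-sampling "attack", and the function names below are illustrative assumptions, not the paper's code): the training loss interpolates between an attack-based lower bound and an IBP-style over-approximation of the worst-case loss via a single coefficient alpha.

```python
# Expressive loss as a convex combination of an attack lower bound and an
# IBP-style verified upper bound on the worst-case loss, for a toy ReLU net.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
w2, b2 = rng.normal(size=8), 0.1

def margin(x, y):                        # y in {-1, +1}
    return y * (w2 @ np.maximum(W1 @ x + b1, 0.0) + b2)

def softplus(z):                         # numerically stable log(1 + e^z)
    return np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0)

def attack_loss(x, y, eps, n_samples=256):
    # Lower bound on the worst-case loss: best of random perturbations in the eps-ball
    # (a stand-in for a PGD attack).
    deltas = rng.uniform(-eps, eps, size=(n_samples, x.size))
    return max(softplus(-margin(x + d, y)) for d in deltas)

def verified_loss(x, y, eps):
    # IBP over-approximation: propagate the input interval through the network.
    c, r = W1 @ x + b1, np.abs(W1) @ (eps * np.ones_like(x))
    lo, hi = np.maximum(c - r, 0.0), np.maximum(c + r, 0.0)    # post-ReLU bounds
    v = y * w2
    worst_margin = y * b2 + np.minimum(v * lo, v * hi).sum()   # lower bound on the margin
    return softplus(-worst_margin)

def expressive_loss(x, y, eps, alpha):
    # alpha = 0 gives the attack loss, alpha = 1 the verified upper bound.
    return (1 - alpha) * attack_loss(x, y, eps) + alpha * verified_loss(x, y, eps)

x, y, eps = rng.normal(size=4), 1.0, 0.1
for alpha in (0.0, 0.5, 1.0):
    print(f"alpha={alpha:.1f}  expressive loss = {expressive_loss(x, y, eps, alpha):.4f}")
```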
arXiv Detail & Related papers (2023-05-23T12:20:29Z)
- General Loss Functions Lead to (Approximate) Interpolation in High Dimensions [6.738946307589741]
We provide a unified framework to approximately characterize the implicit bias of gradient descent in closed form.
Specifically, we show that the implicit bias is approximated by (but not exactly equal to) the minimum-norm interpolation in high dimensions.
Our framework also recovers existing exact equivalence results for exponentially-tailed losses across binary and multiclass settings.
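A quick numerical sketch (an illustration on assumed Gaussian data, not the paper's experiments): in an overparameterized problem with n much smaller than d, the direction found by gradient descent on the logistic loss is close to, though not exactly, the direction of the minimum-norm interpolator of the labels.

```python
# Compare the direction of gradient descent on the logistic loss with the
# minimum-norm interpolator in an overparameterized (n << d) problem.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 2000
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.choice([-1.0, 1.0], size=n)

# Minimum-norm interpolator of the labels: w = X^T (X X^T)^{-1} y
w_mn = X.T @ np.linalg.solve(X @ X.T, y)

# Gradient descent on the (unregularized) logistic loss
w = np.zeros(d)
lr = 1.0
for _ in range(10000):
    p = 0.5 * (1.0 - np.tanh(0.5 * y * (X @ w)))   # sigmoid(-y * margin), stable
    w -= lr * (-(X.T @ (y * p)) / n)

cos = (w @ w_mn) / (np.linalg.norm(w) * np.linalg.norm(w_mn))
print(f"cosine(GD direction, min-norm interpolator) = {cos:.4f}")
```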
arXiv Detail & Related papers (2023-03-13T21:23:12Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- Bias-Variance Decompositions for Margin Losses [3.574241508860471]
We show that for all strictly convex margin losses, the expected risk decomposes into the risk of a "central" model and a term quantifying variation in the functional margin.
These decompositions provide a diagnostic tool for practitioners to understand model overfitting/underfitting.
arXiv Detail & Related papers (2022-04-26T08:45:43Z)
- Calibration and Consistency of Adversarial Surrogate Losses [46.04004505351902]
Adversarial robustness is an increasingly critical property of classifiers in applications.
But which surrogate losses should be used and when do they benefit from theoretical guarantees?
We present an extensive study of this question, including a detailed analysis of the H-calibration and H-consistency of adversarial surrogate losses.
arXiv Detail & Related papers (2021-04-19T21:58:52Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- Rethinking preventing class-collapsing in metric learning with margin-based losses [81.22825616879936]
Metric learning seeks embeddings where visually similar instances are close and dissimilar instances are apart.
However, margin-based losses tend to project all samples of a class onto a single point in the embedding space.
We propose a simple modification to the embedding losses such that each sample selects its nearest same-class counterpart in a batch.
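A minimal numpy sketch of that batch-level modification (an illustration with a hypothetical `nearest_pair_margin_loss`, not the authors' implementation): each sample is paired with its nearest same-class embedding in the batch, and contrasted against its nearest other-class embedding, rather than being pulled toward every same-class sample.

```python
# Each anchor uses only its nearest same-class positive in the batch, which
# discourages collapsing an entire class onto a single embedding point.
import numpy as np

def nearest_pair_margin_loss(emb, labels, margin=0.2):
    d2 = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                              # ignore self-pairs
    same = labels[:, None] == labels[None, :]
    pos = np.where(same, d2, np.inf).min(axis=1)              # nearest same-class distance
    neg = np.where(~same, d2, np.inf).min(axis=1)             # nearest other-class distance
    valid = np.isfinite(pos) & np.isfinite(neg)               # anchors with both a positive and a negative
    return np.maximum(pos[valid] - neg[valid] + margin, 0.0).mean()

rng = np.random.default_rng(0)
emb = rng.normal(size=(32, 16))          # stand-in for a batch of network embeddings
labels = rng.integers(0, 4, size=32)
print(f"batch loss: {nearest_pair_margin_loss(emb, labels):.4f}")
```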
arXiv Detail & Related papers (2020-06-09T09:59:25Z)
- Calibrated Surrogate Losses for Adversarially Robust Classification [92.37268323142307]
We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that if the underlying distribution satisfies Massart's noise condition, convex losses can be calibrated in the adversarial setting.
arXiv Detail & Related papers (2020-05-28T02:40:42Z)