Calibrated Surrogate Losses for Adversarially Robust Classification
- URL: http://arxiv.org/abs/2005.13748v2
- Date: Thu, 13 May 2021 09:56:30 GMT
- Title: Calibrated Surrogate Losses for Adversarially Robust Classification
- Authors: Han Bao, Clayton Scott, Masashi Sugiyama
- Abstract summary: We show that no convex surrogate loss is calibrated with respect to the adversarial 0-1 loss when restricted to linear models.
We also show that if the underlying distribution satisfies the Massart's noise condition, convex losses can also be calibrated in the adversarial setting.
- Score: 92.37268323142307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarially robust classification seeks a classifier that is insensitive to
adversarial perturbations of test patterns. This problem is often formulated
via a minimax objective, where the target loss is the worst-case value of the
0-1 loss subject to a bound on the size of perturbation. Recent work has
proposed convex surrogates for the adversarial 0-1 loss, in an effort to make
optimization more tractable. A primary question is that of consistency, that
is, whether minimization of the surrogate risk implies minimization of the
adversarial 0-1 risk. In this work, we analyze this question through the lens
of calibration, which is a pointwise notion of consistency. We show that no
convex surrogate loss is calibrated with respect to the adversarial 0-1 loss
when restricted to the class of linear models. We further introduce a class of
nonconvex losses and offer necessary and sufficient conditions for losses in
this class to be calibrated. We also show that if the underlying distribution
satisfies Massart's noise condition, convex losses can also be calibrated in
the adversarial setting.
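For intuition, the adversarial 0-1 loss has a closed form for linear models: under a norm-bounded perturbation of size epsilon, the worst-case attack flips any example whose margin is at most epsilon times the norm of the weight vector. The sketch below is illustrative only (it assumes an l2 perturbation ball and a homogeneous linear model, which are choices made here for concreteness, not details taken from the paper):

```python
import numpy as np

def adversarial_01_loss(w, x, y, eps):
    """Worst-case 0-1 loss of the linear classifier sign(<w, x>) over
    l2 perturbations ||delta||_2 <= eps of the input x.

    For a linear model the worst-case perturbation reduces the margin
    y * <w, x> by exactly eps * ||w||_2, so the adversarial 0-1 loss
    is 1 precisely when y * <w, x> <= eps * ||w||_2.
    """
    margin = y * np.dot(w, x)
    return float(margin <= eps * np.linalg.norm(w))

# A point correctly classified with a small margin is still counted
# as an error once perturbations are allowed.
w = np.array([1.0, 0.0])
x = np.array([0.3, 2.0])
y = 1
print(adversarial_01_loss(w, x, y, eps=0.0))  # 0.0: clean 0-1 loss
print(adversarial_01_loss(w, x, y, eps=0.5))  # 1.0: margin 0.3 <= 0.5
```

The jump from 0 to 1 as eps grows is exactly what makes this target loss hard to optimize directly, motivating the surrogate losses analyzed in the paper.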
Related papers
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space, which merely contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection [80.04000067312428]
We propose a Self-adapTive Ambiguity Reduction (STAR) loss by exploiting the properties of semantic ambiguity.
We find that semantic ambiguity results in the anisotropic predicted distribution, which inspires us to use predicted distribution to represent semantic ambiguity.
We also propose two kinds of eigenvalue restriction methods that could avoid both distribution's abnormal change and the model's premature convergence.
arXiv Detail & Related papers (2023-06-05T10:33:25Z)
- The Adversarial Consistency of Surrogate Risks for Binary Classification [20.03511985572199]
Adversarial training seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball.
We give a simple and complete characterization of the set of surrogate loss functions that are consistent.
Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting.
arXiv Detail & Related papers (2023-05-17T05:27:40Z)
- Towards Consistency in Adversarial Classification [17.91058673844592]
We study the problem of consistency in the context of adversarial examples.
We show that no convex surrogate loss can be consistent or calibrated in this context.
arXiv Detail & Related papers (2022-05-20T08:30:06Z)
- Constrained Classification and Policy Learning [0.0]
We study consistency of surrogate loss procedures under a constrained set of classifiers.
We show that hinge losses are the only surrogate losses that preserve consistency in second-best scenarios.
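The hinge loss singled out here has the standard form max(0, 1 - z) on the margin z = y f(x); a minimal sketch for intuition (illustrative, not code from the paper):

```python
def hinge_loss(margin):
    """Hinge loss on the margin z = y * f(x): max(0, 1 - z).
    A convex upper bound on the 0-1 loss 1{z <= 0}."""
    return max(0.0, 1.0 - margin)

print(hinge_loss(2.0))   # 0.0: confidently correct, no penalty
print(hinge_loss(0.5))   # 0.5: correct but inside the margin
print(hinge_loss(-1.0))  # 2.0: misclassified, penalty grows linearly
```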
arXiv Detail & Related papers (2021-06-24T10:43:00Z)
- Lower-bounded proper losses for weakly supervised classification [73.974163801142]
We discuss the problem of weakly supervised classification, in which instances are given weak labels.
We derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation.
We experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses.
arXiv Detail & Related papers (2021-03-04T08:47:07Z)
- A Symmetric Loss Perspective of Reliable Machine Learning [87.68601212686086]
We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC method can benefit natural language processing in the problem where we want to learn only from relevant keywords.
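A margin loss l is called symmetric when l(z) + l(-z) is constant in z; the sigmoid loss is a standard example of such a loss. The check below is an illustration of the symmetric condition, not code from the paper:

```python
import math

def sigmoid_loss(z):
    """Sigmoid loss on the margin z = y * f(x)."""
    return 1.0 / (1.0 + math.exp(z))

# Symmetric condition: loss(z) + loss(-z) is constant (here, 1).
# This is what underlies the robustness to label corruption.
for z in [-2.0, 0.0, 0.7, 3.5]:
    total = sigmoid_loss(z) + sigmoid_loss(-z)
    assert abs(total - 1.0) < 1e-12
print("sigmoid loss satisfies the symmetric condition (constant = 1)")
```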
arXiv Detail & Related papers (2021-01-05T06:25:47Z)
- On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective [83.19406301934245]
We first prove that the focal loss is classification-calibrated, i.e., its minimizer surely yields the Bayes-optimal classifier.
We then prove that the focal loss is not strictly proper, i.e., the confidence score of the classifier does not match the true class-posterior probability.
Our proposed transformation significantly improves the accuracy of class-posterior probability estimation.
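The focal loss discussed here is commonly written as -(1 - p)^gamma * log(p) on the true-class probability p, recovering cross-entropy at gamma = 0. A minimal sketch of that standard form (the specific values below are illustrative):

```python
import math

def focal_loss(p_true, gamma=2.0):
    """Focal loss -(1 - p)^gamma * log(p), where p is the predicted
    probability of the true class; gamma = 0 recovers cross-entropy."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

# Easy examples (p close to 1) are down-weighted relative to
# cross-entropy, while hard examples keep most of their loss.
print(focal_loss(0.9, gamma=0.0))  # cross-entropy on an easy example
print(focal_loss(0.9, gamma=2.0))  # same example, heavily down-weighted
print(focal_loss(0.1, gamma=2.0))  # hard example keeps a large loss
```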
arXiv Detail & Related papers (2020-11-18T09:36:52Z)
- Minimax Classification with 0-1 Loss and Performance Guarantees [4.812718493682455]
Supervised classification techniques use training samples to find classification rules with small expected 0-1 loss.
Conventional methods achieve efficient learning and out-of-sample generalization by minimizing surrogate losses over specific families of rules.
This paper presents minimax risk classifiers (MRCs) that do not rely on a choice of surrogate loss and family of rules.
arXiv Detail & Related papers (2020-10-15T18:11:28Z)
- Adversarial Classification via Distributional Robustness with Wasserstein Ambiguity [12.576828231302134]
Under Wasserstein ambiguity, the model aims to minimize the value-at-risk of misclassification.
We show that, despite the nonconvexity of this problem, standard descent methods appear to converge for it.
arXiv Detail & Related papers (2020-05-28T07:28:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.