The Consistency of Adversarial Training for Binary Classification
- URL: http://arxiv.org/abs/2206.09099v2
- Date: Wed, 17 May 2023 05:19:37 GMT
- Title: The Consistency of Adversarial Training for Binary Classification
- Authors: Natalie S. Frank, Jonathan Niles-Weed
- Abstract summary: Adversarial training involves minimizing a supremum-based surrogate risk.
We characterize which supremum-based surrogates are consistent for distributions absolutely continuous with respect to Lebesgue measure in binary classification.
- Score: 12.208787849155048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness to adversarial perturbations is of paramount concern in modern
machine learning. One of the state-of-the-art methods for training robust
classifiers is adversarial training, which involves minimizing a supremum-based
surrogate risk. The statistical consistency of surrogate risks is well
understood in the context of standard machine learning, but not in the
adversarial setting. In this paper, we characterize which supremum-based
surrogates are consistent for distributions absolutely continuous with respect
to Lebesgue measure in binary classification. Furthermore, we obtain
quantitative bounds relating adversarial surrogate risks to the adversarial
classification risk. Lastly, we discuss implications for the $\mathcal{H}$-consistency
of adversarial training.
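For concreteness, the two risks the abstract relates can be written as follows; the notation is a sketch in our own symbols (data $(x,y)\sim P$ with labels $y\in\{-1,+1\}$, perturbation radius $\epsilon$, margin surrogate $\phi$), not necessarily the paper's.

```latex
\begin{align*}
  R^\epsilon(f) &= \mathbb{E}_{(x,y)\sim P}\Big[\sup_{\|x'-x\|\le\epsilon}
      \mathbf{1}\{\operatorname{sign} f(x') \ne y\}\Big]
      && \text{(adversarial classification risk)} \\
  R^\epsilon_\phi(f) &= \mathbb{E}_{(x,y)\sim P}\Big[\sup_{\|x'-x\|\le\epsilon}
      \phi\big(y\,f(x')\big)\Big]
      && \text{(supremum-based surrogate risk)}
\end{align*}
```

Quantitative bounds of the kind mentioned above typically control the excess classification risk $R^\epsilon(f)-\inf_g R^\epsilon(g)$ in terms of the excess surrogate risk $R^\epsilon_\phi(f)-\inf_g R^\epsilon_\phi(g)$.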
Related papers
- Learning Fair Robustness via Domain Mixup [8.471466670802817]
We propose the use of mixup for the problem of learning fair robust classifiers.
We show that mixup combined with adversarial training can provably reduce the class-wise robustness disparity.
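A minimal sketch of how mixup might be combined with adversarial training, assuming a PyTorch-style model and an externally supplied attack routine such as PGD; the function and parameter names are ours, and applying mixup to the adversarial examples (rather than the clean inputs) is an assumption, not the paper's prescription.

```python
import torch
import torch.nn.functional as F

def mixup_adv_step(model, x, y, attack, alpha=1.0):
    """One hypothetical training step combining mixup with adversarial training.

    `attack` is any perturbation oracle, e.g. a PGD routine (assumed given).
    """
    x_adv = attack(model, x, y)  # adversarial examples for the current batch
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    # Convex combination of pairs of adversarial examples ...
    x_mix = lam * x_adv + (1 - lam) * x_adv[perm]
    logits = model(x_mix)
    # ... and the matching convex combination of their losses.
    loss = lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])
    return loss
```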
arXiv Detail & Related papers (2024-11-21T18:56:33Z)
- The Pitfalls and Promise of Conformal Inference Under Adversarial Attacks [90.52808174102157]
In safety-critical applications such as medical imaging and autonomous driving, it is imperative to maintain high adversarial robustness to protect against potential adversarial attacks.
A notable knowledge gap remains concerning the uncertainty inherent in adversarially trained models.
This study investigates the uncertainty of deep learning models by examining the performance of conformal prediction (CP) in the context of standard adversarial attacks.
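As an illustration of the conformal prediction machinery being evaluated, here is a minimal split-conformal sketch in NumPy; the score convention (one minus the true-class softmax probability) and all names are our assumptions, not the paper's setup.

```python
import numpy as np

def split_conformal_sets(cal_scores, test_scores, alpha=0.1):
    """Split conformal prediction: build prediction sets with ~(1-alpha) coverage.

    cal_scores:  (n_cal,) nonconformity scores of calibration points, e.g.
                 1 - softmax probability of the true class.
    test_scores: (n_test, n_classes) nonconformity score of every class
                 for each test point.
    """
    n = len(cal_scores)
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    q_hat = np.quantile(cal_scores, min(q_level, 1.0), method="higher")
    # A class enters the prediction set when its score is below the threshold.
    return test_scores <= q_hat
```

Under an adversarial attack the test-time score distribution drifts away from the calibration scores, which is exactly the kind of coverage degradation such a study probes.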
arXiv Detail & Related papers (2024-05-14T18:05:19Z)
- Adversarial Consistency and the Uniqueness of the Adversarial Bayes Classifier [0.0]
Minimizing an adversarial surrogate risk is a common technique for learning robust classifiers.
We show that under reasonable distributional assumptions, a convex surrogate loss is statistically consistent for adversarial learning iff the adversarial Bayes classifier satisfies a certain notion of uniqueness.
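One standard way to phrase the consistency notion at stake, in the notation introduced after the abstract above (a sketch, not the paper's exact definition): a surrogate $\phi$ is adversarially consistent for a distribution $P$ when every sequence driving the surrogate risk to its infimum also drives the classification risk to its infimum.

```latex
R^\epsilon_\phi(f_n) \to \inf_f R^\epsilon_\phi(f)
\quad \Longrightarrow \quad
R^\epsilon(f_n) \to \inf_f R^\epsilon(f).
```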
arXiv Detail & Related papers (2024-04-26T12:16:08Z)
- Generalization Properties of Adversarial Training for $\ell_0$-Bounded Adversarial Attacks [47.22918498465056]
In this paper, we aim to theoretically characterize the performance of adversarial training for an important class of neural networks.
Deriving a generalization bound in this setting poses two main challenges.
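For intuition about the threat model, an $\ell_0$-bounded attack may change at most $k$ coordinates of an input. A minimal sketch of the corresponding projection step (ours, not the paper's algorithm):

```python
import torch

def project_l0(delta, k):
    """Euclidean projection of a batch of perturbations onto the l0 ball
    {d : ||d||_0 <= k}: keep each row's k largest-magnitude coordinates."""
    flat = delta.flatten(1)
    idx = flat.abs().topk(k, dim=1).indices          # top-k coordinates per row
    mask = torch.zeros_like(flat).scatter_(1, idx, 1.0)
    return (flat * mask).view_as(delta)
```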
arXiv Detail & Related papers (2024-02-05T22:57:33Z)
- Non-Asymptotic Bounds for Adversarial Excess Risk under Misspecified Models [9.65010022854885]
We show that adversarial risk is equivalent to the risk induced by a distributional adversarial attack under certain smoothness conditions.
To evaluate the generalization performance of the adversarial estimator, we study the adversarial excess risk.
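In the usual convention, and presumably the one intended here, the adversarial excess risk of an estimator $\hat f$ measures its gap to the best attainable adversarial risk; in our notation:

```latex
\mathcal{E}^\epsilon(\hat f) \;=\; R^\epsilon(\hat f) \;-\; \inf_{f} R^\epsilon(f).
```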
arXiv Detail & Related papers (2023-09-02T00:51:19Z)
- Adversarial Training Should Be Cast as a Non-Zero-Sum Game [121.95628660889628]
The two-player zero-sum paradigm of adversarial training has not engendered sufficient levels of robustness.
We show that the surrogate-based relaxation commonly used in adversarial training algorithms voids all guarantees on robustness.
A novel non-zero-sum bilevel formulation of adversarial training yields a framework that matches and in some cases outperforms state-of-the-art attacks.
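Schematically, a non-zero-sum bilevel formulation of this kind decouples the attacker's objective from the defender's. The following display is our paraphrase of that idea, not the paper's exact objective:

```latex
\min_{\theta}\ \mathbb{E}_{(x,y)\sim P}\Big[\ell\big(f_\theta(x^\ast), y\big)\Big]
\quad \text{s.t.} \quad
x^\ast \in \operatorname*{arg\,max}_{\|x'-x\|\le\epsilon}
  \mathbf{1}\{\operatorname{sign} f_\theta(x') \ne y\},
```

so the inner player targets the classification error itself rather than the defender's surrogate $\ell$.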
arXiv Detail & Related papers (2023-06-19T16:00:48Z)
- The Adversarial Consistency of Surrogate Risks for Binary Classification [20.03511985572199]
Adversarial training seeks to minimize the expected $0$-$1$ loss when each example can be maliciously corrupted within a small ball.
We give a simple and complete characterization of the set of surrogate loss functions that are consistent.
Our results reveal that the class of adversarially consistent surrogates is substantially smaller than in the standard setting.
arXiv Detail & Related papers (2023-05-17T05:27:40Z)
- Existence and Minimax Theorems for Adversarial Surrogate Risks in Binary Classification [16.626667055542086]
Adversarial training is one of the most popular methods for training models robust to adversarial attacks.
We prove existence, regularity, and minimax theorems for adversarial surrogate risks.
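To give the flavor of such a minimax theorem (a sketch in our notation; the paper's statement is more precise), minimizing the adversarial surrogate risk can be dualized against a supremum over data distributions within $\infty$-Wasserstein distance $\epsilon$ of $P$:

```latex
\inf_f R^\epsilon_\phi(f)
\;=\;
\sup_{\tilde P:\ W_\infty(\tilde P,\, P)\le\epsilon}\ \inf_f\
\mathbb{E}_{(x,y)\sim\tilde P}\big[\phi\big(y\,f(x)\big)\big].
```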
arXiv Detail & Related papers (2022-06-18T03:29:49Z)
- Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
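A minimal sketch of an instance-level attack in this spirit, assuming a PyTorch encoder and two augmented views of each image; the hyperparameters, names, and cosine-similarity loss are our assumptions, and encoder gradients accumulated here should be zeroed before the training step.

```python
import torch
import torch.nn.functional as F

def instance_attack(encoder, x1, x2, eps=8/255, steps=5, step_size=2/255):
    """Hypothetical instance-wise attack: perturb view x1 so that it stops
    matching its positive pair x2 under a cosine-similarity contrastive loss."""
    with torch.no_grad():
        z2 = F.normalize(encoder(x2), dim=1)          # fixed positive targets
    delta = torch.zeros_like(x1, requires_grad=True)
    labels = torch.arange(x1.size(0), device=x1.device)
    for _ in range(steps):
        z1 = F.normalize(encoder(x1 + delta), dim=1)
        # Each sample's positive is its own pair; other samples act as negatives.
        loss = F.cross_entropy(z1 @ z2.t(), labels)
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()    # ascend the loss
            delta.clamp_(-eps, eps)                   # stay in the l_inf ball
            delta.grad.zero_()
    return (x1 + delta).detach()
```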
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
- Towards Robust Fine-grained Recognition by Maximal Separation of Discriminative Features [72.72840552588134]
We identify the proximity of the latent representations of different classes in fine-grained recognition networks as a key factor in the success of adversarial attacks.
We introduce an attention-based regularization mechanism that maximally separates the discriminative latent features of different classes.
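The mechanism can be imitated with a simple penalty that pushes class-conditional feature means apart; the sketch below is our stand-in, not the paper's attention-based regularizer.

```python
import torch
import torch.nn.functional as F

def separation_penalty(features, labels):
    """Stand-in regularizer: penalize cosine similarity between class-mean
    latent features so that different classes are pushed apart."""
    classes = labels.unique()
    means = torch.stack([features[labels == c].mean(0) for c in classes])
    means = F.normalize(means, dim=1)
    sim = means @ means.t()
    # Drop self-similarity on the diagonal; penalize cross-class similarity.
    off_diag = sim - torch.eye(len(classes), device=sim.device)
    return off_diag.clamp(min=0).sum() / max(len(classes) * (len(classes) - 1), 1)
```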
arXiv Detail & Related papers (2020-06-10T18:34:45Z)