Asymptotic Behavior of Adversarial Training in Binary Classification
- URL: http://arxiv.org/abs/2010.13275v3
- Date: Wed, 14 Jul 2021 01:20:45 GMT
- Title: Asymptotic Behavior of Adversarial Training in Binary Classification
- Authors: Hossein Taheri, Ramtin Pedarsani, and Christos Thrampoulidis
- Abstract summary: Adversarial training is considered to be the state-of-the-art method for defense against adversarial attacks.
Despite being successful in practice, several problems in understanding the generalization performance of adversarial training remain open.
We derive precise theoretical predictions for the performance of adversarial training in binary classification.
- Score: 41.7567932118769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been consistently reported that many machine learning models are
susceptible to adversarial attacks, i.e., small additive adversarial
perturbations applied to data points can cause misclassification. Adversarial
training using empirical risk minimization is considered to be the
state-of-the-art method for defense against adversarial attacks. Despite being
successful in practice, several problems in understanding generalization
performance of adversarial training remain open. In this paper, we derive
precise theoretical predictions for the performance of adversarial training in
binary classification. We consider the high-dimensional regime where the
dimension of data grows with the size of the training data-set at a constant
ratio. Our results provide exact asymptotics for standard and adversarial test
errors of the estimators obtained by adversarial training with $\ell_q$-norm
bounded perturbations ($q \ge 1$) for both discriminative binary models and
generative Gaussian-mixture models with correlated features. Furthermore, we
use these sharp predictions to uncover several intriguing observations on the
role of various parameters including the over-parameterization ratio, the data
model, and the attack budget on the adversarial and standard errors.
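To make the setup concrete, the following is a schematic of the adversarially robust empirical risk minimization the abstract describes, together with the standard dual-norm reduction that makes the linear case tractable; the notation (loss $\mathcal{L}$, budget $\varepsilon$) is ours, not necessarily the paper's.

```latex
% Adversarial ERM for a linear binary classifier (schematic, our notation).
% Training pairs (x_i, y_i), y_i in {+1, -1}; attack budget eps; q >= 1.
\[
  \hat{\theta} \in \arg\min_{\theta}\;
  \frac{1}{n}\sum_{i=1}^{n}\;
  \max_{\|\delta_i\|_q \le \varepsilon}\,
  \mathcal{L}\bigl(y_i\,\langle x_i + \delta_i,\, \theta\rangle\bigr)
\]
% For a non-increasing loss L and a linear model, Hölder's inequality gives
% the inner maximum in closed form, with p the dual exponent (1/p + 1/q = 1):
\[
  \max_{\|\delta\|_q \le \varepsilon}\,
  \mathcal{L}\bigl(y\,\langle x + \delta,\, \theta\rangle\bigr)
  \;=\;
  \mathcal{L}\bigl(y\,\langle x,\, \theta\rangle - \varepsilon\,\|\theta\|_p\bigr).
\]
```

This reduction turns the minimax problem into an ordinary minimization, which is one reason exact high-dimensional analyses of such estimators are possible.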
Related papers
- The Surprising Harmfulness of Benign Overfitting for Adversarial
Robustness [13.120373493503772]
We prove a surprising result: even if the ground truth itself is robust to adversarial examples and the benignly overfitted model is benign in terms of the "standard" out-of-sample risk objective, the benign overfitting process can be harmful when the out-of-sample data are subject to adversarial manipulation.
Our finding provides theoretical insight into a puzzling phenomenon observed in practice, where the true target function (e.g., a human) is robust against adversarial attacks, while benignly overfitted neural networks are not.
arXiv Detail & Related papers (2024-01-19T15:40:46Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against adversarial samples crafted by minimally perturbing a clean input.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance-based and sensitivity-based attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - The curse of overparametrization in adversarial training: Precise
analysis of robust generalization for random features regression [34.35440701530876]
We show that for adversarially trained random features models, high overparametrization can hurt robust generalization.
Our theory reveals this nontrivial effect of overparametrization on robustness.
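For reference, a random features model in this setting can be sketched as follows; the scaling and notation are ours, not necessarily the paper's.

```latex
% Random features regression, schematically: the feature matrix W is drawn
% at random and frozen; only the second-layer weights a are trained.
\[
  f(x) \;=\; \sum_{j=1}^{N} a_j\, \sigma\!\bigl(\langle w_j,\, x\rangle\bigr),
  \qquad
  \text{overparametrization ratio } \psi = N/n,
\]
% where sigma is a fixed nonlinearity, N is the number of random features,
% and n is the number of training samples.
```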
arXiv Detail & Related papers (2022-01-13T18:57:30Z) - Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z) - Fundamental Tradeoffs in Distributionally Adversarial Training [21.6024500220438]
Adversarial training is one of the most effective techniques to improve the robustness of models against adversarial perturbations.
In this paper, we study the tradeoff between standard risk and adversarial risk.
We show that a tradeoff between standard and adversarial risk is manifested in all three settings considered in the paper.
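One common way to formalize the two quantities being traded off is the following (our notation, not necessarily the paper's):

```latex
% Standard risk vs. adversarial (robust) risk of a classifier f_theta,
% with data (x, y) ~ D and perturbations of norm at most eps.
\[
  \mathrm{SR}(\theta) \;=\; \mathbb{P}_{(x,y)\sim\mathcal{D}}
    \bigl[\, y \ne f_\theta(x) \,\bigr],
  \qquad
  \mathrm{AR}(\theta) \;=\; \mathbb{P}_{(x,y)\sim\mathcal{D}}
    \bigl[\, \exists\, \delta,\ \|\delta\| \le \varepsilon :\; y \ne f_\theta(x+\delta) \,\bigr].
\]
% AR >= SR by construction; the tradeoff is that driving AR down typically
% forces SR above its standard-optimal value, and vice versa.
```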
arXiv Detail & Related papers (2021-01-15T21:59:18Z) - Precise Statistical Analysis of Classification Accuracies for
Adversarial Training [43.25761725062367]
A variety of recent adversarial training procedures have been proposed to remedy the vulnerability of models to adversarial perturbations.
We derive a precise characterization of the standard and robust accuracy for a class of minimax adversarially trained models.
arXiv Detail & Related papers (2020-10-21T18:00:53Z) - On the Generalization Properties of Adversarial Training [21.79888306754263]
This paper studies the generalization performance of a generic adversarial training algorithm.
A series of numerical studies is conducted to demonstrate how smoothness and $\ell_1$ penalization help improve the adversarial robustness of models.
arXiv Detail & Related papers (2020-08-15T02:32:09Z) - Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
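To illustrate the kind of minimax training such analyses cover, here is a minimal numpy sketch of adversarial training for linear regression with $\ell_2$-bounded perturbations. It uses the standard closed form of the inner maximization, $\max_{\|\delta\|_2 \le \varepsilon} (y - \langle\theta, x+\delta\rangle)^2 = (|y - \langle\theta, x\rangle| + \varepsilon\|\theta\|_2)^2$; the function names and hyperparameters are ours, not the paper's.

```python
import numpy as np

def robust_loss_and_grad(theta, X, y, eps):
    """Adversarial (robust) squared loss for linear regression with
    l2-bounded perturbations, via the closed-form inner maximum:
    max_{||d||_2 <= eps} (y - <theta, x + d>)^2
        = (|y - <theta, x>| + eps * ||theta||_2)^2.
    Returns the mean robust loss and its gradient w.r.t. theta."""
    r = y - X @ theta                         # residuals, shape (n,)
    norm = np.linalg.norm(theta) + 1e-12      # avoid division by zero
    m = np.abs(r) + eps * norm                # worst-case absolute residual
    loss = np.mean(m ** 2)
    # d/dtheta (|r| + eps*||theta||)^2
    #   = 2*(|r| + eps*||theta||) * (-sign(r)*x + eps*theta/||theta||)
    grad = 2 * np.mean(
        m[:, None] * (-np.sign(r)[:, None] * X + eps * theta / norm), axis=0
    )
    return loss, grad

def adversarial_train(X, y, eps, lr=0.1, steps=500):
    """Plain gradient descent on the robust loss (a sketch, not the paper's algorithm)."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        _, grad = robust_loss_and_grad(theta, X, y, eps)
        theta -= lr * grad
    return theta
```

Because the inner maximum is available in closed form, no inner attack loop (e.g., PGD) is needed here; for nonlinear models the inner problem must instead be approximated.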
arXiv Detail & Related papers (2020-02-24T19:01:47Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
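Schematically, and in our notation rather than the paper's, ADT replaces the single worst-case perturbation of standard adversarial training with a worst-case distribution over perturbations:

```latex
% Standard adversarial training (AT): one worst-case perturbation per example.
\[
  \min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \Bigl[\; \max_{\|\delta\| \le \varepsilon}\,
      \mathcal{L}\bigl(f_\theta(x+\delta),\, y\bigr) \Bigr]
\]
% ADT, sketched: an adversarial distribution p over the perturbation ball.
\[
  \min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \Bigl[\; \max_{p \in \mathcal{P}}\;
      \mathbb{E}_{\delta \sim p}\,
      \mathcal{L}\bigl(f_\theta(x+\delta),\, y\bigr) \Bigr]
\]
```

Covering a distribution of perturbations, rather than a single attack, is what the abstract credits for robustness that is less tied to any one specific attack.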
arXiv Detail & Related papers (2020-02-14T12:36:59Z) - Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial
Perturbations [65.05561023880351]
Adversarial examples are malicious inputs crafted to induce misclassification.
This paper studies a complementary failure mode, invariance-based adversarial examples.
We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks.
arXiv Detail & Related papers (2020-02-11T18:50:23Z)