Benign Overfitting in Adversarially Robust Linear Classification
- URL: http://arxiv.org/abs/2112.15250v1
- Date: Fri, 31 Dec 2021 00:27:31 GMT
- Title: Benign Overfitting in Adversarially Robust Linear Classification
- Authors: Jinghui Chen and Yuan Cao and Quanquan Gu
- Abstract summary: "Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
- Score: 91.42259226639837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: "Benign overfitting", where classifiers memorize noisy training data yet
still achieve a good generalization performance, has drawn great attention in
the machine learning community. To explain this surprising phenomenon, a series
of works have provided theoretical justification in over-parameterized linear
regression, classification, and kernel methods. However, it is not clear if
benign overfitting still occurs in the presence of adversarial examples, i.e.,
examples with tiny and intentional perturbations to fool the classifiers. In
this paper, we show that benign overfitting indeed occurs in adversarial
training, a principled approach to defend against adversarial examples. In
detail, we prove the risk bounds of the adversarially trained linear classifier
on the mixture of sub-Gaussian data under $\ell_p$ adversarial perturbations.
Our result suggests that under moderate perturbations, adversarially trained
linear classifiers can achieve the near-optimal standard and adversarial risks,
despite overfitting the noisy training data. Numerical experiments validate our
theoretical findings.
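Because the worst-case $\ell_p$ perturbation of a linear classifier has a closed form (it costs $\varepsilon$ times the dual norm of the weight vector in margin), the setting described in the abstract is straightforward to simulate. The following is a minimal sketch, not the authors' code, assuming $\ell_\infty$ perturbations, logistic loss, and a two-component Gaussian mixture with randomly flipped labels; every constant in it is an illustrative choice.

```python
# A minimal sketch, NOT the authors' code: adversarial training of a linear
# classifier on a two-component Gaussian mixture with randomly flipped labels,
# under l_inf perturbations of radius eps. For a linear model <w, x>, the
# worst-case perturbation has a closed form, so the robust logistic loss is
#   log(1 + exp(-(y * <w, x> - eps * ||w||_1))).
# All constants (d, n, eps, noise_rate, step size) are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

d, n, eps, noise_rate = 500, 100, 0.05, 0.1      # over-parameterized: d >> n
mu = np.zeros(d)
mu[0] = 5.0                                      # mean direction of the mixture

def sample(m):
    y = rng.choice([-1.0, 1.0], size=m)
    x = y[:, None] * mu + rng.normal(size=(m, d))
    flip = rng.random(m) < noise_rate            # label noise to be memorized
    return x, np.where(flip, -y, y)

x_train, y_train = sample(n)
x_test, y_test = sample(10_000)

# (Sub)gradient descent on the closed-form adversarial logistic loss.
w = np.zeros(d)
lr = 0.1
for _ in range(5_000):
    margins = y_train * (x_train @ w) - eps * np.linalg.norm(w, 1)
    s = 0.5 * (1.0 - np.tanh(margins / 2.0))     # sigmoid(-margin), numerically stable
    grad = -(s * y_train) @ x_train / n + eps * s.mean() * np.sign(w)
    w -= lr * grad

def standard_acc(x, y):
    return np.mean(y * (x @ w) > 0)

def robust_acc(x, y):                            # accuracy under the worst-case l_inf attack
    return np.mean(y * (x @ w) - eps * np.linalg.norm(w, 1) > 0)

print("train accuracy (standard):", standard_acc(x_train, y_train))
print("test  accuracy (standard):", standard_acc(x_test, y_test))
print("test  accuracy (robust)  :", robust_acc(x_test, y_test))
```

Whether the trained classifier fully interpolates the noisy labels depends on the constants chosen; the sketch is only meant to make the adversarial training objective and the standard versus robust accuracy comparison concrete.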
Related papers
- Generalization Properties of Adversarial Training for $\ell_0$-Bounded
Adversarial Attacks [47.22918498465056]
In this paper, we aim to theoretically characterize the performance of adversarial training for an important class of neural networks.
Deriving a generalization bound in this setting has two main challenges.
arXiv Detail & Related papers (2024-02-05T22:57:33Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple generic and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z) - Understanding Noise-Augmented Training for Randomized Smoothing [14.061680807550722]
Randomized smoothing is a technique for providing provable robustness guarantees against adversarial attacks.
We show that, without making stronger distributional assumptions, no benefit can be expected from predictors trained with noise-augmentation.
Our analysis has direct implications for the practical deployment of randomized smoothing (a minimal sketch of the smoothed prediction appears after this list).
arXiv Detail & Related papers (2023-05-08T14:46:34Z) - Classification and Adversarial examples in an Overparameterized Linear
Model: A Signal Processing Perspective [10.515544361834241]
State-of-the-art deep learning classifiers are highly susceptible to infinitesimal adversarial perturbations.
We find that the learned model is susceptible to adversaries in an intermediate regime where classification generalizes but regression does not.
Despite the adversarial susceptibility, we find that classification with these features can be easier than the more commonly studied "independent feature" models.
arXiv Detail & Related papers (2021-09-27T17:35:42Z) - RATT: Leveraging Unlabeled Data to Guarantee Generalization [96.08979093738024]
We introduce a method that leverages unlabeled data to produce generalization bounds.
We prove that our bound is valid for 0-1 empirical risk minimization.
This work provides practitioners with an option for certifying the generalization of deep nets even when unseen labeled data is unavailable.
arXiv Detail & Related papers (2021-05-01T17:05:29Z) - Risk Bounds for Over-parameterized Maximum Margin Classification on
Sub-Gaussian Mixtures [100.55816326422773]
We study the benign overfitting phenomenon of the maximum margin classifier for linear classification problems.
Our results precisely characterize the condition under which benign overfitting can occur.
arXiv Detail & Related papers (2021-04-28T08:25:16Z) - Asymptotic Behavior of Adversarial Training in Binary Classification [41.7567932118769]
Adversarial training is considered to be the state-of-the-art method for defense against adversarial attacks.
Despite being successful in practice, several problems in understanding the performance of adversarial training remain open.
We derive precise theoretical predictions for the minimizers of adversarial training in binary classification.
arXiv Detail & Related papers (2020-10-26T01:44:20Z) - ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
By applying the adversarial training objective to both a classifier and a rejection function simultaneously, the classifier can abstain from classifying a test data point when its confidence is insufficient (a simplified illustration appears at the end of this list).
arXiv Detail & Related papers (2020-10-24T14:05:03Z)
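For the randomized-smoothing entry above ("Understanding Noise-Augmented Training for Randomized Smoothing"), here is a minimal sketch of how a smoothed prediction is formed by majority vote under Gaussian noise. It is not code from that paper: the base classifier is a toy linear rule standing in for a noise-augmented network, and the statistical certification step is omitted.

```python
# A minimal sketch of a Gaussian-smoothed classifier g(x) = argmax_c P(f(x + delta) = c),
# delta ~ N(0, sigma^2 I), estimated by Monte-Carlo majority vote. Certification
# (confidence intervals, certified radius) is deliberately left out.
import numpy as np

rng = np.random.default_rng(0)

def smoothed_predict(base_predict, x, sigma=0.25, n_samples=1000):
    """Majority vote of the base classifier over Gaussian perturbations of x."""
    noise = rng.normal(scale=sigma, size=(n_samples, x.shape[0]))
    votes = base_predict(x[None, :] + noise)      # labels for each noisy copy
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Toy base classifier: a fixed linear decision rule (stands in for a network
# that would normally be trained with noise augmentation).
w = rng.normal(size=20)
base_predict = lambda X: np.sign(X @ w)

x = rng.normal(size=20)
print("base prediction    :", base_predict(x[None, :])[0])
print("smoothed prediction:", smoothed_predict(base_predict, x))
```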
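The rejection mechanism described in the ATRO entry can be illustrated in a heavily simplified form for the linear setting of the main paper: abstain exactly when an $\ell_\infty$ adversary of radius $\varepsilon$ could flip the sign of the score. This is a hypothetical confidence-based rule written for illustration, not ATRO's joint training of a classifier and a rejection function.

```python
# A simplified illustration inspired by the ATRO entry, NOT the ATRO algorithm
# itself (ATRO trains a classifier and a rejection function jointly with an
# adversarial surrogate loss). A linear classifier abstains whenever an l_inf
# adversary of radius eps could flip its prediction, which for a linear model
# happens exactly when |<w, x>| <= eps * ||w||_1.
import numpy as np

def predict_with_rejection(w, X, eps):
    """Return +1/-1 predictions, or 0 (abstain) when the robust margin is not positive."""
    scores = X @ w
    robust_margin = np.abs(scores) - eps * np.linalg.norm(w, 1)
    preds = np.sign(scores)
    preds[robust_margin <= 0] = 0                 # abstain on points the attacker can flip
    return preds

rng = np.random.default_rng(1)
w = rng.normal(size=50)
X = rng.normal(size=(5, 50))
print(predict_with_rejection(w, X, eps=0.05))
```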