On robust overfitting: adversarial training induced distribution matters
- URL: http://arxiv.org/abs/2311.16526v2
- Date: Sat, 10 Feb 2024 13:11:49 GMT
- Title: On robust overfitting: adversarial training induced distribution matters
- Authors: Runzhi Tian, Yongyi Mao
- Abstract summary: Adversarial training may be regarded as standard training with a modified loss function.
But its generalization error appears much larger than standard training under standard loss.
This phenomenon, known as robust overfitting, has attracted significant research attention and remains largely a mystery.
- Score: 32.501773057885735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training may be regarded as standard training with a modified
loss function. But its generalization error appears much larger than standard
training under standard loss. This phenomenon, known as robust overfitting, has
attracted significant research attention and remains largely a mystery. In
this paper, we first show empirically that robust overfitting correlates with
the increasing generalization difficulty of the perturbation-induced
distributions along the trajectory of adversarial training (specifically
PGD-based adversarial training). We then provide a novel upper bound for
generalization error with respect to the perturbation-induced distributions, in
which a notion of the perturbation operator, referred to as "local dispersion",
plays an important role. Experimental results are presented to validate the
usefulness of the bound and various additional insights are provided.
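The abstract refers to PGD-based adversarial training, i.e. standard training on inputs perturbed by projected gradient descent on the loss. A minimal sketch of the PGD perturbation step, here on logistic regression with NumPy rather than a deep network (all function names and hyperparameters are our own illustrative choices, not the paper's):

```python
import numpy as np

def loss_and_input_grad(w, x, y):
    """Logistic loss and its gradient with respect to the input x."""
    z = x @ w
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = ((p - y)[:, None] * w[None, :]) / len(y)
    return loss, grad_x

def pgd_perturb(w, x, y, eps=0.3, alpha=0.1, steps=10):
    """Find an L-inf-bounded perturbation of x that increases the loss."""
    x_adv = x.copy()
    for _ in range(steps):
        _, g = loss_and_input_grad(w, x_adv, y)
        x_adv = x_adv + alpha * np.sign(g)        # gradient-sign ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back to the eps-ball
    return x_adv

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 5))
w = rng.normal(size=5)
y = (x @ rng.normal(size=5) > 0).astype(float)

clean_loss, _ = loss_and_input_grad(w, x, y)
adv_loss, _ = loss_and_input_grad(w, pgd_perturb(w, x, y), y)
assert adv_loss >= clean_loss  # the PGD perturbation makes the loss at least as large
```

Adversarial training then minimizes the loss on `pgd_perturb(...)` outputs instead of the clean inputs; the perturbed training examples define the "perturbation-induced distributions" whose generalization difficulty the paper studies.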
Related papers
- Understanding Generalization in Transformers: Error Bounds and Training Dynamics Under Benign and Harmful Overfitting [36.149708427591534]
We develop a generalization theory for a two-layer transformer with label-flipping noise.
We present generalization error bounds for both benign and harmful overfitting under varying signal-to-noise ratios.
We conduct extensive experiments to identify key factors that influence test errors in transformers.
arXiv Detail & Related papers (2025-02-18T03:46:01Z)
- Enhancing Robust Fairness via Confusional Spectral Regularization [6.041034366572273]
We derive a robust generalization bound for the worst-class robust error within the PAC-Bayesian framework.
We propose a novel regularization technique to improve worst-class robust accuracy and enhance robust fairness.
arXiv Detail & Related papers (2025-01-22T23:32:19Z)
- Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing [55.791818510796645]
We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
arXiv Detail & Related papers (2024-10-08T12:26:48Z)
- Benign Overfitting in Adversarially Robust Linear Classification [91.42259226639837]
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering on the feature embedding space and identify pseudo-attributes using the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representation.
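The cluster-and-reweight recipe above can be sketched as follows; this is a minimal NumPy illustration under our own assumptions (a toy k-means, synthetic imbalanced features), and the paper's actual scheme may differ:

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans(feats, k=2, iters=20):
    """Tiny k-means on feature embeddings; returns cluster assignments."""
    centers = feats[[0, -1]].copy()  # one seed from each end for determinism
    for _ in range(iters):
        d = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = feats[assign == j].mean(axis=0)
    return assign

# Imbalanced pseudo-attributes: 90 samples near one mode, 10 near another.
feats = np.concatenate([rng.normal(0, 0.1, (90, 2)),
                        rng.normal(5, 0.1, (10, 2))])
assign = kmeans(feats)

# Reweight inversely to cluster size so minority pseudo-attributes count more.
counts = np.bincount(assign, minlength=2)
weights = (1.0 / counts)[assign]
weights /= weights.sum()

# Each cluster now contributes equal total weight to the training loss.
totals = np.array([weights[assign == j].sum() for j in range(2)])
assert np.allclose(totals, 0.5)
```

The resulting per-sample weights would then scale the loss during representation learning, upweighting samples from minority clusters.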
arXiv Detail & Related papers (2021-08-06T05:20:46Z)
- Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition [39.108491135488286]
We decompose the test risk into its bias and variance components.
We find that the bias increases monotonically with perturbation size and is the dominant term in the risk.
We show that popular explanations for the generalization gap instead predict the variance to be monotonic.
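The decomposition of test risk into bias and variance described above can be demonstrated numerically; a minimal sketch under our own assumptions (squared-error risk, polynomial regression on synthetic data, noiseless targets for comparison), not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)

def true_fn(x):
    return np.sin(x)

def fit_predict(x_test, n_train=30):
    """Train a degree-3 polynomial on a fresh noisy sample, predict x_test."""
    x = rng.uniform(-3, 3, n_train)
    y = true_fn(x) + rng.normal(scale=0.3, size=n_train)
    coeffs = np.polyfit(x, y, deg=3)
    return np.polyval(coeffs, x_test)

x_test = np.linspace(-3, 3, 50)
preds = np.stack([fit_predict(x_test) for _ in range(200)])

bias_sq = (preds.mean(axis=0) - true_fn(x_test)) ** 2
variance = preds.var(axis=0)
risk = ((preds - true_fn(x_test)) ** 2).mean(axis=0)

# Decomposition: E[(f_hat - f)^2] = bias^2 + variance (the irreducible noise
# term is excluded because we compare against the noiseless target here).
assert np.allclose(risk, bias_sq + variance, atol=1e-8)
```

In the adversarial setting the same decomposition is applied to the adversarial test risk, with the perturbation budget taking the role of the knob that moves bias and variance.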
arXiv Detail & Related papers (2021-03-17T23:30:00Z)
- In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z)
- The Role of Mutual Information in Variational Classifiers [47.10478919049443]
We study the generalization error of classifiers relying on encodings trained on the cross-entropy loss.
We derive bounds to the generalization error showing that there exists a regime where the generalization error is bounded by the mutual information.
arXiv Detail & Related papers (2020-10-22T12:27:57Z)
- Understanding and Mitigating the Tradeoff Between Robustness and Accuracy [88.51943635427709]
Adversarial training augments the training set with perturbations to improve the robust error.
We show that the standard error could increase even when the augmented perturbations have noiseless observations from the optimal linear predictor.
arXiv Detail & Related papers (2020-02-25T08:03:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.