Related papers: Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training

URL: http://arxiv.org/abs/2306.01271v3
Date: Fri, 11 Oct 2024 04:21:59 GMT
Title: Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training
Authors: Binghui Li, Yuanzhi Li
Abstract summary: We study the $\textit{Clean Generalization and Robust Overfitting}$ phenomenon in adversarial training. We show that a three-stage phase transition occurs during the learning process and the network converges to a robust memorization regime. We also empirically verify our theoretical analysis by experiments in real-image recognition.
Score: 38.44734564565478
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Similar to the surprising performance of standard deep learning, deep nets trained by adversarial training also generalize well on $\textit{unseen clean data (natural data)}$. However, although adversarial training can achieve low robust training error, there exists a significant $\textit{robust generalization gap}$. We call this phenomenon $\textit{Clean Generalization and Robust Overfitting (CGRO)}$. In this work, we study the CGRO phenomenon in adversarial training from two views: $\textit{representation complexity}$ and $\textit{training dynamics}$. Specifically, we consider a binary classification setting with $N$ separated training data points. $\textit{First}$, we prove that, assuming there is a $\operatorname{poly}(D)$-size clean classifier (where $D$ is the data dimension), a ReLU net with only $O(N D)$ extra parameters is able to leverage robust memorization to achieve CGRO, while a robust classifier still requires exponential representation complexity in the worst case. $\textit{Next}$, we focus on a structured-data case to analyze training dynamics, where we train a two-layer convolutional network with $O(N D)$ width against adversarial perturbation. We then show that a three-stage phase transition occurs during the learning process and the network provably converges to a robust memorization regime, which thereby results in CGRO. $\textit{Besides}$, we also empirically verify our theoretical analysis by experiments on real-image recognition datasets.

Related papers

Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training [8.824077990271503]
We investigate the role of the training dynamics in the transition from generalization to memorization. We find that $\tau_\mathrm{mem}$ increases linearly with the training set size $n$, while $\tau_\mathrm{gen}$ remains constant. It is only when $n$ becomes larger than a model-dependent threshold that overfitting disappears at infinite training times.
arXiv Detail & Related papers (2025-05-23T08:58:47Z)
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory. We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z)
IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift. IT$^3$ is based on the universal property of idempotence. We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training [42.89066583603415]
This work identifies three critical $\textit{O}$bstacles: ($\textit{O}$1) lack of comprehensive evaluation, ($\textit{O}$2) untested viability for scaling, and ($\textit{O}$3) lack of empirical guidelines. We show that a depthwise stacking operator, called $G_{\text{stack}}$, exhibits remarkable acceleration in training, leading to decreased loss and improved overall performance.
arXiv Detail & Related papers (2024-05-24T08:00:00Z)
Maximal Initial Learning Rates in Deep ReLU Networks [32.157430904535126]
We introduce the maximal initial learning rate $\eta^{\ast}$. We observe that in constant-width fully-connected ReLU networks, $\eta^{\ast}$ behaves differently from the maximum learning rate later in training.
arXiv Detail & Related papers (2022-12-14T15:58:37Z)
Supervised Contrastive Prototype Learning: Augmentation Free Robust Neural Network [17.10753224600936]
Transformations in the input space of Deep Neural Networks (DNN) lead to unintended changes in the feature space. We propose a training framework, $\textbf{Supervised Contrastive Prototype Learning}$ (SCPL). We use N-pair contrastive loss with prototypes of the same and opposite classes and replace a categorical classification head with a $\textbf{Prototype Classification Head}$ (PCH). Our approach is $\textit{sample efficient}$, does not require $\textit{sample mining}$, can be implemented on any existing DNN without modification to their
arXiv Detail & Related papers (2022-11-26T01:17:15Z)
Characterizing Datapoints via Second-Split Forgetting [93.99363547536392]
We propose $\textit{second-split forgetting time}$ (SSFT), a complementary metric that tracks the epoch (if any) after which an original training example is forgotten. We demonstrate that $\textit{mislabeled}$ examples are forgotten quickly, and seemingly $\textit{rare}$ examples are forgotten comparatively slowly. SSFT can (i) help to identify mislabeled samples, the removal of which improves generalization; and (ii) provide insights about failure modes.
arXiv Detail & Related papers (2022-10-26T21:03:46Z)
Training $\beta$-VAE by Aggregating a Learned Gaussian Posterior with a Decoupled Decoder [0.553073476964056]
Current practices in VAE training often result in a trade-off between the reconstruction fidelity and the continuity/disentanglement of the latent space. We present intuitions and a careful analysis of the antagonistic mechanism of the two losses, and propose a simple yet effective two-stage method for training a VAE. We evaluate the method using a medical dataset intended for 3D skull reconstruction and shape completion, and the results indicate promising generative capabilities of the VAE trained using the proposed method.
arXiv Detail & Related papers (2022-09-29T13:49:57Z)
Explicit Tradeoffs between Adversarial and Natural Distributional Robustness [48.44639585732391]
In practice, models need to enjoy both types of robustness to ensure reliability. In this work, we show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness.
arXiv Detail & Related papers (2022-09-15T19:58:01Z)
Blessing of Class Diversity in Pre-training [54.335530406959435]
We prove that when the classes of the pre-training task are sufficiently diverse, pre-training can significantly improve the sample efficiency of downstream tasks. Our proof relies on a vector-form Rademacher complexity chain rule for composite function classes and a modified self-concordance condition.
arXiv Detail & Related papers (2022-09-07T20:10:12Z)
Sparsity Winning Twice: Better Robust Generalization from More Efficient Training [94.92954973680914]
We introduce two alternatives for sparse adversarial training: (i) static sparsity and (ii) dynamic sparsity. We find that both methods yield a win-win: substantially shrinking the robust generalization gap and alleviating robust overfitting. Our approaches can be combined with existing regularizers, establishing new state-of-the-art results in adversarial training.
arXiv Detail & Related papers (2022-02-20T15:52:08Z)
Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation. In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN. Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss. We examine how these benign overfitting phenomena occur in a two-layer neural network setting. We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
Provable Robustness of Adversarial Training for Learning Halfspaces with Noise [95.84614821570283]
We analyze the properties of adversarial training for learning adversarially robust halfspaces in the presence of label noise. To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.
arXiv Detail & Related papers (2021-04-19T16:35:38Z)
Towards Deep Learning Models Resistant to Large Perturbations [0.0]
Adversarial robustness has proven to be a required property of machine learning algorithms. We show that the well-established algorithm called "adversarial training" fails to train a deep neural network given a large, but reasonable, perturbation magnitude.
arXiv Detail & Related papers (2020-03-30T12:03:09Z)
Overfitting in adversarially robust deep learning [86.11788847990783]
We show that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training. We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting.
arXiv Detail & Related papers (2020-02-26T15:40:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.