Efficient local linearity regularization to overcome catastrophic
overfitting
- URL: http://arxiv.org/abs/2401.11618v2
- Date: Wed, 28 Feb 2024 16:37:00 GMT
- Title: Efficient local linearity regularization to overcome catastrophic
overfitting
- Authors: Elias Abad Rocamora, Fanghui Liu, Grigorios G. Chrysos, Pablo M.
Olmos, Volkan Cevher
- Abstract summary: Catastrophic overfitting (CO) in single-step adversarial training results in abrupt drops in the adversarial test accuracy (even down to 0%).
We introduce a regularization term, called ELLE, to mitigate CO effectively and efficiently in classical AT evaluations.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Catastrophic overfitting (CO) in single-step adversarial training (AT)
results in abrupt drops in the adversarial test accuracy (even down to 0%). For
models trained with multi-step AT, the loss function has been observed to behave
locally linearly with respect to the input; this local linearity is lost in
single-step AT. To address CO in single-step AT, several methods have been
proposed that enforce local linearity of the loss via regularization. However,
these regularization terms considerably slow down training because they require
Double Backpropagation. Instead, in this work we introduce a regularization
term, called ELLE, that mitigates CO effectively and efficiently in classical
AT evaluations, as well as in more difficult regimes, e.g., large adversarial
perturbations and long training schedules. Our regularization term can be
theoretically linked to the curvature of the loss function and is
computationally cheaper than previous methods because it avoids Double
Backpropagation. Our thorough experimental validation demonstrates that our
method does not suffer from CO, even in challenging settings where previous
works do. We also observe that adapting our regularization parameter during
training (ELLE-A) greatly improves performance, especially in large $\epsilon$
setups. Our implementation is available at https://github.com/LIONS-EPFL/ELLE .
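To make the mechanism concrete, here is a minimal PyTorch-style sketch of a local-linearity penalty of this kind, assuming image batches and a cross-entropy loss; variable names such as `lambda_reg` are placeholders, and this is an illustration rather than the authors' exact implementation. The key point is that it needs only forward passes plus one ordinary backward pass, so Double Backpropagation is avoided.

```python
import torch
import torch.nn.functional as F

def local_linearity_penalty(model, x, x_adv, y):
    """Measure how far the loss deviates from linearity on the segment
    between a clean batch x and its adversarial counterpart x_adv: if the
    loss were locally linear, its value at a random convex combination of
    the two points would equal the same convex combination of the endpoint
    losses. The squared gap therefore acts as a cheap curvature probe."""
    shape = (x.size(0),) + (1,) * (x.dim() - 1)
    alpha = torch.rand(shape, device=x.device)   # one coefficient per sample
    x_mid = alpha * x + (1 - alpha) * x_adv      # random point on the segment

    loss_clean = F.cross_entropy(model(x), y, reduction="none")
    loss_adv = F.cross_entropy(model(x_adv), y, reduction="none")
    loss_mid = F.cross_entropy(model(x_mid), y, reduction="none")

    a = alpha.view(-1)                           # per-sample coefficients
    return ((loss_mid - (a * loss_clean + (1 - a) * loss_adv)) ** 2).mean()

# Single-step AT objective with the penalty (lambda_reg is a placeholder):
# loss = F.cross_entropy(model(x_adv), y) \
#        + lambda_reg * local_linearity_penalty(model, x, x_adv, y)
```

Because the penalty vanishes exactly when the loss is linear on the segment, its magnitude tracks the curvature of the loss, matching the abstract's claim about the theoretical link.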
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- RTRA: Rapid Training of Regularization-based Approaches in Continual Learning
In regularization-based approaches to catastrophic forgetting (CF), changes to important training parameters are penalized in subsequent tasks using an appropriate loss function.
We propose RTRA, a modification of the widely used Elastic Weight Consolidation (EWC) scheme that uses the Natural Gradient for loss function optimization (EWC's quadratic penalty is illustrated in the first sketch after this list).
Our approach improves the training of regularization-based methods without sacrificing test-data performance.
arXiv Detail & Related papers (2023-12-14T21:51:06Z)
- Test-Time Training for Semantic Segmentation with Output Contrastive Loss
Deep learning-based segmentation models have achieved impressive performance on public benchmarks, but generalizing well to unseen environments remains a major challenge.
This paper introduces Output Contrastive Loss (OCL), known for its capability to learn robust and generalized representations, to stabilize the adaptation process.
Our method excels even when applied to models initially pre-trained using domain adaptation methods on test domain data, showcasing its resilience and adaptability.
arXiv Detail & Related papers (2023-11-14T03:13:47Z)
- Understanding and Combating Robust Overfitting via Input Loss Landscape
Analysis and Regularization
Adversarial training is prone to robust overfitting, and the cause is far from clear.
We find that robust overfitting results from standard training, specifically the minimization of the clean loss.
We propose a new regularizer to smooth the loss landscape by penalizing the weighted logits variation along the adversarial direction.
arXiv Detail & Related papers (2022-12-09T16:55:30Z)
- Intersection of Parallels as an Early Stopping Criterion
We propose a method to spot an early stopping point in the training iterations without the need for a validation set.
For a wide range of learning rates, our method, called Cosine-Distance Criterion (CDC), leads to better generalization on average than all the methods that we compare against.
arXiv Detail & Related papers (2022-08-19T19:42:41Z)
- Prior-Guided Adversarial Initialization for Fast Adversarial Training
We investigate how adversarial examples (AEs) evolve during fast adversarial training (FAT) compared with standard adversarial training (SAT).
We observe that the attack success rate of FAT's AEs gradually degrades in the late training stage, resulting in overfitting.
Based on the observation, we propose a prior-guided FGSM initialization method to avoid overfitting.
The proposed method can prevent catastrophic overfitting and outperform state-of-the-art FAT methods.
arXiv Detail & Related papers (2022-07-18T18:13:10Z)
- Fast Adversarial Training with Adaptive Step Size
We study the phenomenon from the perspective of training instances.
We propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS).
ATAS learns an instance-wise adaptive step size that is inversely proportional to the instance's gradient norm (see the second sketch after this list).
arXiv Detail & Related papers (2022-06-06T08:20:07Z)
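For context on the regularization-based continual-learning entry above (RTRA), the penalty it builds on is Elastic Weight Consolidation: a quadratic cost on moving parameters that carried high Fisher information for earlier tasks. Below is a minimal sketch of that penalty; `old_params`, `fisher`, and `lam` are placeholder names, and RTRA's Natural Gradient optimization is not reproduced here.

```python
import torch

def ewc_penalty(model, old_params, fisher, lam=1.0):
    """Elastic Weight Consolidation (EWC) penalty: parameters that were
    important for earlier tasks (large diagonal Fisher entries) are kept
    close to their stored values. `old_params` and `fisher` are dicts
    keyed by parameter name; all names here are placeholders."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# total_loss = task_loss + ewc_penalty(model, old_params, fisher, lam=100.0)
```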
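And for the last entry (ATAS), the stated mechanism, an instance-wise step size inversely proportional to the gradient norm, can be sketched as follows; the constant `c` and the exact normalization are assumptions for illustration, not the paper's implementation.

```python
import torch

def adaptive_fgsm_step(grad, c=1.0, eps_min=1e-12):
    """Instance-wise adaptive step for a signed (FGSM-style) update:
    inputs with a large gradient norm take a small step and vice versa,
    which is the idea the ATAS summary describes. `c` is a hypothetical
    scaling constant."""
    flat = grad.flatten(start_dim=1)                 # (batch, features)
    grad_norm = flat.norm(dim=1).clamp_min(eps_min)  # one norm per instance
    step = (c / grad_norm).view(-1, *([1] * (grad.dim() - 1)))
    return step * grad.sign()

# Usage inside an attack step (delta and eps are the usual AT quantities):
# delta = (delta + adaptive_fgsm_step(grad)).clamp(-eps, eps)
```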