Related papers: Are Fast Methods Stable in Adversarially Robust Transfer Learning?

URL: http://arxiv.org/abs/2506.22602v1
Date: Fri, 27 Jun 2025 19:53:53 GMT
Title: Are Fast Methods Stable in Adversarially Robust Transfer Learning?
Authors: Joshua C. Zhao, Saurabh Bagchi
Abstract summary: We revisit the use of the fast gradient sign method (FGSM) in robust transfer learning to improve the computational cost of adversarial fine-tuning. FGSM is much more stable in adversarial fine-tuning than when training from scratch. We demonstrate how this stability translates into performance across multiple datasets.
Score: 5.566459664476929
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transfer learning is often used to decrease the computational cost of model training, as fine-tuning a model allows a downstream task to leverage the features learned from the pre-training dataset and quickly adapt them to a new task. This is particularly useful for achieving adversarial robustness, as adversarially training models from scratch is very computationally expensive. However, high robustness in transfer learning still requires adversarial training during the fine-tuning phase, which requires up to an order of magnitude more time than standard fine-tuning. In this work, we revisit the use of the fast gradient sign method (FGSM) in robust transfer learning to improve the computational cost of adversarial fine-tuning. We surprisingly find that FGSM is much more stable in adversarial fine-tuning than when training from scratch. In particular, FGSM fine-tuning does not suffer from any issues with catastrophic overfitting at standard perturbation budgets of $\varepsilon=4$ or $\varepsilon=8$. This stability is further enhanced with parameter-efficient fine-tuning methods, where FGSM remains stable even up to $\varepsilon=32$ for linear probing. We demonstrate how this stability translates into performance across multiple datasets. Compared to fine-tuning with the more commonly used method of projected gradient descent (PGD), on average, FGSM only loses 0.39% and 1.39% test robustness for $\varepsilon=4$ and $\varepsilon=8$ while using $4\times$ less training time. Surprisingly, FGSM may not only be a significantly more efficient alternative to PGD in adversarially robust transfer learning but also a well-performing one.
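
To make the FGSM versus PGD cost difference described in the abstract concrete, here is a minimal NumPy sketch of the two attacks. It is an illustration under stated assumptions, not the authors' implementation: the linear logistic classifier, the function names (`loss_and_input_grad`, `fgsm`, `pgd`), the inputs scaled to [0, 1], and the reading of a budget such as 8 as 8/255 are all choices made for this example and are not given in the abstract.

```python
import numpy as np

def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a toy linear classifier and its gradient w.r.t. the input x."""
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad_x = (p - y) * w  # closed-form input gradient for the logistic model
    return loss, grad_x

def fgsm(w, b, x, y, eps):
    """Single-step attack: one gradient evaluation, one signed step of size eps."""
    _, g = loss_and_input_grad(w, b, x, y)
    return np.clip(x + eps * np.sign(g), 0.0, 1.0)

def pgd(w, b, x, y, eps, step, n_steps):
    """Multi-step attack: n_steps gradient evaluations, projecting back into the eps-ball each time."""
    x_adv = x.copy()
    for _ in range(n_steps):
        _, g = loss_and_input_grad(w, b, x_adv, y)
        x_adv = x_adv + step * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection onto the L-infinity ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the assumed valid input range
    return x_adv

# Hypothetical toy data, only to exercise the functions.
rng = np.random.default_rng(0)
w, b = rng.normal(size=16), 0.1
x, y = rng.uniform(0.2, 0.8, size=16), 1.0
eps = 8 / 255  # assumed convention for a budget written as 8

x_fgsm = fgsm(w, b, x, y, eps)
x_pgd = pgd(w, b, x, y, eps, step=eps / 4, n_steps=10)
print(loss_and_input_grad(w, b, x, y)[0],
      loss_and_input_grad(w, b, x_fgsm, y)[0],
      loss_and_input_grad(w, b, x_pgd, y)[0])
```

In an adversarial fine-tuning loop, the perturbed input replaces the clean one in each training step, so the per-step cost of the attack scales with the number of gradient evaluations: one for `fgsm`, `n_steps` for `pgd`.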

Related papers

Large Continual Instruction Assistant [59.585544987096974]
Continual Instruction Tuning (CIT) is adopted to instruct Large Models to follow human intent data by data. Existing gradient updates heavily degrade performance on previous datasets during the CIT process. We propose a general continual instruction tuning framework to address the challenge.
arXiv Detail & Related papers (2024-10-08T11:24:59Z)
Initialization Matters for Adversarial Transfer Learning [61.89451332757625]
We discover the necessity of an adversarially robust pretrained model. We propose Robust Linear Initialization (RoLI) for adversarial finetuning, which initializes the linear head with the weights obtained by adversarial linear probing. Across five different image classification datasets, we demonstrate the effectiveness of RoLI and achieve new state-of-the-art results.
arXiv Detail & Related papers (2023-12-10T00:51:05Z)
Improving Representational Continuity via Continued Pretraining [76.29171039601948]
A method from the transfer learning community (LP-FT) outperforms naive training and other continual learning methods. LP-FT also reduces forgetting in a real-world satellite remote sensing dataset (FMoW). A variant of LP-FT gets state-of-the-art accuracies on an NLP continual learning benchmark.
arXiv Detail & Related papers (2023-02-26T10:39:38Z)
Differentially Private Image Classification from Features [53.75086935617644]
Leveraging transfer learning has been shown to be an effective strategy for training large models with Differential Privacy. Recent works have found that privately training just the last layer of a pre-trained model provides the best utility with DP.
arXiv Detail & Related papers (2022-11-24T04:04:20Z)
Stability of Accuracy for the Training of DNNs Via the Uniform Doubling Condition [0.0]
We study the stability of accuracy during the training of deep neural networks (DNNs). The goal of achieving stability of accuracy is to ensure that if accuracy is high at some initial time, it remains high throughout training.
arXiv Detail & Related papers (2022-10-16T02:42:42Z)
Stability Analysis and Generalization Bounds of Adversarial Training [31.50956388020211]
In adversarial machine learning, deep neural networks can fit the adversarial examples on the training dataset but have poor generalization on the test set. This phenomenon is called robust overfitting, and it can be observed when adversarially training neural nets on common datasets.
arXiv Detail & Related papers (2022-10-03T14:21:46Z)
Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space [51.62131362670815]
This paper addresses the problem of ranking the pre-trained deep neural networks and screening the most transferable ones for downstream tasks. It proposes a new transferability metric called Self-challenging Fisher Discriminant Analysis (SFDA).
arXiv Detail & Related papers (2022-07-07T01:33:25Z)
Towards Alternative Techniques for Improving Adversarial Robustness: Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations. We focus on models trained on a spectrum of $\epsilon$ values. We identify alternative improvements to AT that otherwise wouldn't have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z)
Understanding the Difficulty of Training Transformers [120.99980924577787]
We show that unbalanced gradients are not the root cause of the instability of training. We propose Admin to stabilize the early stage's training and unleash its full potential in the late stage.
arXiv Detail & Related papers (2020-04-17T13:59:07Z)
Fast is better than free: Revisiting adversarial training [86.11788847990783]
We show that it is possible to train empirically robust models using a much weaker and cheaper adversary. We identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail.
arXiv Detail & Related papers (2020-01-12T20:30:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.