Improving Generalization of Adversarial Training via Robust Critical
Fine-Tuning
- URL: http://arxiv.org/abs/2308.02533v1
- Date: Tue, 1 Aug 2023 09:02:34 GMT
- Title: Improving Generalization of Adversarial Training via Robust Critical
Fine-Tuning
- Authors: Kaijie Zhu, Jindong Wang, Xixu Hu, Xing Xie, Ge Yang
- Abstract summary: Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications.
This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness.
- Score: 19.91117174405902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are susceptible to adversarial examples, posing a
significant security risk in critical applications. Adversarial Training (AT)
is a well-established technique to enhance adversarial robustness, but it often
comes at the cost of decreased generalization ability. This paper proposes
Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance
generalization without compromising adversarial robustness. The core idea of
RiFT is to exploit the redundant capacity for robustness by fine-tuning the
adversarially trained model on its non-robust-critical module. To do so, we
introduce module robust criticality (MRC), a measure that evaluates the
significance of a given module to model robustness under worst-case weight
perturbations. Using this measure, we identify the module with the lowest MRC
value as the non-robust-critical module and fine-tune its weights to obtain
fine-tuned weights. Subsequently, we linearly interpolate between the
adversarially trained weights and fine-tuned weights to derive the optimal
fine-tuned model weights. We demonstrate the efficacy of RiFT on ResNet18,
ResNet34, and WideResNet34-10 models trained on CIFAR10, CIFAR100, and
Tiny-ImageNet datasets. Our experiments show that RiFT can significantly
improve both generalization and out-of-distribution robustness by around 1.5%
while maintaining or even slightly enhancing adversarial robustness. Code is
available at https://github.com/microsoft/robustlearn.
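
The snippet below is a minimal PyTorch sketch of the procedure the abstract describes: estimate module robust criticality (MRC) as the worst-case rise in adversarial loss under a bounded perturbation of one module's weights, then linearly interpolate between the adversarially trained and fine-tuned weights. The loss, perturbation radius, step count, and helper names are illustrative assumptions, not the authors' released implementation (see the repository above for that).

```python
# Minimal sketch, assuming PyTorch; x_adv / y are pre-generated adversarial
# examples and all hyper-parameters are illustrative, not the paper's values.
import torch
import torch.nn.functional as F


def module_robust_criticality(model, module_name, x_adv, y, eps=0.05, steps=5):
    """Approximate MRC: worst-case increase in adversarial loss when only the
    named module's weights are perturbed inside an eps-ball (gradient ascent)."""
    module = dict(model.named_modules())[module_name]
    params = list(module.parameters())
    originals = [p.detach().clone() for p in params]
    base_loss = F.cross_entropy(model(x_adv), y).item()
    for _ in range(steps):
        loss = F.cross_entropy(model(x_adv), y)
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, g, p0 in zip(params, grads, originals):
                p.add_((eps / steps) * g.sign())             # ascend the adversarial loss
                p.copy_(torch.clamp(p, p0 - eps, p0 + eps))  # stay inside the weight ball
    worst_loss = F.cross_entropy(model(x_adv), y).item()
    with torch.no_grad():
        for p, p0 in zip(params, originals):                 # restore the original weights
            p.copy_(p0)
    return worst_loss - base_loss


def rift_interpolate(theta_at, theta_ft, alpha):
    """Linear interpolation between adversarially trained and fine-tuned state dicts
    (assumes floating-point tensors)."""
    return {k: (1 - alpha) * theta_at[k] + alpha * theta_ft[k] for k in theta_at}
```

Per the abstract, the module with the lowest MRC score is treated as non-robust-critical and fine-tuned while the rest of the network is kept fixed; the interpolation factor is then chosen to gain generalization without losing robust accuracy.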
Related papers
- Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness [47.9744734181236]
We explore the concept of Lipschitz continuity to certify the robustness of deep neural networks (DNNs) against adversarial attacks.
We propose a novel algorithm that remaps the input domain into a constrained range, reducing the Lipschitz constant and potentially enhancing robustness.
Our method achieves the best robust accuracy for CIFAR10, CIFAR100, and ImageNet datasets on the RobustBench leaderboard.
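As a rough illustration of the constrained-range idea only (the fixed affine squeeze below is an assumption, not the paper's data-driven mapping): composing the network with a contraction of the input domain scales its end-to-end Lipschitz constant by the contraction factor.

```python
# Hypothetical remap into [lo, hi]; illustrative only.
import torch


def remap_inputs(x: torch.Tensor, lo: float = 0.25, hi: float = 0.75) -> torch.Tensor:
    """Map inputs from [0, 1] into the narrower range [lo, hi]; composing this
    contraction with the network scales its Lipschitz constant by (hi - lo)."""
    return lo + (hi - lo) * x.clamp(0.0, 1.0)
```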
arXiv Detail & Related papers (2024-06-28T03:10:36Z)
- FullLoRA-AT: Efficiently Boosting the Robustness of Pretrained Vision Transformers [61.48709409150777]
The Vision Transformer (ViT) model has gradually become mainstream in various computer vision tasks.
Existing large models tend to prioritize performance during training, potentially neglecting robustness.
We develop a novel LNLoRA module, incorporating a learnable layer normalization before the conventional LoRA module.
We propose the FullLoRA-AT framework by integrating the learnable LNLoRA modules into all key components of ViT-based models.
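A rough PyTorch sketch of what an LNLoRA-style adapter could look like based on the description above (a learnable layer normalization placed in front of a standard low-rank LoRA update on a frozen layer); the rank, scaling, and use of `nn.Linear` are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn


class LNLoRALinear(nn.Module):
    """Frozen linear layer plus a LoRA update preceded by a learnable LayerNorm."""

    def __init__(self, frozen: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.frozen = frozen
        for p in self.frozen.parameters():               # keep pretrained weights fixed
            p.requires_grad_(False)
        self.norm = nn.LayerNorm(frozen.in_features)     # learnable LN before the LoRA path
        self.lora_a = nn.Linear(frozen.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, frozen.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)               # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.frozen(x) + self.scale * self.lora_b(self.lora_a(self.norm(x)))
```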
arXiv Detail & Related papers (2024-01-03T14:08:39Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly robust instance-reweighted adversarial training framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
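For intuition only, here is a generic KL-regularized reweighting in the spirit of the summary: maximizing the weighted loss minus a KL penalty to the uniform distribution over a probability vector yields softmax weights over the per-example losses. This standard construction is an assumption about the form of the objective, not the paper's exact doubly robust formulation.

```python
# Hypothetical helper: softmax weights solve
#   max_w  sum_i w_i * loss_i - tau * KL(w || uniform)   over the simplex.
import torch


def kl_regularized_weights(per_example_loss: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Up-weight hard (high-loss) instances; larger tau keeps weights closer to uniform."""
    return torch.softmax(per_example_loss.detach() / tau, dim=0)
```

The resulting weights would then multiply the per-example adversarial losses before averaging.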
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Bayesian Learning with Information Gain Provably Bounds Risk for a Robust Adversarial Defense [27.545466364906773]
We present a new algorithm to learn a deep neural network model robust against adversarial attacks.
Our model demonstrates significantly improved robustness (up to 20%) compared with adversarial training and Adv-BNN under PGD attacks.
arXiv Detail & Related papers (2022-12-05T03:26:08Z)
- Enhancing Adversarial Training with Second-Order Statistics of Weights [23.90998469971413]
We show that treating model weights as random variables allows for enhancing adversarial training through Second-Order Statistics Optimization (S$^2$O).
We conduct an extensive set of experiments, which show that S$^2$O not only improves the robustness and generalization of the trained neural networks when used in isolation, but also integrates easily into state-of-the-art adversarial training techniques.
arXiv Detail & Related papers (2022-03-11T15:40:57Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called the Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features across attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Bridging the Gap Between Adversarial Robustness and Optimization Bias [28.56135898767349]
Adversarial robustness is an open challenge in deep learning, most often tackled using adversarial training.
We show that it is possible to achieve both perfect standard accuracy and a certain degree of robustness without a trade-off.
In particular, we characterize the robustness of linear convolutional models, showing that they resist attacks subject to a constraint on the Fourier-$\ell_\infty$ norm.
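The helper below is a small illustration of that norm (the function name is ours): the Fourier-$\ell_\infty$ norm of a perturbation is the largest magnitude among its 2-D DFT coefficients.

```python
import torch


def fourier_linf_norm(delta: torch.Tensor) -> torch.Tensor:
    """Largest DFT-coefficient magnitude of a perturbation, per example,
    taken over the last two (spatial) dimensions."""
    return torch.fft.fft2(delta).abs().amax(dim=(-2, -1))
```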
arXiv Detail & Related papers (2021-02-17T16:58:04Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a "slow start, fast decay" learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
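One plausible reading of that schedule, with made-up numbers, is sketched below: a short warm-up followed by a rapid decay of the learning rate during adversarial fine-tuning.

```python
def slow_start_fast_decay(step: int, warmup: int = 500,
                          base_lr: float = 0.01, decay: float = 0.97) -> float:
    """Illustrative 'slow start, fast decay' learning-rate schedule (values assumed)."""
    if step < warmup:
        return base_lr * (step + 1) / warmup     # slow linear warm-up
    return base_lr * decay ** (step - warmup)    # fast exponential decay
```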
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Do Wider Neural Networks Really Help Adversarial Robustness? [92.8311752980399]
We show that the model robustness is closely related to the tradeoff between natural accuracy and perturbation stability.
We propose a new Width Adjusted Regularization (WAR) method that adaptively enlarges the regularization coefficient $\lambda$ on wide models.
arXiv Detail & Related papers (2020-10-03T04:46:17Z)