Adaptive Modeling Against Adversarial Attacks
- URL: http://arxiv.org/abs/2112.12431v1
- Date: Thu, 23 Dec 2021 09:52:30 GMT
- Title: Adaptive Modeling Against Adversarial Attacks
- Authors: Zhiwen Yan, Teck Khim Ng
- Abstract summary: Adversarial training, the process of training a deep learning model on adversarial data, is one of the most successful adversarial defense methods for deep learning models.
We have found that the white-box robustness of an adversarially trained model can be further improved by fine-tuning the model at inference time to adapt to the adversarial input.
- Score: 1.90365714903665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training, the process of training a deep learning model on
adversarial data, is one of the most successful adversarial defense methods for
deep learning models. We have found that the white-box robustness of an
adversarially trained model can be further improved by fine-tuning the model at
inference time to adapt to the adversarial input, exploiting the extra
information it carries. We introduce an algorithm that "post trains" the model
at inference time, using existing training data, to discriminate between the
original output class and a "neighbor" class. With our algorithm, the accuracy
of a pre-trained Fast-FGSM CIFAR10 classifier base model under a white-box
projected gradient descent (PGD) attack improves significantly, from 46.8% to
64.5%.
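The abstract's inference-time "post training" loop is easy to picture in code. Below is a minimal PyTorch sketch under stated assumptions: the "neighbor" class is taken to be the runner-up logit, and the helper name `post_train`, the step budget, and the two-class fine-tuning objective are illustrative guesses rather than the authors' exact procedure.

```python
import copy
import torch
import torch.nn.functional as F

def post_train(model, x_adv, train_loader, steps=50, lr=1e-3):
    """Illustrative sketch of inference-time "post training": fine-tune a
    throwaway copy of the model on existing training data from the
    predicted class and a "neighbor" class, then re-classify the input.
    `x_adv` is a single (possibly adversarial) input with batch size 1."""
    model = copy.deepcopy(model)  # never mutate the deployed model
    model.eval()
    with torch.no_grad():
        top2 = model(x_adv).topk(2, dim=1).indices[0]
    # Assumption: the "neighbor" class is the runner-up logit.
    original_cls, neighbor_cls = top2[0].item(), top2[1].item()

    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for step, (x, y) in enumerate(train_loader):
        if step >= steps:
            break
        mask = (y == original_cls) | (y == neighbor_cls)
        if mask.sum() == 0:
            continue
        # Fine-tune only on the two-class subproblem.
        loss = F.cross_entropy(model(x[mask]), y[mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        return model(x_adv).argmax(dim=1)
```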
Related papers
- Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
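A hedged sketch of what "pre-trained model guided" adversarial fine-tuning could look like: the adversarial task loss is regularized toward the frozen pre-trained model's predictions so fine-tuning does not destroy zero-shot generalization. The function name `pmg_aft_loss`, the KL form of the guidance term, and the weight `lam` are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def pmg_aft_loss(student, frozen_teacher, x_adv, y, lam=1.0):
    """Illustrative objective: adversarial task loss plus a KL term that
    pulls the fine-tuned model back toward the frozen pre-trained one."""
    logits = student(x_adv)
    task_loss = F.cross_entropy(logits, y)
    with torch.no_grad():
        teacher_probs = F.softmax(frozen_teacher(x_adv), dim=1)
    guide_loss = F.kl_div(F.log_softmax(logits, dim=1), teacher_probs,
                          reduction="batchmean")
    return task_loss + lam * guide_loss  # lam is an assumed weighting
```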
- Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z)
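One plausible reading of "sampling subnetworks" is generating the adversarial perturbation through a randomly thinned forward pass, so the expensive part of single-step adversarial training gets cheaper while the full network trains on the result. The stochastic-depth toy model and `keep_prob` parameter below are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticDepthNet(nn.Module):
    """Toy residual network whose forward pass can randomly skip blocks,
    giving a cheaply sampled subnetwork (an assumed architecture)."""
    def __init__(self, dim=64, depth=8, num_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth)
        )
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, keep_prob=1.0):
        for block in self.blocks:
            if self.training and torch.rand(()) > keep_prob:
                continue  # skip this block: a sampled subnetwork
            x = x + block(x)
        return self.head(x)

def fgsm_on_subnetwork(model, x, y, eps=0.03, keep_prob=0.5):
    """Generate the single-step perturbation through a thinned forward
    pass (model must be in train() mode so blocks can be skipped); the
    full network is then trained on the returned x_adv as usual."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x, keep_prob=keep_prob), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()
```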
- Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
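A rough sketch of the two summarized ingredients, under assumptions: the "semantic loss" is modeled as asking a local surrogate of the target model to label the reconstruction like the original, and diversity is added by mixing FGSM-perturbed queries into the attack model's training data. All names and loss weights here are illustrative, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, eps=0.03):
    """Untargeted FGSM against the model's own prediction; used here only
    to diversify the attack model's training data, e.g. by training on
    torch.cat([x, fgsm(surrogate, x)])."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    loss = F.cross_entropy(logits, logits.argmax(dim=1))
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()

def inversion_loss(attack_model, surrogate, x, alpha=0.1):
    """The attack model reconstructs an input from a prediction vector;
    the assumed "semantic" term asks the surrogate to label the
    reconstruction the same way as the original input."""
    with torch.no_grad():
        probs = F.softmax(surrogate(x), dim=1)  # what the attacker observes
    x_rec = attack_model(probs)
    rec_loss = F.mse_loss(x_rec, x)
    sem_loss = F.cross_entropy(surrogate(x_rec), probs.argmax(dim=1))
    return rec_loss + alpha * sem_loss
```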
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective on substitute training that focuses on designing the distribution of data used in the knowledge-stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training [0.0]
Adversarial training (AT) has been shown to be effective at producing models robust to the attack used during training.
We propose a simple modification to AT that mitigates its poor generalization to unforeseen attacks.
We show that our attack is faster than other attack schemes that are designed for unseen attack generalization.
arXiv Detail & Related papers (2021-03-29T07:23:46Z)
- Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy.
arXiv Detail & Related papers (2020-11-28T00:08:45Z)
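Hard-label majority voting over several pre-trained defense models is straightforward; a minimal sketch follows (the `num_classes` argument and the tie-breaking rule are implementation choices, not specified by the paper).

```python
import torch

def majority_vote(models, x, num_classes=10):
    """Hard-label majority vote over pre-trained defense models; argmax
    breaks count ties toward the smaller class index."""
    votes = torch.stack([m(x).argmax(dim=1) for m in models])  # (M, B)
    counts = torch.zeros(x.shape[0], num_classes, device=x.device)
    counts.scatter_add_(1, votes.t(),
                        torch.ones_like(votes.t(), dtype=counts.dtype))
    return counts.argmax(dim=1)
```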
- Deep Ensembles for Low-Data Transfer Learning [21.578470914935938]
We study different ways of creating ensembles from pre-trained models.
We show that the nature of pre-training itself is a strong source of ensemble diversity.
We propose a practical algorithm that efficiently identifies a subset of pre-trained models for any downstream dataset.
arXiv Detail & Related papers (2020-10-14T07:59:00Z)
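One standard way to "efficiently identify a subset of pre-trained models" is greedy forward selection on a validation set; the sketch below assumes that reading (the NLL criterion and the budget `k` are illustrative choices, not the paper's algorithm).

```python
import torch
import torch.nn.functional as F

def greedy_ensemble(models, x_val, y_val, k=3):
    """Greedy forward selection: repeatedly add the pre-trained model
    whose inclusion most improves the averaged-probability ensemble's
    validation negative log-likelihood."""
    with torch.no_grad():
        all_probs = [F.softmax(m(x_val), dim=1) for m in models]
    chosen, member_probs = [], []
    for _ in range(min(k, len(models))):
        best_i, best_nll = None, float("inf")
        for i, p in enumerate(all_probs):
            if i in chosen:
                continue
            mean_p = torch.stack(member_probs + [p]).mean(dim=0)
            nll = F.nll_loss(mean_p.clamp_min(1e-12).log(), y_val).item()
            if nll < best_nll:
                best_i, best_nll = i, nll
        chosen.append(best_i)
        member_probs.append(all_probs[best_i])
    return chosen  # indices of the selected subset
```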
- Adversarial Concurrent Training: Optimizing Robustness and Accuracy Trade-off of Deep Neural Networks [13.041607703862724]
We propose Adversarial Concurrent Training (ACT) to train a robust model in conjunction with a natural model in a minimax game.
ACT achieves 68.20% standard accuracy and 44.29% robust accuracy under a 100-iteration untargeted attack.
arXiv Detail & Related papers (2020-08-16T22:14:48Z)
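A hedged sketch of training a robust model "in conjunction with" a natural model: each model optimizes its own task loss plus a term aligning it with the other's predictive distribution. The exact coupling in ACT may differ; the KL form and `lam` below are assumptions.

```python
import torch
import torch.nn.functional as F

def act_step(robust_model, natural_model, x, x_adv, y, opt_r, opt_n, lam=1.0):
    """One illustrative concurrent-training step: each model minimizes
    its own task loss plus a KL term aligning it with the other model's
    (detached) predictive distribution."""
    # Robust model: adversarial task loss + alignment toward natural model.
    r_logits = robust_model(x_adv)
    align_r = F.kl_div(F.log_softmax(r_logits, dim=1),
                       F.softmax(natural_model(x).detach(), dim=1),
                       reduction="batchmean")
    loss_r = F.cross_entropy(r_logits, y) + lam * align_r
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()

    # Natural model: clean task loss + alignment toward robust model.
    n_logits = natural_model(x)
    align_n = F.kl_div(F.log_softmax(n_logits, dim=1),
                       F.softmax(robust_model(x_adv).detach(), dim=1),
                       reduction="batchmean")
    loss_n = F.cross_entropy(n_logits, y) + lam * align_n
    opt_n.zero_grad(); loss_n.backward(); opt_n.step()
```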
- DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
arXiv Detail & Related papers (2020-03-28T04:28:13Z)
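A minimal sketch of the GAN-style loop the summary describes: a generator synthesizes queries, the black-box target labels them, the substitute imitates those labels, and the generator is steered toward points of disagreement. The function signatures and the disagreement objective are illustrative assumptions, not DaST's exact losses.

```python
import torch
import torch.nn.functional as F

def dast_step(generator, substitute, target_model, opt_g, opt_s,
              batch=64, zdim=100):
    """One illustrative data-free step of substitute training."""
    z = torch.randn(batch, zdim)
    x_fake = generator(z)
    with torch.no_grad():
        t_labels = target_model(x_fake).argmax(dim=1)  # black-box queries

    # Substitute: imitate the target's labels on synthetic data.
    loss_s = F.cross_entropy(substitute(x_fake.detach()), t_labels)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # Generator: seek samples where the substitute still disagrees.
    loss_g = -F.cross_entropy(substitute(generator(z)), t_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```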
- Fast is better than free: Revisiting adversarial training [86.11788847990783]
We show that it is possible to train empirically robust models using a much weaker and cheaper adversary.
We identify a failure mode, referred to as "catastrophic overfitting", which may explain why previous attempts to use FGSM adversarial training failed.
arXiv Detail & Related papers (2020-01-12T20:30:22Z)
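The "much weaker and cheaper adversary" in this paper is FGSM preceded by a uniform random start; a minimal sketch (the values of `eps` and `alpha` follow common CIFAR10 settings, and inputs are assumed to lie in [0, 1]):

```python
import torch
import torch.nn.functional as F

def fgsm_random_start(model, x, y, eps=8/255, alpha=10/255):
    """Single-step adversary: random start, one signed-gradient step,
    projection back into the eps-ball and the valid pixel range."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1).detach()
```

The other half of the recipe is monitoring robustness during training, since catastrophic overfitting shows up as FGSM accuracy staying high while PGD accuracy collapses.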
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.