Boosting the Adversarial Transferability of Surrogate Models with Dark
Knowledge
- URL: http://arxiv.org/abs/2206.08316v2
- Date: Tue, 5 Sep 2023 05:33:46 GMT
- Title: Boosting the Adversarial Transferability of Surrogate Models with Dark
Knowledge
- Authors: Dingcheng Yang, Zihao Xiao, Wenjian Yu
- Abstract summary: Deep neural networks (DNNs) are vulnerable to adversarial examples.
Adversarial examples also exhibit transferability: an adversarial example crafted for one DNN model can fool another model with non-trivial probability.
This paper proposes a method for training a surrogate model with dark knowledge to boost the transferability of the adversarial examples generated by the surrogate model.
- Score: 5.702679709305404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples.
Moreover, adversarial examples exhibit transferability: an adversarial example
crafted for one DNN model can fool another model with non-trivial probability.
This property gave rise to transfer-based attacks, in which adversarial examples
generated on a surrogate model are used to mount black-box attacks. There is
some work on generating adversarial examples with better transferability from a
given surrogate model. However, training a special surrogate model to generate
adversarial examples with better transferability remains relatively
under-explored. This paper proposes a method for training a surrogate model
with dark knowledge to boost the transferability of the adversarial examples it
generates. The trained surrogate model is named the dark surrogate model (DSM).
The proposed training method consists of two key components: a teacher model
that extracts dark knowledge, and a mixing augmentation technique that enriches
the dark knowledge of the training data. Extensive experiments show that the
proposed method substantially improves the adversarial transferability of
surrogate models across different surrogate architectures and different
optimizers for generating adversarial examples, and that it can be applied to
other transfer-based attack scenarios that contain dark knowledge, such as face
verification. Our code is publicly available at
https://github.com/ydc123/Dark_Surrogate_Model.
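To make the two components concrete, below is a minimal PyTorch-style sketch of how such a dark surrogate model could be trained: a fixed teacher supplies softened (dark-knowledge) labels, and a mixup-style mixing augmentation enriches those labels before distillation into the surrogate. The architectures, temperature, and optimizer settings are illustrative assumptions, not the authors' exact recipe (see the linked repository for the official implementation).

# Illustrative sketch (not the authors' code): training a dark surrogate model (DSM)
# by distilling a teacher's soft labels over mixed training images.
# Teacher/student architectures, temperature, and hyperparameters are placeholders.
import torch
import torch.nn.functional as F
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
teacher = models.resnet50(weights="IMAGENET1K_V1").to(device).eval()  # source of dark knowledge
student = models.resnet18(num_classes=1000).to(device)                # dark surrogate model (DSM)
optimizer = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)
T = 4.0  # softmax temperature used to extract dark knowledge

def train_step(x: torch.Tensor) -> float:
    # Mixing augmentation: convexly combine two images so the teacher's soft
    # labels reflect both classes, enriching the dark knowledge of the batch.
    lam = torch.distributions.Beta(1.0, 1.0).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]

    with torch.no_grad():
        soft_targets = F.softmax(teacher(x_mix) / T, dim=1)  # dark knowledge

    log_probs = F.log_softmax(student(x_mix) / T, dim=1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

After training, adversarial examples would be crafted on the student with any standard transfer attack optimizer (e.g., I-FGSM or MI-FGSM) and then applied to the black-box target model, which is the transfer-based attack setting the abstract describes.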
Related papers
- Common Knowledge Learning for Generating Transferable Adversarial
Examples [60.1287733223249]
This paper focuses on an important type of black-box attack, where the adversary generates adversarial examples using a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z) - Generating Adversarial Examples with Better Transferability via Masking
Unimportant Parameters of Surrogate Model [6.737574282249396]
We propose to improve the transferability of adversarial examples in transfer-based attacks via masking unimportant parameters (MUP).
The key idea of MUP is to refine pretrained surrogate models to boost the transfer-based attack.
arXiv Detail & Related papers (2023-04-14T03:06:43Z) - Towards Understanding and Boosting Adversarial Transferability from a
Distribution Perspective [80.02256726279451]
Adversarial attacks against deep neural networks (DNNs) have received broad attention in recent years.
We propose a novel method that crafts adversarial examples by manipulating the distribution of the image.
Our method can significantly improve the transferability of the crafted attacks and achieves state-of-the-art performance in both untargeted and targeted scenarios.
arXiv Detail & Related papers (2022-10-09T09:58:51Z) - Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks on a black-box model when no queries are allowed.
In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model.
We show that we can obtain a Meta-Surrogate Model (MSM) such that attacks on this model transfer more easily to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z) - On the Transferability of Adversarial Attacks against Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer
Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable are the adversarial examples produced by a source model to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.