Boosting the Adversarial Transferability of Surrogate Models with Dark
Knowledge
- URL: http://arxiv.org/abs/2206.08316v2
- Date: Tue, 5 Sep 2023 05:33:46 GMT
- Title: Boosting the Adversarial Transferability of Surrogate Models with Dark
Knowledge
- Authors: Dingcheng Yang, Zihao Xiao, Wenjian Yu
- Abstract summary: Deep neural networks (DNNs) are vulnerable to adversarial examples.
Adversarial examples also exhibit transferability: an adversarial example crafted for one DNN model can fool another model with non-trivial probability.
This paper proposes a method for training a surrogate model with dark knowledge to boost the transferability of the adversarial examples generated by the surrogate model.
- Score: 5.702679709305404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples.
Moreover, adversarial examples exhibit transferability: an adversarial example
crafted for one DNN model can fool another model with non-trivial probability.
This property gave rise to transfer-based attacks, in which adversarial examples
generated on a surrogate model are used to mount black-box attacks. There is
some work on generating adversarial examples with better transferability from a
given surrogate model. However, training a special surrogate model to generate
adversarial examples with better transferability remains relatively
under-explored. This paper proposes a method for training a surrogate model
with dark knowledge to boost the transferability of the adversarial examples it
generates. The trained surrogate model is named the dark surrogate model (DSM).
The proposed training method consists of two key components: a teacher model
that extracts dark knowledge, and a mixing augmentation technique that enriches
the dark knowledge of the training data. Extensive experiments show that the
proposed method substantially improves the adversarial transferability of
surrogate models across different surrogate architectures and different
optimizers for generating adversarial examples, and that it can be applied to
other transfer-based attack scenarios that contain dark knowledge, such as face
verification. Our code is publicly available at
https://github.com/ydc123/Dark_Surrogate_Model.
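To make the two components concrete, below is a minimal PyTorch-style sketch of how such a dark surrogate model could be trained: a fixed teacher supplies softened (dark-knowledge) labels, and a mixup-style mixing augmentation enriches those labels before distillation into the surrogate. The architectures, temperature, and optimizer settings are illustrative assumptions, not the authors' exact recipe (see the linked repository for the official implementation).

# Illustrative sketch (not the authors' code): training a dark surrogate model (DSM)
# by distilling a teacher's soft labels over mixed training images.
# Teacher/student architectures, temperature, and hyperparameters are placeholders.
import torch
import torch.nn.functional as F
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
teacher = models.resnet50(weights="IMAGENET1K_V1").to(device).eval()  # source of dark knowledge
student = models.resnet18(num_classes=1000).to(device)                # dark surrogate model (DSM)
optimizer = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)
T = 4.0  # softmax temperature used to extract dark knowledge

def train_step(x: torch.Tensor) -> float:
    # Mixing augmentation: convexly combine two images so the teacher's soft
    # labels reflect both classes, enriching the dark knowledge of the batch.
    lam = torch.distributions.Beta(1.0, 1.0).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]

    with torch.no_grad():
        soft_targets = F.softmax(teacher(x_mix) / T, dim=1)  # dark knowledge

    log_probs = F.log_softmax(student(x_mix) / T, dim=1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

After training, adversarial examples would be crafted on the student with any standard transfer attack optimizer (e.g., I-FGSM or MI-FGSM) and then applied to the black-box target model, which is the transfer-based attack setting the abstract describes.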
Related papers
- Common Knowledge Learning for Generating Transferable Adversarial
Examples [60.1287733223249]
This paper focuses on an important type of black-box attack, where the adversary generates adversarial examples using a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z) - Generating Adversarial Examples with Better Transferability via Masking
Unimportant Parameters of Surrogate Model [6.737574282249396]
We propose to improve the transferability of adversarial examples in transfer-based attacks via masking unimportant parameters (MUP).
The key idea of MUP is to refine pretrained surrogate models to boost the transfer-based attack.
arXiv Detail & Related papers (2023-04-14T03:06:43Z) - Towards Understanding and Boosting Adversarial Transferability from a
Distribution Perspective [80.02256726279451]
Adversarial attacks against deep neural networks (DNNs) have received broad attention in recent years.
We propose a novel method that crafts adversarial examples by manipulating the distribution of the image.
Our method can significantly improve the transferability of the crafted attacks and achieves state-of-the-art performance in both untargeted and targeted scenarios.
arXiv Detail & Related papers (2022-10-09T09:58:51Z) - Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks on a black-box model when no queries are allowed.
In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model.
We show that we can obtain a Meta-Surrogate Model (MSM) such that attacks on this model transfer more easily to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z) - On the Transferability of Adversarial Attacks against Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer
Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable are the adversarial examples produced by a source model to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.