Rethinking Model Ensemble in Transfer-based Adversarial Attacks
- URL: http://arxiv.org/abs/2303.09105v2
- Date: Mon, 4 Mar 2024 11:30:06 GMT
- Title: Rethinking Model Ensemble in Transfer-based Adversarial Attacks
- Authors: Huanran Chen, Yichi Zhang, Yinpeng Dong, Xiao Yang, Hang Su, Jun Zhu
- Abstract summary: An effective strategy to improve the transferability is attacking an ensemble of models.
Previous works simply average the outputs of different models.
We propose a Common Weakness Attack (CWA) to generate more transferable adversarial examples.
- Score: 46.82830479910875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is widely recognized that deep learning models lack robustness to
adversarial examples. An intriguing property of adversarial examples is that
they can transfer across different models, which enables black-box attacks
without any knowledge of the victim model. An effective strategy to improve the
transferability is attacking an ensemble of models. However, previous works
simply average the outputs of different models, lacking an in-depth analysis on
how and why model ensemble methods can strongly improve the transferability. In
this paper, we rethink the ensemble in adversarial attacks and define the
common weakness of model ensemble with two properties: 1) the flatness of loss
landscape; and 2) the closeness to the local optimum of each model. We
empirically and theoretically show that both properties are strongly correlated
with the transferability and propose a Common Weakness Attack (CWA) to generate
more transferable adversarial examples by promoting these two properties.
Experimental results on both image classification and object detection tasks
validate the effectiveness of our approach to improving the adversarial
transferability, especially when attacking adversarially trained models. We
also successfully apply our method to attack a black-box large vision-language
model -- Google's Bard, showing the practical effectiveness. Code is available
at \url{https://github.com/huanranchen/AdversarialAttacks}.
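For intuition, below is a minimal PyTorch sketch of a loss-averaging ensemble attack in this spirit. The reverse-step lookahead standing in for the flatness-promoting update is an assumption based on the abstract, not the exact CWA algorithm (see the paper and the linked repository); eps, alpha, steps, and rho are illustrative hyperparameters.

```python
# Minimal sketch of an ensemble transfer attack: average the losses of several
# surrogate models and ascend on the averaged loss. The "flatness" lookahead is
# an assumed stand-in for CWA's update, NOT the algorithm from the paper.
import torch
import torch.nn.functional as F

def ensemble_attack(models, x, y, eps=8/255, alpha=2/255, steps=10, rho=2/255):
    """models: list of surrogate classifiers (eval mode); x in [0, 1]; y: labels."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        # Baseline ensemble: average the cross-entropy losses of all surrogates.
        loss = sum(F.cross_entropy(m(x_adv), y) for m in models) / len(models)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Assumed flatness-seeking lookahead: take a small reverse step and use
        # the gradient there, so sharp maxima that vanish under tiny
        # perturbations contribute less to the update.
        x_look = (x_adv - rho * grad.sign()).detach().requires_grad_(True)
        loss_look = sum(F.cross_entropy(m(x_look), y) for m in models) / len(models)
        grad = torch.autograd.grad(loss_look, x_look)[0]

        # Ascend on the ensemble loss and project back into the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```

Averaging losses (rather than logits or probabilities) is one common fusion choice for ensemble attacks; the paper's contribution lies in how the update is steered toward flat regions close to every surrogate's optimum, which this sketch only approximates.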
Related papers
- Scaling Laws for Black-box Adversarial Attacks [37.744814957775965]
Adversarial examples exhibit cross-model transferability, enabling attacks on black-box models.
Model ensembling is an effective strategy to improve the transferability by attacking multiple surrogate models simultaneously.
We show that scaled attacks yield perturbations with more interpretable semantics, indicating that features common across models are captured.
arXiv Detail & Related papers (2024-11-25T08:14:37Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
Deep neural networks are vulnerable to adversarial examples, posing a threat to their applications and raising security concerns.
We propose a similar-target attack method named Similar Target (ST).
arXiv Detail & Related papers (2023-08-21T14:16:36Z)
- An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability [26.39964737311377]
We propose an adaptive ensemble attack, dubbed AdaEA, to adaptively control the fusion of the outputs from each model.
We achieve considerable improvement over the existing ensemble attacks on various datasets.
arXiv Detail & Related papers (2023-08-05T15:12:36Z)
- Frequency Domain Model Augmentation for Adversarial Attack [91.36850162147678]
For black-box attacks, the gap between the substitute model and the victim model is usually large.
We propose a novel spectrum simulation attack to craft more transferable adversarial examples against both normally trained and defense models.
arXiv Detail & Related papers (2022-07-12T08:26:21Z)
- Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement-learning-based attack model that can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.