Adversarial Example Soups: Improving Transferability and Stealthiness for Free
- URL: http://arxiv.org/abs/2402.18370v3
- Date: Wed, 02 Apr 2025 05:07:09 GMT
- Title: Adversarial Example Soups: Improving Transferability and Stealthiness for Free
- Authors: Bo Yang, Hengwei Zhang, Jindong Wang, Yulong Yang, Chenhao Lin, Chao Shen, Zhengyu Zhao,
- Abstract summary: A conventional recipe for maximizing transferability is to keep only the optimal adversarial example from all those obtained in the optimization pipeline.<n>We propose Adversarial Example Soups'' (AES), with AES-tune for averaging discarded adversarial examples.<n>Our AES boosts 10 state-of-the-art transfer attacks and their combinations by up to 13% against 10 diverse (defensive) target models.
- Score: 17.094999396412216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transferable adversarial examples cause practical security risks since they can mislead a target model without knowing its internal knowledge. A conventional recipe for maximizing transferability is to keep only the optimal adversarial example from all those obtained in the optimization pipeline. In this paper, for the first time, we revisit this convention and demonstrate that those discarded, sub-optimal adversarial examples can be reused to boost transferability. Specifically, we propose ``Adversarial Example Soups'' (AES), with AES-tune for averaging discarded adversarial examples in hyperparameter tuning and AES-rand for stability testing. In addition, our AES is inspired by ``model soups'', which averages weights of multiple fine-tuned models for improved accuracy without increasing inference time. Extensive experiments validate the global effectiveness of our AES, boosting 10 state-of-the-art transfer attacks and their combinations by up to 13\% against 10 diverse (defensive) target models. We also show the possibility of generalizing AES to other types, \textit{e.g.}, directly averaging multiple in-the-wild adversarial examples that yield comparable success. A promising byproduct of AES is the improved stealthiness of adversarial examples since the perturbation variances are naturally reduced.
Related papers
- Boosting the Targeted Transferability of Adversarial Examples via Salient Region & Weighted Feature Drop [2.176586063731861]
A prevalent approach for adversarial attacks relies on the transferability of adversarial examples.
A novel framework based on Salient region & Weighted Feature Drop (SWFD) designed to enhance the targeted transferability of adversarial examples.
arXiv Detail & Related papers (2024-11-11T08:23:37Z) - Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarial attacking various downstream models fine-tuned from the segment anything model (SAM)
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z) - Improving Adversarial Training using Vulnerability-Aware Perturbation
Budget [7.430861908931903]
Adversarial Training (AT) effectively improves the robustness of Deep Neural Networks (DNNs) to adversarial attacks.
We propose two simple, computationally cheap vulnerability-aware reweighting functions for assigning perturbation bounds to adversarial examples used for AT.
Experimental results show that the proposed methods yield genuine improvements in the robustness of AT algorithms against various adversarial attacks.
arXiv Detail & Related papers (2024-03-06T21:50:52Z) - Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
arXiv Detail & Related papers (2023-07-21T03:43:07Z) - Generating Adversarial Examples with Better Transferability via Masking
Unimportant Parameters of Surrogate Model [6.737574282249396]
We propose to improve the transferability of adversarial examples in the transfer-based attack via unimportant masking parameters (MUP)
The key idea in MUP is to refine the pretrained surrogate models to boost the transfer-based attack.
arXiv Detail & Related papers (2023-04-14T03:06:43Z) - Latent Feature Relation Consistency for Adversarial Robustness [80.24334635105829]
misclassification will occur when deep neural networks predict adversarial examples which add human-imperceptible adversarial noise to natural examples.
We propose textbfLatent textbfFeature textbfRelation textbfConsistency (textbfLFRC)
LFRC constrains the relation of adversarial examples in latent space to be consistent with the natural examples.
arXiv Detail & Related papers (2023-03-29T13:50:01Z) - Making Substitute Models More Bayesian Can Enhance Transferability of
Adversarial Examples [89.85593878754571]
transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate to attack a Bayesian model for achieving desirable transferability.
Our method outperforms recent state-of-the-arts by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Generalizing Adversarial Examples by AdaBelief Optimizer [6.243028964381449]
We propose an AdaBelief iterative Fast Gradient Sign Method to generalize adversarial examples.
Compared with state-of-the-art attack methods, our proposed method can generate adversarial examples effectively in the white-box setting.
The transfer rate is 7%-21% higher than latest attack methods.
arXiv Detail & Related papers (2021-01-25T07:39:16Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.