StyLess: Boosting the Transferability of Adversarial Examples
- URL: http://arxiv.org/abs/2304.11579v1
- Date: Sun, 23 Apr 2023 08:23:48 GMT
- Title: StyLess: Boosting the Transferability of Adversarial Examples
- Authors: Kaisheng Liang, Bin Xiao
- Abstract summary: Adversarial attacks can mislead deep neural networks (DNNs) by adding imperceptible perturbations to benign examples.
We propose a novel attack method called style-less perturbation (StyLess) to improve attack transferability.
- Score: 10.607781970035083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks can mislead deep neural networks (DNNs) by adding
imperceptible perturbations to benign examples. The attack transferability
enables adversarial examples to attack black-box DNNs with unknown
architectures or parameters, which poses threats to many real-world
applications. We find that existing transferable attacks do not distinguish
between style and content features during optimization, limiting their attack
transferability. To improve attack transferability, we propose a novel attack
method called style-less perturbation (StyLess). Specifically, instead of using
a vanilla network as the surrogate model, we advocate using stylized networks,
which encode different style features by perturbing an adaptive instance
normalization. Our method can prevent adversarial examples from using
non-robust style features and help generate transferable perturbations.
Comprehensive experiments show that our method can significantly improve the
transferability of adversarial examples. Furthermore, our approach is generic
and can outperform state-of-the-art transferable attacks when combined with
other attack techniques.
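The abstract above sketches the core mechanism: gradients are taken through surrogate models whose style statistics have been perturbed via adaptive instance normalization, so the resulting perturbation cannot rely on non-robust style features. Below is a minimal, hypothetical PyTorch sketch of that idea, not the authors' implementation; the split of the surrogate into `encoder`/`head`, the noise scale, and all other names are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code): build "stylized" surrogates
# by perturbing per-channel instance-normalization statistics of an
# intermediate feature map, then average gradients over several such
# stylized paths inside an I-FGSM-style attack loop.

import torch
import torch.nn as nn


class PerturbedAdaIN(nn.Module):
    """Re-normalize features with randomly perturbed style statistics."""

    def __init__(self, noise_scale: float = 0.2):
        super().__init__()
        self.noise_scale = noise_scale

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Per-channel instance statistics (the "style" of the feature map).
        mu = feat.mean(dim=(2, 3), keepdim=True)
        sigma = feat.std(dim=(2, 3), keepdim=True) + 1e-6
        normalized = (feat - mu) / sigma
        # Perturb the style statistics to obtain a stylized surrogate path.
        new_mu = mu * (1 + self.noise_scale * torch.randn_like(mu))
        new_sigma = sigma * (1 + self.noise_scale * torch.randn_like(sigma))
        return normalized * new_sigma + new_mu


def styless_attack(encoder, head, x, y, eps=8 / 255, alpha=2 / 255,
                   steps=10, n_styles=4):
    """Iterative attack whose gradient is averaged over stylized surrogates.

    `encoder` maps images to an intermediate feature map and `head` maps that
    feature map to logits; this split is an assumption made for illustration.
    """
    loss_fn = nn.CrossEntropyLoss()
    adain = PerturbedAdaIN()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        feat = encoder(x_adv)
        # Loss on the vanilla path plus several randomly stylized paths.
        loss = loss_fn(head(feat), y)
        for _ in range(n_styles):
            loss = loss + loss_fn(head(adain(feat)), y)
        grad = torch.autograd.grad(loss / (n_styles + 1), x_adv)[0]
        # Standard sign-gradient step, projected back into the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

In this sketch the noise scale and the number of stylized copies are free hyperparameters; the paper's actual construction of stylized networks and its combination with other transferability techniques may differ in detail.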
Related papers
- Efficient Generation of Targeted and Transferable Adversarial Examples for Vision-Language Models Via Diffusion Models [17.958154849014576]
Adversarial attacks can be used to assess the robustness of large visual-language models (VLMs).
Previous transfer-based adversarial attacks incur high costs due to high iteration counts and complex method structures.
We propose AdvDiffVLM, which uses diffusion models to generate natural, unrestricted and targeted adversarial examples.
arXiv Detail & Related papers (2024-04-16T07:19:52Z)
- Bag of Tricks to Boost Adversarial Transferability [5.803095119348021]
Adversarial examples generated under the white-box setting often exhibit low transferability across different models.
In this work, we find that several tiny changes in the existing adversarial attacks can significantly affect the attack performance.
Based on careful studies of existing adversarial attacks, we propose a bag of tricks to enhance adversarial transferability.
arXiv Detail & Related papers (2024-01-16T17:42:36Z)
- Generalizable Black-Box Adversarial Attack with Meta Learning [54.196613395045595]
In black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful perturbation based on query feedback under a query budget.
We propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability.
The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance.
arXiv Detail & Related papers (2023-01-01T07:24:12Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against samples crafted by minimally perturbing a clean example.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation [32.81400759291457]
Adversarial examples, crafted by injecting imperceptible perturbations, can cause erroneous predictions.
In this work, we study the transferability of adversarial examples, which is significant due to its threat to real-world applications.
We propose a novel attack method, dubbed reverse adversarial perturbation (RAP).
arXiv Detail & Related papers (2022-10-12T07:17:33Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- Towards Transferable Adversarial Attacks on Vision Transformers [110.55845478440807]
Vision transformers (ViTs) have demonstrated impressive performance on a series of computer vision tasks, yet they still suffer from adversarial examples.
We introduce a dual attack framework, which contains a Pay No Attention (PNA) attack and a PatchOut attack, to improve the transferability of adversarial samples across different ViTs.
arXiv Detail & Related papers (2021-09-09T11:28:25Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Towards Feature Space Adversarial Attack [18.874224858723494]
We propose a new adversarial attack on Deep Neural Networks for image classification.
Our attack focuses on perturbing abstract features, more specifically, features that denote styles.
We show that our attack can generate adversarial samples that are more natural-looking than those produced by state-of-the-art attacks.
arXiv Detail & Related papers (2020-04-26T13:56:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.