Why Does Little Robustness Help? Understanding and Improving Adversarial
Transferability from Surrogate Training
- URL: http://arxiv.org/abs/2307.07873v6
- Date: Fri, 1 Sep 2023 15:30:24 GMT
- Title: Why Does Little Robustness Help? Understanding and Improving Adversarial
Transferability from Surrogate Training
- Authors: Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li,
Xiaogeng Liu, Wei Wan, Hai Jin
- Abstract summary: Adversarial examples (AEs) for DNNs have been shown to be transferable.
In this paper, we take a further step towards understanding adversarial transferability.
- Score: 24.376314203167016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples (AEs) for DNNs have been shown to be transferable: AEs
that successfully fool white-box surrogate models can also deceive other
black-box models with different architectures. Although many empirical studies
have provided guidance on generating highly transferable AEs, many of these
findings lack explanations and even lead to inconsistent advice. In this
paper, we take a further step towards understanding adversarial
transferability, with a particular focus on surrogate aspects. Starting from
the intriguing "little robustness" phenomenon, where models adversarially trained
with mildly perturbed adversarial samples can serve as better surrogates, we
attribute it to a trade-off between two predominant factors: model smoothness
and gradient similarity. Our investigations focus on their joint effects,
rather than their separate correlations with transferability. Through a series
of theoretical and empirical analyses, we conjecture that the data distribution
shift in adversarial training explains the degradation of gradient similarity.
Building on these insights, we explore the impacts of data augmentation and
gradient regularization on transferability and identify that the trade-off
generally exists across various training mechanisms, thus building a
comprehensive blueprint for the regulation mechanism behind transferability.
Finally, we provide a general route for constructing better surrogates that
optimizes model smoothness and gradient similarity simultaneously to boost
transferability, e.g., the combination of input gradient regularization and
sharpness-aware minimization (SAM), validated by extensive experiments. In
summary, we call for attention to the joint impacts of these two factors for
launching effective transfer attacks, rather than optimizing one while ignoring
the other, and emphasize the crucial role of manipulating surrogate models.
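To make the suggested route concrete, below is a minimal PyTorch sketch of one surrogate-training step that combines an input-gradient penalty (input gradient regularization) with sharpness-aware minimization. It is an illustration under assumed hyperparameters (lambda_igr, rho), not the authors' released implementation.

```python
# Sketch: one surrogate-training step mixing input gradient regularization
# (IGR, encourages gradient alignment) with SAM (encourages model smoothness).
# lambda_igr and rho are illustrative values, not taken from the paper.
import torch
import torch.nn.functional as F

def igr_sam_step(model, opt, x, y, lambda_igr=0.1, rho=0.05):
    def loss_fn():
        # Cross-entropy plus a squared-norm penalty on the input gradient;
        # create_graph=True lets the penalty backpropagate to the weights.
        x_in = x.clone().requires_grad_(True)
        ce = F.cross_entropy(model(x_in), y)
        (g,) = torch.autograd.grad(ce, x_in, create_graph=True)
        return ce + lambda_igr * g.flatten(1).pow(2).sum(dim=1).mean()

    # SAM ascent step: perturb weights toward higher loss within a rho-ball.
    loss = loss_fn()
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    opt.zero_grad()

    # SAM descent step: gradient at the perturbed weights, then restore.
    loss_fn().backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    opt.step()
    opt.zero_grad()
    return loss.item()
```

The first backward pass differentiates through the input-gradient penalty, while the two SAM passes take the descent gradient at adversarially perturbed weights, which is what promotes a flatter, smoother surrogate.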
Related papers
- SA-Attack: Improving Adversarial Transferability of Vision-Language
Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack (the generic transfer pipeline is sketched after this entry).
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
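Since every entry in this list builds on the transfer setting, here is a minimal sketch of that pipeline: craft adversarial examples with L-inf PGD against a white-box surrogate, then check how often they also fool an independent black-box target. The names surrogate, target, and loader are placeholders, and eps/alpha/steps are common but assumed defaults.

```python
# Generic transfer-attack pipeline (sketch): adversarial examples are crafted
# with L-inf PGD on a white-box surrogate, then evaluated on a black-box
# target. `surrogate`, `target`, and `loader` are placeholder names.
import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Maximize the surrogate's loss within an eps-ball around x (L-inf).
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (g,) = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

@torch.no_grad()
def transfer_rate(surrogate, target, loader, device="cpu"):
    # Fraction of surrogate-crafted AEs that also fool the target model.
    fooled, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        with torch.enable_grad():
            x_adv = pgd(surrogate, x, y)
        fooled += (target(x_adv).argmax(dim=1) != y).sum().item()
        total += y.numel()
    return fooled / total
```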
- Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations into benign samples to mislead predictions.
We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images (an augmentation-averaging sketch follows this entry).
arXiv Detail & Related papers (2023-11-18T09:10:07Z)
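As a hedged illustration of the augmentation idea in the SDAM entry above: assume a hypothetical helper generate_variants(x, n) that returns n Stable-Diffusion-augmented copies of x. The sketch averages the attack loss over the original and its copies before each signed-gradient step; SDAM's actual procedure may differ in detail.

```python
# Augmentation-averaged transfer attack (sketch). `generate_variants` is a
# hypothetical stand-in for Stable Diffusion-based augmentation; the real
# SDAM procedure may differ.
import torch
import torch.nn.functional as F

def augmented_attack(model, x, y, generate_variants, n_aug=4,
                     eps=8 / 255, alpha=2 / 255, steps=10):
    # Share one perturbation `delta` across the clean image and its
    # diffusion-augmented copies, and average the loss before each step.
    bases = [x] + list(generate_variants(x, n_aug))  # hypothetical helper
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = sum(F.cross_entropy(model((b + delta).clamp(0, 1)), y)
                   for b in bases) / len(bases)
        (g,) = torch.autograd.grad(loss, delta)
        delta = (delta.detach() + alpha * g.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```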
- An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability [26.39964737311377]
We propose an adaptive ensemble attack, dubbed AdaEA, to adaptively control the fusion of the outputs from each model.
We achieve considerable improvement over existing ensemble attacks on various datasets (one possible adaptive fusion rule is sketched after this entry).
arXiv Detail & Related papers (2023-08-05T15:12:36Z)
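To make the adaptive-fusion idea above concrete, the sketch below weights each ensemble member's loss with a softmax over the negated per-model losses, up-weighting members the current example fools least. This fusion rule is an illustrative assumption, not necessarily AdaEA's own.

```python
# Adaptive ensemble transfer attack (sketch). The fusion rule -- a softmax
# over negated per-model losses -- is an illustrative assumption, not
# necessarily the rule AdaEA itself uses.
import torch
import torch.nn.functional as F

def adaptive_ensemble_attack(models, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        losses = torch.stack([F.cross_entropy(m(x_adv), y) for m in models])
        # Up-weight members the current example fools least (lowest loss),
        # so no single surrogate dominates the fused gradient.
        w = torch.softmax(-losses.detach(), dim=0)
        (g,) = torch.autograd.grad((w * losses).sum(), x_adv)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```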
- Common Knowledge Learning for Generating Transferable Adversarial Examples [60.1287733223249]
This paper focuses on an important type of black-box attacks, where the adversary generates adversarial examples by a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z)
- Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z)
- Exploring Transferable and Robust Adversarial Perturbation Generation from the Perspective of Network Hierarchy [52.153866313879924]
The transferability and robustness of adversarial examples are two practical yet important properties for black-box adversarial attacks.
We propose a transferable and robust adversarial generation (TRAP) method.
Our TRAP achieves impressive transferability and high robustness against certain interferences.
arXiv Detail & Related papers (2021-08-16T11:52:41Z)
- TRS: Transferability Reduced Ensemble via Encouraging Gradient Diversity and Model Smoothness [14.342349428248887]
Adversarial Transferability is an intriguing property of adversarial examples.
This paper theoretically analyzes sufficient conditions for transferability between models.
We propose a practical algorithm to reduce transferability within an ensemble to improve its robustness (a sketch of such a gradient-diversity regularizer follows this entry).
arXiv Detail & Related papers (2021-04-01T17:58:35Z)
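The TRS entry connects directly to the two factors studied in the main paper. Below is a minimal sketch of a TRS-style objective that adds, on top of the average cross-entropy, a penalty on the pairwise cosine similarity between ensemble members' input gradients; the weight lam is an assumption.

```python
# TRS-style regularizer (sketch): penalize the pairwise cosine similarity of
# ensemble members' input gradients so adversarial directions diverge across
# members. The weight `lam` is an assumption.
import torch
import torch.nn.functional as F

def gradient_diversity_loss(models, x, y, lam=1.0):
    grads = []
    for m in models:
        x_in = x.clone().requires_grad_(True)
        loss = F.cross_entropy(m(x_in), y)
        # create_graph=True so the penalty backpropagates to the weights.
        (g,) = torch.autograd.grad(loss, x_in, create_graph=True)
        grads.append(g.flatten(1))
    sim, pairs = 0.0, 0
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            sim = sim + F.cosine_similarity(grads[i], grads[j], dim=1).mean()
            pairs += 1
    ce = sum(F.cross_entropy(m(x), y) for m in models) / len(models)
    return ce + lam * sim / max(pairs, 1)
```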