Rethinking the Backward Propagation for Adversarial Transferability
- URL: http://arxiv.org/abs/2306.12685v3
- Date: Tue, 21 Nov 2023 02:57:09 GMT
- Title: Rethinking the Backward Propagation for Adversarial Transferability
- Authors: Xiaosen Wang, Kangheng Tong, Kun He
- Abstract summary: Transfer-based attacks generate adversarial examples on a surrogate model, which can mislead other black-box models without access to them.
In this work, we identify that non-linear layers truncate the gradient during backward propagation, making the gradient w.r.t. the input image an imprecise reflection of the loss function.
We propose a novel method to increase the relevance between the gradient w.r.t. the input image and the loss function, so as to generate adversarial examples with higher transferability.
- Score: 12.244490573612286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer-based attacks generate adversarial examples on a surrogate model,
which can mislead other black-box models without access to them, making it promising to
attack real-world applications. Recently, several works have been proposed to
boost adversarial transferability, in which the surrogate model is usually
overlooked. In this work, we identify that non-linear layers (e.g., ReLU,
max-pooling, etc.) truncate the gradient during backward propagation, making
the gradient w.r.t. the input image an imprecise reflection of the loss function. We hypothesize
and empirically validate that such truncation undermines the transferability of
adversarial examples. Based on these findings, we propose a novel method called
Backward Propagation Attack (BPA) to increase the relevance between the
gradient w.r.t. the input image and the loss function, so as to generate adversarial
examples with higher transferability. Specifically, BPA adopts a non-monotonic
function as the derivative of ReLU and incorporates softmax with temperature to
smooth the derivative of max-pooling, thereby mitigating the information loss
during the backward propagation of gradients. Empirical results on the ImageNet
dataset demonstrate that not only does our method substantially boost the
adversarial transferability, but it is also general to existing transfer-based
attacks. Code is available at https://github.com/Trustworthy-AI-Group/RPA.
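As a rough illustration of the idea, the sketch below (assuming a PyTorch surrogate; this is not the authors' released code, and the particular non-monotonic ReLU derivative and pooling temperature are placeholder choices) shows how the backward pass of ReLU and max-pooling could be replaced with smoother surrogates while leaving the forward pass exact:

```python
import torch
import torch.nn.functional as F


class SmoothReLU(torch.autograd.Function):
    """Forward is the standard ReLU; backward uses a hand-crafted,
    non-monotonic derivative instead of the hard 0/1 step."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return F.relu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Placeholder non-monotonic derivative: behaves like the step function
        # far from zero but lets some gradient flow near the truncation point.
        d = torch.sigmoid(5.0 * x) + x * torch.exp(-x ** 2)
        return grad_out * d


class SoftMaxPool2d(torch.nn.Module):
    """Forward is exact max-pooling; the gradient is routed through a
    temperature softmax over each pooling window, so it is spread over
    all entries instead of only the argmax."""

    def __init__(self, kernel_size=2, stride=2, temperature=10.0):
        super().__init__()
        self.k, self.s, self.t = kernel_size, stride, temperature

    def forward(self, x):
        n, c, h, w = x.shape
        cols = F.unfold(x, self.k, stride=self.s)          # (n, c*k*k, L)
        cols = cols.view(n, c, self.k * self.k, -1)
        hard = cols.max(dim=2).values                      # exact max value
        soft = (F.softmax(self.t * cols, dim=2) * cols).sum(dim=2)
        out = soft + (hard - soft).detach()                # forward = hard, grad via soft
        oh = (h - self.k) // self.s + 1
        return out.view(n, c, oh, -1)


# Usage sketch: swap these into the surrogate before computing attack gradients,
# e.g. y = SmoothReLU.apply(x) and pool = SoftMaxPool2d(2, 2, temperature=10.0).
```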
Related papers
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks have attracted significant interest for black-box applications.
Existing works essentially optimize a single-level objective w.r.t. the surrogate model.
We propose a bilevel optimization paradigm, which explicitly reformulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- LRS: Enhancing Adversarial Transferability through Lipschitz Regularized Surrogate [8.248964912483912]
The transferability of adversarial examples is of central importance to transfer-based black-box adversarial attacks.
We propose Lipschitz Regularized Surrogate (LRS) for transfer-based black-box attacks.
We evaluate our proposed LRS approach by attacking state-of-the-art standard deep neural networks and defense models.
arXiv Detail & Related papers (2023-12-20T15:37:50Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [18.05924632169541]
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM).
Specifically, we use data rescaling to substitute the sign function without extra computational cost.
Our method could significantly boost the transferability of gradient-based attacks and outperform the state-of-the-art baselines.
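A heavily hedged sketch of the core idea, assuming PyTorch: average the gradient over a few randomly perturbed copies of the input and then rescale it instead of taking its sign. The sampling count, noise scale, and the normalization rule below are illustrative simplifications, not the exact S-FGRM formulation.

```python
import torch

def rescaled_gradient_attack(model, loss_fn, x, y, eps=16 / 255, steps=10,
                             n_samples=5, sigma=0.05):
    """Iterative attack that replaces sign(g) with a rescaled sampled gradient."""
    alpha = eps / steps
    delta = torch.zeros_like(x)
    for _ in range(steps):
        grad = torch.zeros_like(x)
        for _ in range(n_samples):
            x_adv = (x + delta + sigma * torch.randn_like(x)).clamp(0, 1)
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad += torch.autograd.grad(loss, x_adv)[0]
        grad /= n_samples
        # rescale instead of sign: keep the update comparable in scale to a
        # sign step by normalizing with the mean absolute gradient per image
        g = grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        delta = (delta + alpha * g).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```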
arXiv Detail & Related papers (2023-07-06T07:52:42Z)
- Boosting Adversarial Transferability by Achieving Flat Local Maxima [23.91315978193527]
Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives.
In this work, we assume and empirically validate that adversarial examples at a flat local region tend to have good transferability.
We propose an approximation optimization method to simplify the gradient update of the objective function.
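One plausible reading, sketched below under stated assumptions (PyTorch; the exact objective and approximation in the paper may differ): encourage a flat region by penalizing the input-gradient norm, and approximate the gradient of that penalty with a cheap finite difference rather than second-order backpropagation.

```python
import torch

def flat_region_step(model, loss_fn, x_adv, y, alpha, lam=0.1, r=0.01):
    """One attack step toward a flatter local maximum (illustrative only)."""
    # gradient of the attack loss at the current point
    x1 = x_adv.clone().detach().requires_grad_(True)
    g1 = torch.autograd.grad(loss_fn(model(x1), y), x1)[0]
    # finite-difference approximation of d||g||/dx ~ H @ (g / ||g||):
    # re-evaluate the gradient a small step along its own direction
    direction = g1 / g1.flatten(1).norm(dim=1).view(-1, 1, 1, 1).clamp_min(1e-12)
    x2 = (x_adv + r * direction).clone().detach().requires_grad_(True)
    g2 = torch.autograd.grad(loss_fn(model(x2), y), x2)[0]
    grad_norm_grad = (g2 - g1) / r
    # ascend the loss while descending the gradient-norm (flatness) penalty
    update = g1 - lam * grad_norm_grad
    return (x_adv + alpha * update.sign()).detach()
```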
arXiv Detail & Related papers (2023-06-08T14:21:02Z)
- Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration [85.71545080119026]
The Cross-Entropy (CE) loss function is insufficient for learning transferable targeted adversarial examples.
We propose two simple and effective logit calibration methods, which are achieved by downscaling the logits with a temperature factor and an adaptive margin.
Experiments conducted on the ImageNet dataset validate the effectiveness of the proposed methods.
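A hedged sketch of the two calibration ideas as they read here, assuming PyTorch; the temperature value and the adaptive-margin rule are illustrative choices, not necessarily the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def temperature_calibrated_loss(logits, target, temperature=5.0):
    # downscale the logits so the targeted cross-entropy keeps producing
    # gradient signal instead of saturating once the target logit dominates
    return F.cross_entropy(logits / temperature, target)

def margin_calibrated_loss(logits, target):
    # adaptive variant: normalize by the current margin between the top-2 logits
    top2 = logits.topk(2, dim=1).values
    margin = (top2[:, 0] - top2[:, 1]).detach().clamp_min(1e-6).unsqueeze(1)
    return F.cross_entropy(logits / margin, target)
```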
arXiv Detail & Related papers (2023-03-07T06:42:52Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [19.917677500613788]
Gradient-based approaches generally use the $sign$ function to generate perturbations at the end of the process.
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM) to improve the transferability of crafted adversarial examples.
arXiv Detail & Related papers (2022-04-06T15:12:20Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely-used dataset demonstrate the effectiveness of our attack method with a 12.85% higher success rate of transfer attack compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-11-27T07:57:41Z)
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for transfer-based attacks is challenging and remains a research hotspot.
We propose a novel Staircase Sign Method (S$2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
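One way to read a staircase sign, sketched below (assuming PyTorch; the number of stairs and the weights are an interpretation, not the paper's exact scheme): bucket the gradient magnitudes into percentiles and give larger-magnitude entries proportionally larger signed steps instead of a uniform +/-1.

```python
import torch

def staircase_sign(grad, k=4):
    """Replace sign(grad) with k 'stairs' weighted by |grad| percentile."""
    flat = grad.abs().flatten(1)                                # (N, pixels)
    edges = torch.quantile(flat, torch.linspace(0, 1, k + 1, device=grad.device),
                           dim=1)                               # (k+1, N)
    out = torch.zeros_like(grad)
    for i in range(k):
        lo = edges[i].view(-1, 1, 1, 1)
        hi = edges[i + 1].view(-1, 1, 1, 1)
        mask = (grad.abs() >= lo) & (grad.abs() <= hi)
        weight = 2.0 * (i + 1) / k        # smallest bucket ~2/k, largest ~2
        out = torch.where(mask, weight * grad.sign(), out)
    return out

# drop-in use inside an FGSM/I-FGSM style update:  delta += alpha * staircase_sign(g)
```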
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
- Boosting Gradient for White-Box Adversarial Attacks [60.422511092730026]
We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a subset of them to update the misleading gradients.
arXiv Detail & Related papers (2020-10-21T02:13:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.