Rethinking the Backward Propagation for Adversarial Transferability
- URL: http://arxiv.org/abs/2306.12685v3
- Date: Tue, 21 Nov 2023 02:57:09 GMT
- Title: Rethinking the Backward Propagation for Adversarial Transferability
- Authors: Xiaosen Wang, Kangheng Tong, Kun He
- Abstract summary: Transfer-based attacks generate adversarial examples on a surrogate model, which can mislead other black-box models without access to them.
In this work, we identify that non-linear layers truncate the gradient during backward propagation, making the gradient w.r.t. the input image an imprecise reflection of the loss function.
We propose a novel method to increase the relevance between the gradient w.r.t. the input image and the loss function, so as to generate adversarial examples with higher transferability.
- Score: 12.244490573612286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer-based attacks generate adversarial examples on a surrogate model,
which can mislead other black-box models without access to them, making it promising to
attack real-world applications. Recently, several works have been proposed to
boost adversarial transferability, in which the surrogate model is usually
overlooked. In this work, we identify that non-linear layers (e.g., ReLU,
max-pooling, etc.) truncate the gradient during backward propagation, making
the gradient w.r.t. the input image an imprecise reflection of the loss function. We hypothesize
and empirically validate that such truncation undermines the transferability of
adversarial examples. Based on these findings, we propose a novel method called
Backward Propagation Attack (BPA) to increase the relevance between the
gradient w.r.t. the input image and the loss function, so as to generate adversarial
examples with higher transferability. Specifically, BPA adopts a non-monotonic
function as the derivative of ReLU and incorporates softmax with temperature to
smooth the derivative of max-pooling, thereby mitigating the information loss
during the backward propagation of gradients. Empirical results on the ImageNet
dataset demonstrate that not only does our method substantially boost the
adversarial transferability, but it is also general to existing transfer-based
attacks. Code is available at https://github.com/Trustworthy-AI-Group/RPA.
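As a rough illustration of the idea, the sketch below (assuming a PyTorch surrogate; this is not the authors' released code, and the particular non-monotonic ReLU derivative and pooling temperature are placeholder choices) shows how the backward pass of ReLU and max-pooling could be replaced with smoother surrogates while leaving the forward pass exact:

```python
import torch
import torch.nn.functional as F


class SmoothReLU(torch.autograd.Function):
    """Forward is the standard ReLU; backward uses a hand-crafted,
    non-monotonic derivative instead of the hard 0/1 step."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return F.relu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Placeholder non-monotonic derivative: behaves like the step function
        # far from zero but lets some gradient flow near the truncation point.
        d = torch.sigmoid(5.0 * x) + x * torch.exp(-x ** 2)
        return grad_out * d


class SoftMaxPool2d(torch.nn.Module):
    """Forward is exact max-pooling; the gradient is routed through a
    temperature softmax over each pooling window, so it is spread over
    all entries instead of only the argmax."""

    def __init__(self, kernel_size=2, stride=2, temperature=10.0):
        super().__init__()
        self.k, self.s, self.t = kernel_size, stride, temperature

    def forward(self, x):
        n, c, h, w = x.shape
        cols = F.unfold(x, self.k, stride=self.s)          # (n, c*k*k, L)
        cols = cols.view(n, c, self.k * self.k, -1)
        hard = cols.max(dim=2).values                      # exact max value
        soft = (F.softmax(self.t * cols, dim=2) * cols).sum(dim=2)
        out = soft + (hard - soft).detach()                # forward = hard, grad via soft
        oh = (h - self.k) // self.s + 1
        return out.view(n, c, oh, -1)


# Usage sketch: swap these into the surrogate before computing attack gradients,
# e.g. y = SmoothReLU.apply(x) and pool = SoftMaxPool2d(2, 2, temperature=10.0).
```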
Related papers
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks have attracted significant interest for black-box applications.
Existing works essentially optimize a single-level objective w.r.t. the surrogate model.
We propose a bilevel optimization paradigm, which explicitly reformulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- LRS: Enhancing Adversarial Transferability through Lipschitz Regularized Surrogate [8.248964912483912]
The transferability of adversarial examples is of central importance to transfer-based black-box adversarial attacks.
We propose Lipschitz Regularized Surrogate (LRS) for transfer-based black-box attacks.
We evaluate our proposed LRS approach by attacking state-of-the-art standard deep neural networks and defense models.
arXiv Detail & Related papers (2023-12-20T15:37:50Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [18.05924632169541]
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM).
Specifically, we use data rescaling to substitute the sign function without extra computational cost.
Our method could significantly boost the transferability of gradient-based attacks and outperform the state-of-the-art baselines.
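A heavily hedged sketch of the core idea, assuming PyTorch: average the gradient over a few randomly perturbed copies of the input and then rescale it instead of taking its sign. The sampling count, noise scale, and the normalization rule below are illustrative simplifications, not the exact S-FGRM formulation.

```python
import torch

def rescaled_gradient_attack(model, loss_fn, x, y, eps=16 / 255, steps=10,
                             n_samples=5, sigma=0.05):
    """Iterative attack that replaces sign(g) with a rescaled sampled gradient."""
    alpha = eps / steps
    delta = torch.zeros_like(x)
    for _ in range(steps):
        grad = torch.zeros_like(x)
        for _ in range(n_samples):
            x_adv = (x + delta + sigma * torch.randn_like(x)).clamp(0, 1)
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad += torch.autograd.grad(loss, x_adv)[0]
        grad /= n_samples
        # rescale instead of sign: keep the update comparable in scale to a
        # sign step by normalizing with the mean absolute gradient per image
        g = grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        delta = (delta + alpha * g).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```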
arXiv Detail & Related papers (2023-07-06T07:52:42Z)
- Boosting Adversarial Transferability by Achieving Flat Local Maxima [23.91315978193527]
Recently, various adversarial attacks have emerged to boost adversarial transferability from different perspectives.
In this work, we assume and empirically validate that adversarial examples at a flat local region tend to have good transferability.
We propose an approximation optimization method to simplify the gradient update of the objective function.
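One plausible reading, sketched below under stated assumptions (PyTorch; the exact objective and approximation in the paper may differ): encourage a flat region by penalizing the input-gradient norm, and approximate the gradient of that penalty with a cheap finite difference rather than second-order backpropagation.

```python
import torch

def flat_region_step(model, loss_fn, x_adv, y, alpha, lam=0.1, r=0.01):
    """One attack step toward a flatter local maximum (illustrative only)."""
    # gradient of the attack loss at the current point
    x1 = x_adv.clone().detach().requires_grad_(True)
    g1 = torch.autograd.grad(loss_fn(model(x1), y), x1)[0]
    # finite-difference approximation of d||g||/dx ~ H @ (g / ||g||):
    # re-evaluate the gradient a small step along its own direction
    direction = g1 / g1.flatten(1).norm(dim=1).view(-1, 1, 1, 1).clamp_min(1e-12)
    x2 = (x_adv + r * direction).clone().detach().requires_grad_(True)
    g2 = torch.autograd.grad(loss_fn(model(x2), y), x2)[0]
    grad_norm_grad = (g2 - g1) / r
    # ascend the loss while descending the gradient-norm (flatness) penalty
    update = g1 - lam * grad_norm_grad
    return (x_adv + alpha * update.sign()).detach()
```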
arXiv Detail & Related papers (2023-06-08T14:21:02Z)
- Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration [85.71545080119026]
The Cross-Entropy (CE) loss function is insufficient for learning transferable targeted adversarial examples.
We propose two simple and effective logit calibration methods, which are achieved by downscaling the logits with a temperature factor and an adaptive margin.
Experiments conducted on the ImageNet dataset validate the effectiveness of the proposed methods.
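A hedged sketch of the two calibration ideas as they read here, assuming PyTorch; the temperature value and the adaptive-margin rule are illustrative choices, not necessarily the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def temperature_calibrated_loss(logits, target, temperature=5.0):
    # downscale the logits so the targeted cross-entropy keeps producing
    # gradient signal instead of saturating once the target logit dominates
    return F.cross_entropy(logits / temperature, target)

def margin_calibrated_loss(logits, target):
    # adaptive variant: normalize by the current margin between the top-2 logits
    top2 = logits.topk(2, dim=1).values
    margin = (top2[:, 0] - top2[:, 1]).detach().clamp_min(1e-6).unsqueeze(1)
    return F.cross_entropy(logits / margin, target)
```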
arXiv Detail & Related papers (2023-03-07T06:42:52Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [19.917677500613788]
Gradient-based approaches generally use the $sign$ function to generate perturbations at the end of the process.
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM) to improve the transferability of crafted adversarial examples.
arXiv Detail & Related papers (2022-04-06T15:12:20Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely-used dataset demonstrate the effectiveness of our attack method with a 12.85% higher success rate of transfer attack compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-11-27T07:57:41Z)
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for transfer-based attacks is challenging and remains a research hotspot.
We propose a novel Staircase Sign Method (S$2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
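One way to read a staircase sign, sketched below (assuming PyTorch; the number of stairs and the weights are an interpretation, not the paper's exact scheme): bucket the gradient magnitudes into percentiles and give larger-magnitude entries proportionally larger signed steps instead of a uniform +/-1.

```python
import torch

def staircase_sign(grad, k=4):
    """Replace sign(grad) with k 'stairs' weighted by |grad| percentile."""
    flat = grad.abs().flatten(1)                                # (N, pixels)
    edges = torch.quantile(flat, torch.linspace(0, 1, k + 1, device=grad.device),
                           dim=1)                               # (k+1, N)
    out = torch.zeros_like(grad)
    for i in range(k):
        lo = edges[i].view(-1, 1, 1, 1)
        hi = edges[i + 1].view(-1, 1, 1, 1)
        mask = (grad.abs() >= lo) & (grad.abs() <= hi)
        weight = 2.0 * (i + 1) / k        # smallest bucket ~2/k, largest ~2
        out = torch.where(mask, weight * grad.sign(), out)
    return out

# drop-in use inside an FGSM/I-FGSM style update:  delta += alpha * staircase_sign(g)
```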
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
- Boosting Gradient for White-Box Adversarial Attacks [60.422511092730026]
We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a subset of them to update the misleading gradients.
arXiv Detail & Related papers (2020-10-21T02:13:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.