Improving Adversarial Transferability via Intermediate-level
Perturbation Decay
- URL: http://arxiv.org/abs/2304.13410v3
- Date: Thu, 2 Nov 2023 15:19:37 GMT
- Title: Improving Adversarial Transferability via Intermediate-level
Perturbation Decay
- Authors: Qizhang Li, Yiwen Guo, Wangmeng Zuo, Hao Chen
- Abstract summary: We develop a novel intermediate-level method that crafts adversarial examples within a single stage of optimization.
Experimental results show that it outperforms state-of-the-art methods by large margins in attacking various victim models.
- Score: 79.07074710460012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intermediate-level attacks, which attempt to drastically perturb feature
representations along an adversarial direction, have shown favorable performance
in crafting transferable adversarial examples. Existing methods in this
category are normally formulated with two separate stages, where a directional
guide is required to be determined at first and the scalar projection of the
intermediate-level perturbation onto the directional guide is enlarged
thereafter. The obtained perturbation deviates from the guide inevitably in the
feature space, and it is revealed in this paper that such a deviation may lead
to sub-optimal attacks. To address this issue, we develop a novel
intermediate-level method that crafts adversarial examples within a single
stage of optimization. In particular, the proposed method, named
intermediate-level perturbation decay (ILPD), encourages the intermediate-level
perturbation to be in an effective adversarial direction and to possess a great
magnitude simultaneously. In-depth discussion verifies the effectiveness of our
method. Experimental results show that it outperforms state-of-the-art methods
by large margins in attacking various victim models on ImageNet (+10.07% on
average) and CIFAR-10 (+3.88% on average). Our code is at
https://github.com/qizhangli/ILPD-attack.
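For readers who want a concrete picture of the single-stage idea, the sketch below illustrates one plausible way to realize intermediate-level perturbation decay in PyTorch: the perturbation at a chosen intermediate layer is shrunk by a factor gamma in the forward pass, and the classification loss is then maximized on the decayed features. The choice of layer, the decay coefficient, and the update loop are illustrative assumptions rather than the authors' exact implementation, which is available in the repository linked above.
```python
import torch
import torch.nn.functional as F

def ilpd_style_attack(model, layer, x, y, eps=8/255, alpha=1/255, steps=50, gamma=0.5):
    """Single-stage sketch: decay the intermediate-level perturbation by mixing
    adversarial and benign features at `layer` (illustrative, not the authors' code)."""
    x, y = x.clone().detach(), y.clone().detach()

    # Cache the benign (clean) intermediate features once.
    feats = {}
    handle = layer.register_forward_hook(lambda m, i, o: feats.update(out=o))
    with torch.no_grad():
        model(x)
    benign_feat = feats["out"].detach()
    handle.remove()

    # During the attack, replace the layer output with a decayed mixture:
    # f' = benign + gamma * (f_adv - benign), i.e. shrink the feature-space perturbation.
    def decay_hook(module, inputs, output):
        return benign_feat + gamma * (output - benign_feat)

    handle = layer.register_forward_hook(decay_hook)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)          # loss computed on decayed features
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()     # ascend to maximize the loss
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)    # project back into the L_inf ball
        x_adv = torch.clamp(x_adv, 0, 1)
    handle.remove()
    return x_adv
```
Under this reading, shrinking the feature-space perturbation before the loss is computed pressures the input-space update to keep the (undecayed) intermediate-level perturbation both well aligned with an adversarial direction and large in magnitude, matching the goal stated in the abstract.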
Related papers
- Boosting Imperceptibility of Stable Diffusion-based Adversarial Examples Generation with Momentum [13.305800254250789]
We propose a novel framework, Stable Diffusion-based Momentum Integrated Adversarial Examples (SD-MIAE)
It generates adversarial examples that can effectively mislead neural network classifiers while maintaining visual imperceptibility and preserving the semantic similarity to the original class label.
Experimental results demonstrate that SD-MIAE achieves a high misclassification rate of 79%, improving by 35% over the state-of-the-art method.
arXiv Detail & Related papers (2024-10-17T01:22:11Z) - Improving Transferable Targeted Adversarial Attack via Normalized Logit Calibration and Truncated Feature Mixing [26.159434438078968]
We propose two techniques for improving the targeted transferability from the loss and feature aspects.
In previous approaches, logit calibrations primarily focus on the logit margin between the targeted class and the untargeted classes among samples.
We introduce a new normalized logit calibration method that jointly considers the logit margin and the standard deviation of logits.
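As a rough illustration only, a targeted loss of the kind this summary hints at might divide the target-versus-other logit margin by the standard deviation of the logits; the paper's exact calibration may differ, so the function below is a sketch under that assumption.
```python
import torch

def normalized_margin_loss(logits, target, eps=1e-8):
    """Illustrative targeted loss: logit margin divided by the logit standard
    deviation (not necessarily the paper's exact formulation)."""
    z_t = logits.gather(1, target.unsqueeze(1)).squeeze(1)       # target-class logit
    z_other = logits.clone()
    z_other.scatter_(1, target.unsqueeze(1), float("-inf"))      # mask out the target class
    margin = z_t - z_other.max(dim=1).values                     # targeted logit margin
    return -(margin / (logits.std(dim=1) + eps)).mean()          # minimize => enlarge normalized margin
```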
arXiv Detail & Related papers (2024-05-10T09:13:57Z) - An Intermediate-level Attack Framework on The Basis of Linear Regression [89.85593878754571]
This paper substantially extends our work published at ECCV, in which an intermediate-level attack was proposed to improve the transferability of some baseline adversarial examples.
We advocate establishing a direct linear mapping from the intermediate-level discrepancies (between adversarial features and benign features) to the classification prediction loss of the adversarial example.
We show that 1) a variety of linear regression models can all be considered for establishing the mapping, 2) the magnitude of the finally obtained intermediate-level discrepancy is linearly correlated with adversarial transferability, and 3) a further boost in performance can be achieved by performing multiple runs of the baseline attack with ...
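Taking the summary at face value, such a mapping can be sketched by recording pairs of intermediate-level feature discrepancies and prediction losses during the baseline attack, fitting a linear model to them, and using the fitted weights as the directional guide to maximize. The least-squares fit and the function names below are illustrative assumptions, not the framework's actual code.
```python
import torch

def fit_directional_guide(feat_discrepancies, losses):
    """Fit a linear map w from flattened intermediate-level discrepancies (N, D)
    to prediction losses (N,) via ordinary least squares; a ridge fit could be
    substituted. Purely illustrative of the idea in the summary."""
    X = feat_discrepancies.flatten(1)                  # (N, D)
    y = losses.unsqueeze(1)                            # (N, 1)
    w = torch.linalg.lstsq(X, y).solution.squeeze(1)   # (D,)
    return w

def intermediate_level_objective(adv_feat, benign_feat, w):
    # Maximize the projection of the current intermediate-level discrepancy
    # onto the fitted direction w.
    return ((adv_feat - benign_feat).flatten(1) * w).sum(dim=1).mean()
```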
arXiv Detail & Related papers (2022-03-21T03:54:53Z) - Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
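The summary only states that the exact gradient direction, rescaled by a factor, replaces the usual sign operation; a toy single step under that reading (the normalization and step size are assumptions) might look as follows.
```python
import torch
import torch.nn.functional as F

def exact_gradient_step(model, x, y, alpha=1/255, eps=8/255):
    """One FGSM-like step that follows the exact (normalized) gradient direction
    instead of its sign; illustrative reading of the summary, for 4D image input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    direction = grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
    x_adv = x_adv.detach() + alpha * direction         # scaled exact direction
    x_adv = x + torch.clamp(x_adv - x, -eps, eps)      # keep within the L_inf budget
    return torch.clamp(x_adv, 0, 1)
```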
arXiv Detail & Related papers (2021-11-27T07:57:41Z) - Exploring Transferable and Robust Adversarial Perturbation Generation
from the Perspective of Network Hierarchy [52.153866313879924]
The transferability and robustness of adversarial examples are two practical yet important properties for black-box adversarial attacks.
We propose a transferable and robust adversarial generation (TRAP) method.
Our TRAP achieves impressive transferability and high robustness against certain interferences.
arXiv Detail & Related papers (2021-08-16T11:52:41Z) - Boosting Adversarial Transferability through Enhanced Momentum [50.248076722464184]
Deep learning models are vulnerable to adversarial examples crafted by adding human-imperceptible perturbations on benign images.
Various momentum iterative gradient-based methods are shown to be effective to improve the adversarial transferability.
We propose an enhanced momentum iterative gradient-based method to further enhance the adversarial transferability.
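For context on the momentum family referenced here, the following is a compact sketch of the standard momentum iterative baseline (MI-FGSM) that such methods build on; the paper's specific enhancement is not reproduced, and the hyperparameters are illustrative.
```python
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=8/255, alpha=1/255, steps=10, mu=1.0):
    """Standard momentum iterative attack (MI-FGSM); the enhanced-momentum
    paper builds on this update, so this is only the baseline sketch."""
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Accumulate the normalized gradient into the momentum buffer.
        g = mu * g + grad / (grad.abs().flatten(1).mean(dim=1).view(-1, 1, 1, 1) + 1e-12)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)  # project to the L_inf ball
        x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv
```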
arXiv Detail & Related papers (2021-03-19T03:10:32Z) - Hard-label Manifolds: Unexpected Advantages of Query Efficiency for
Finding On-manifold Adversarial Examples [67.23103682776049]
Recent zeroth-order hard-label attacks on image classification models have shown comparable performance to their first-order, gradient-level alternatives.
It was recently shown in the gradient-level setting that regular adversarial examples leave the data manifold, while their on-manifold counterparts are in fact generalization errors.
We propose an information-theoretic argument based on a noisy manifold distance oracle, which leaks manifold information through the adversary's gradient estimate.
arXiv Detail & Related papers (2021-03-04T20:53:06Z) - Yet Another Intermediate-Level Attack [31.055720988792416]
The transferability of adversarial examples across deep neural network (DNN) models is the crux of a spectrum of black-box attacks.
We propose a novel method to enhance the black-box transferability of baseline adversarial examples.
arXiv Detail & Related papers (2020-08-20T09:14:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.