Diversifying the High-level Features for better Adversarial Transferability
- URL: http://arxiv.org/abs/2304.10136v2
- Date: Fri, 15 Sep 2023 02:14:34 GMT
- Title: Diversifying the High-level Features for better Adversarial Transferability
- Authors: Zhiyuan Wang, Zeliang Zhang, Siyuan Liang, Xiaosen Wang
- Abstract summary: We propose diversifying the high-level features (DHF) for more transferable adversarial examples.
DHF perturbs the high-level features by randomly transforming them and mixing them with the features of benign samples.
Empirical evaluations on the ImageNet dataset show that DHF effectively improves the transferability of existing momentum-based attacks.
- Score: 21.545976132427747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given the great threat of adversarial attacks against Deep Neural Networks
(DNNs), numerous works have been proposed to boost transferability to attack
real-world applications. However, existing attacks often utilize advanced
gradient calculation or input transformation but ignore the white-box model.
Inspired by the fact that DNNs are over-parameterized for superior performance,
we propose diversifying the high-level features (DHF) for more transferable
adversarial examples. In particular, DHF perturbs the high-level features by
randomly transforming them and mixing them with the features of benign samples
when calculating the gradient at each iteration. Due to the
redundancy of parameters, such transformation does not affect the
classification performance but helps identify the invariant features across
different models, leading to much better transferability. Empirical evaluations
on the ImageNet dataset show that DHF effectively improves the transferability
of existing momentum-based attacks. Incorporated into input
transformation-based attacks, DHF generates more transferable adversarial
examples and outperforms the baselines by a clear margin when attacking
several defense models, showing its generalization to various attacks and high
effectiveness for boosting transferability. Code is available at
https://github.com/Trustworthy-AI-Group/DHF.
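The per-iteration feature diversification described in the abstract can be sketched roughly as follows (a minimal numpy sketch; the function name, the element-wise scaling range, and the `mix_ratio` parameter are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def diversify_features(adv_feat, benign_feat, mix_ratio=0.1,
                       scale_low=0.9, scale_high=1.1, rng=None):
    """Hypothetical DHF-style diversification of high-level features:
    randomly rescale the adversarial example's features element-wise,
    then mix a random subset of elements with the benign sample's features."""
    rng = np.random.default_rng() if rng is None else rng
    # Random element-wise scaling perturbs the high-level features.
    scaled = adv_feat * rng.uniform(scale_low, scale_high, size=adv_feat.shape)
    # Mix a randomly selected subset of elements with the benign features.
    mask = rng.random(adv_feat.shape) < mix_ratio
    mixed = (1 - mix_ratio) * scaled + mix_ratio * benign_feat
    return np.where(mask, mixed, scaled)
```

In a real attack this diversification would be applied at a chosen intermediate layer of the white-box model (e.g. via a forward hook) before each gradient step, so that the gradients are computed on diversified rather than fixed features.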
Related papers
- Improving Transferable Targeted Attacks with Feature Tuning Mixup [12.707753562907534]
Deep neural networks are vulnerable to adversarial examples that can transfer across different models.
We propose Feature Tuning Mixup (FTM) to enhance targeted attack transferability.
Our method achieves significant improvements over state-of-the-art methods while maintaining low computational cost.
arXiv Detail & Related papers (2024-11-23T13:18:25Z)
- Boosting the Targeted Transferability of Adversarial Examples via Salient Region & Weighted Feature Drop [2.176586063731861]
A prevalent approach for adversarial attacks relies on the transferability of adversarial examples.
We propose a novel framework based on Salient Region & Weighted Feature Drop (SWFD), designed to enhance the targeted transferability of adversarial examples.
arXiv Detail & Related papers (2024-11-11T08:23:37Z)
- Bag of Tricks to Boost Adversarial Transferability [5.803095119348021]
Adversarial examples generated under the white-box setting often exhibit low transferability across different models.
In this work, we find that several tiny changes in the existing adversarial attacks can significantly affect the attack performance.
Based on careful studies of existing adversarial attacks, we propose a bag of tricks to enhance adversarial transferability.
arXiv Detail & Related papers (2024-01-16T17:42:36Z)
- GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data.
In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
- Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving predictions.
We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images.
arXiv Detail & Related papers (2023-11-18T09:10:07Z)
- Structure Invariant Transformation for better Adversarial Transferability [9.272426833639615]
We propose a novel input transformation-based attack, called Structure Invariant Attack (SIA).
SIA applies a random image transformation onto each image block to craft a set of diverse images for gradient calculation.
Experiments on the standard ImageNet dataset demonstrate that SIA exhibits much better transferability than the existing SOTA input transformation-based attacks.
arXiv Detail & Related papers (2023-09-26T06:31:32Z)
- Common Knowledge Learning for Generating Transferable Adversarial Examples [60.1287733223249]
This paper focuses on an important type of black-box attacks, where the adversary generates adversarial examples by a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z)
- Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples [89.85593878754571]
The transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate attacking a Bayesian model to achieve desirable transferability.
Our method outperforms recent state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Towards Transferable Adversarial Attack against Deep Face Recognition [58.07786010689529]
Deep convolutional neural networks (DCNNs) have been found to be vulnerable to adversarial examples.
Transferable adversarial examples can severely hinder the robustness of DCNNs.
We propose DFANet, a dropout-based method used in convolutional layers, which can increase the diversity of surrogate models.
We generate a new set of adversarial face pairs that can successfully attack four commercial APIs without any queries.
arXiv Detail & Related papers (2020-04-13T06:44:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.