Boosting Adversarial Transferability with Learnable Patch-wise Masks
- URL: http://arxiv.org/abs/2306.15931v2
- Date: Mon, 11 Sep 2023 05:53:14 GMT
- Title: Boosting Adversarial Transferability with Learnable Patch-wise Masks
- Authors: Xingxing Wei, Shiji Zhao
- Abstract summary: Adversarial examples have attracted widespread attention in security-critical applications because of their transferability across different models.
In this paper, we argue that the model-specific discriminative regions are a key factor causing overfitting to the source model, and thus reducing the transferability to the target model.
To accurately localize these regions, we present a learnable approach to automatically optimize the mask.
- Score: 16.46210182214551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples have attracted widespread attention in security-critical
applications because of their transferability across different models. Although
many methods have been proposed to boost adversarial transferability, a gap
still exists between their capabilities and practical demands. In this paper, we argue
that the model-specific discriminative regions are a key factor causing
overfitting to the source model, and thus reducing the transferability to the
target model. To address this, a patch-wise mask is used to prune the
model-specific regions when calculating adversarial perturbations. To
accurately localize these regions, we present a learnable approach to
automatically optimize the mask. Specifically, we simulate the target models in
our framework, and adjust the patch-wise mask according to the feedback of the
simulated models. To improve efficiency, the differential evolution (DE)
algorithm is utilized to search for patch-wise masks for a specific image.
During iterative attacks, the learned masks are applied to the image to drop
out the patches related to model-specific regions, thus making the gradients
more generic and improving the adversarial transferability. The proposed
approach is a preprocessing method and can be integrated with existing methods
to further boost the transferability. Extensive experiments on the ImageNet
dataset demonstrate the effectiveness of our method. We incorporate the
proposed approach with existing methods to perform ensemble attacks and achieve
an average success rate of 93.01% against seven advanced defense methods, which
can effectively enhance the state-of-the-art transfer-based attack performance.
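The procedure above lends itself to a compact sketch. The following is an illustrative Python/PyTorch sketch of the two ingredients the abstract describes, not the authors' released code: a binary patch-wise mask zeroes out patches before each gradient step of an I-FGSM-style attack, and a heavily simplified differential-evolution loop scores candidate masks by the loss they induce on held-out "simulated" models. All names, hyperparameters, and the binarized DE update are assumptions.

```python
import torch
import torch.nn.functional as F

def apply_patch_mask(x, mask, patch=16):
    """Zero out image patches where mask == 0.
    x: (B, C, H, W) batch; mask: (H//patch, W//patch) binary tensor."""
    m = mask.repeat_interleave(patch, dim=0).repeat_interleave(patch, dim=1)
    return x * m  # (H, W) broadcasts over the batch and channel dims

def masked_ifgsm(model, x, y, mask, eps=16/255, alpha=2/255, steps=10):
    """I-FGSM where each gradient is computed on the patch-masked image,
    so dropped (model-specific) patches do not shape the perturbation."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_in = apply_patch_mask(x_adv, mask).detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_in), y)
        grad, = torch.autograd.grad(loss, x_in)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def de_search_mask(model_src, models_sim, x, y,
                   grid=14, pop=20, gens=5, drop=0.1):
    """Simplified differential evolution over binary patch masks: mutate,
    binarize, and keep a trial mask when it raises the loss on the held-out
    simulated models (no crossover; the paper's DE details may differ)."""
    population = (torch.rand(pop, grid, grid) > drop).float()  # 1 = keep patch

    def fitness(m):
        x_adv = masked_ifgsm(model_src, x, y, m)
        with torch.no_grad():  # transferability proxy on simulated models
            return sum(F.cross_entropy(f(x_adv), y).item() for f in models_sim)

    scores = [fitness(m) for m in population]
    for _ in range(gens):
        for i in range(pop):
            a, b, c = population[torch.randperm(pop)[:3]]
            trial = ((a + 0.5 * (b - c)) > 0.5).float()  # DE/rand/1 + binarize
            s = fitness(trial)
            if s > scores[i]:
                population[i], scores[i] = trial, s
    return population[max(range(pop), key=lambda i: scores[i])]
```

Here `models_sim` stands in for the simulated target models that provide the DE feedback; in practice a mask is searched per image, and the final attack runs with the best mask found.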
Related papers
- Imperceptible Face Forgery Attack via Adversarial Semantic Mask [59.23247545399068]
We propose an Adversarial Semantic Mask Attack framework (ASMA) which can generate adversarial examples with good transferability and invisibility.
Specifically, we propose a novel adversarial semantic mask generative model, which can constrain generated perturbations in local semantic regions for good stealthiness.
arXiv Detail & Related papers (2024-06-16T10:38:11Z)
- Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training [33.39585710223628]
Salience-Based Adaptive Masking improves pre-training performance of MIM approaches by prioritizing token salience.
We show that our method significantly improves over the state-of-the-art in mask-based pre-training on the ImageNet-1K dataset.
arXiv Detail & Related papers (2024-04-12T08:38:51Z)
- Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z)
- Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
Deep neural networks are vulnerable to adversarial examples, posing a threat to the models' applications and raising security concerns.
We propose a targeted attack method named Similar Target (ST).
arXiv Detail & Related papers (2023-08-21T14:16:36Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
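As a rough illustration of how such masked counterfactual samples might be built (a sketch under assumed details: 16-pixel patches, a precomputed saliency map, and refilling from a second image; function names are hypothetical, not the paper's code):

```python
import torch
import torch.nn.functional as F

def counterfactual_masked_image(x, x_fill, saliency, patch=16, k=50):
    """Mask the k most salient patches of x and refill them from x_fill.
    x, x_fill: (B, C, H, W) images; saliency: (B, 1, H, W) relevance map."""
    B, _, H, W = x.shape
    g = H // patch
    s = F.avg_pool2d(saliency, patch).flatten(1)   # per-patch saliency, (B, g*g)
    drop = s.topk(k, dim=1).indices                # patches to replace
    mask = torch.ones(B, g * g, device=x.device)
    mask.scatter_(1, drop, 0.0)                    # 0 marks a replaced patch
    mask = (mask.view(B, 1, g, g)
                .repeat_interleave(patch, dim=2)
                .repeat_interleave(patch, dim=3))
    return x * mask + x_fill * (1 - mask)          # counterfactual sample
```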
- Frequency Domain Model Augmentation for Adversarial Attack [91.36850162147678]
For black-box attacks, the gap between the substitute model and the victim model is usually large.
We propose a novel spectrum simulation attack to craft more transferable adversarial examples against both normally trained and defense models.
arXiv Detail & Related papers (2022-07-12T08:26:21Z)
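A minimal sketch of one spectrum-simulation draw, assuming the general recipe of adding noise and randomly rescaling the 2-D DCT spectrum (the paper's exact transform and hyperparameters may differ):

```python
import numpy as np
import torch
from scipy.fft import dctn, idctn  # n-D DCT, applied over the spatial axes

def spectrum_transform(x, rho=0.5, sigma=16/255):
    """Return one randomly spectrum-transformed copy of x (B, C, H, W);
    attack gradients would be averaged over several such draws."""
    x_np = (x + sigma * torch.randn_like(x)).cpu().numpy()
    spec = dctn(x_np, axes=(-2, -1), norm="ortho")
    scale = np.random.uniform(1 - rho, 1 + rho, size=x_np.shape)
    x_t = idctn(spec * scale, axes=(-2, -1), norm="ortho")
    return torch.from_numpy(x_t).float().clamp(0, 1)
```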
- Improving Transferability of Adversarial Patches on Face Recognition with Generative Models [43.51625789744288]
We evaluate the robustness of face recognition models using adversarial patches based on transferability.
We show that the gaps between the responses of substitute models and the target models decrease dramatically, exhibiting better transferability.
arXiv Detail & Related papers (2021-06-29T02:13:05Z)
- Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models [54.569004548170824]
We show that careful masking strategies can bridge the knowledge gap of masked language models.
We propose an effective training strategy that adversarially masks out those tokens which are harder to reconstruct by the underlying masked language model.
arXiv Detail & Related papers (2020-10-05T01:49:47Z)
- Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
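As a sketch of what a multi-layer feature-space objective can look like (layer choice, equal weighting, and the untargeted direction are assumptions here, not the paper's exact loss):

```python
import torch

def multi_layer_features(model, layers, x):
    """Collect the activations of the given modules during one forward pass."""
    feats = []
    hooks = [l.register_forward_hook(lambda m, i, o: feats.append(o))
             for l in layers]
    model(x)
    for h in hooks:
        h.remove()
    return feats

def feature_hierarchy_loss(model, layers, x_adv, x_clean):
    """Push the adversarial example's intermediate activations away from the
    clean image's at several depths; maximizing this perturbs the whole
    feature hierarchy rather than only the output logits."""
    f_adv = multi_layer_features(model, layers, x_adv)
    with torch.no_grad():
        f_clean = multi_layer_features(model, layers, x_clean)
    return sum(torch.norm(a - c) for a, c in zip(f_adv, f_clean))
```

For the targeted setting the paper considers, one would instead pull these features toward those of a target-class image.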
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.