DifAttack++: Query-Efficient Black-Box Adversarial Attack via Hierarchical Disentangled Feature Space in Cross-Domain
- URL: http://arxiv.org/abs/2406.03017v3
- Date: Mon, 1 Jul 2024 04:36:08 GMT
- Title: DifAttack++: Query-Efficient Black-Box Adversarial Attack via Hierarchical Disentangled Feature Space in Cross-Domain
- Authors: Jun Liu, Jiantao Zhou, Jiandian Zeng, Jinyu Tian, Zheng Li
- Abstract summary: This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (ASR) and good generalizability.
We design a novel attack method based on a hierarchical DIsentangled Feature space, called DifAttack++.
- Score: 23.722737138113203
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (\textbf{ASR}) and good generalizability. We design a novel attack method based on a hierarchical DIsentangled Feature space, called \textbf{DifAttack++}, which differs significantly from the existing ones operating over the entire feature space. Specifically, DifAttack++ firstly disentangles an image's latent feature into an Adversarial Feature (\textbf{AF}) and a Visual Feature (\textbf{VF}) via an autoencoder equipped with our specially designed Hierarchical Decouple-Fusion (\textbf{HDF}) module, where the AF dominates the adversarial capability of an image, while the VF largely determines its visual appearance. We train such two autoencoders for the clean and adversarial image domains (i.e., cross-domain) respectively to achieve image reconstructions and feature disentanglement, by using pairs of clean images and their Adversarial Examples (\textbf{AE}s) generated from available surrogate models via white-box attack methods. Eventually, in the black-box attack stage, DifAttack++ iteratively optimizes the AF according to the query feedback from the victim model until a successful AE is generated, while keeping the VF unaltered. Extensive experimental results demonstrate that our DifAttack++ leads to superior ASR and query efficiency than state-of-the-art methods, meanwhile exhibiting much better visual quality of AEs. The code is available at https://github.com/csjunjun/DifAttack.git.
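The black-box stage described in the abstract — iteratively optimizing only the Adversarial Feature (AF) under query feedback while leaving the Visual Feature (VF) unaltered — can be sketched as a minimal random-search loop. Everything below is a hypothetical stand-in, not the paper's actual components: the identity "autoencoder" (`encode`/`decode`), the linear `victim_score`, and all shapes and thresholds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # toy flattened image size (assumption)

def encode(image):
    """Hypothetical disentangling encoder: split the latent into an
    Adversarial Feature (AF) and a Visual Feature (VF)."""
    return image[:DIM // 2], image[DIM // 2:]

def decode(af, vf):
    """Hypothetical decoder: reconstruct an image from (AF, VF)."""
    return np.concatenate([af, vf])

def victim_score(image):
    """Placeholder for the queried victim model's score; lower means
    closer to misclassification in this toy setup."""
    w = np.linspace(1.0, 2.0, DIM)
    return float(w @ image)

def difattack_like(image, budget=200, step=0.1, threshold=0.0):
    """Random search over the AF only; the VF is never modified."""
    af, vf = encode(image)
    best = victim_score(decode(af, vf))
    for _ in range(budget):
        candidate = af + step * rng.standard_normal(af.shape)
        score = victim_score(decode(candidate, vf))  # one query
        if score < best:
            af, best = candidate, score
        if best < threshold:  # toy "successful AE" criterion
            break
    return decode(af, vf), best
```

Because only the AF half of the latent is mutated, the decoded image's VF component is bit-identical to the input's, which is the mechanism the abstract credits for the good visual quality of the resulting AEs.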
Related papers
- Ask, Attend, Attack: An Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models [29.1607388062023]
This paper focuses on a challenging scenario: decision-based black-box targeted attacks where the attackers only have access to the final output text and aim to perform targeted attacks.
A three-stage process Ask, Attend, Attack, called AAA, is proposed to coordinate with the solver.
Experimental results on transformer-based and CNN+RNN-based image-to-text models confirmed the effectiveness of our proposed AAA.
arXiv Detail & Related papers (2024-08-16T19:35:06Z)
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We examine how well state-of-the-art defense methods address these two issues.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z)
- Unsegment Anything by Simulating Deformation [67.10966838805132]
"Anything Unsegmentable" is a task to grant any image "the right to be unsegmented".
We aim to achieve transferable adversarial attacks against all prompt-based segmentation models.
Our approach focuses on disrupting image encoder features to achieve prompt-agnostic attacks.
arXiv Detail & Related papers (2024-04-03T09:09:42Z)
- DifAttack: Query-Efficient Black-Box Attack via Disentangled Feature Space [6.238161846680642]
This work investigates efficient score-based black-box adversarial attacks with a high Attack Success Rate (ASR) and good generalizability.
We design a novel attack method based on a Disentangled Feature space, called DifAttack, which differs significantly from the existing ones operating over the entire feature space.
arXiv Detail & Related papers (2023-09-26T00:15:13Z)
- Semantic Adversarial Attacks via Diffusion Models [30.169827029761702]
Semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features.
We propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models.
Our approaches achieve approximately 100% attack success rate in multiple settings, with a best FID of 36.61.
arXiv Detail & Related papers (2023-09-14T02:57:48Z)
- Inter-frame Accelerate Attack against Video Interpolation Models [73.28751441626754]
We apply adversarial attacks to VIF models and find that the VIF models are very vulnerable to adversarial examples.
We propose a novel attack method named Inter-frame Accelerate Attack (IAA) that initializes the iterations with the perturbation of the previous adjacent frame.
It is shown that our method can improve attack efficiency greatly while achieving comparable attack performance with traditional methods.
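The acceleration idea summarized above — warm-starting each frame's attack from the previous adjacent frame's perturbation instead of starting from zero — can be sketched as follows. The `loss_grad` placeholder and all constants are assumptions; a real attack would differentiate through the video interpolation model.

```python
import numpy as np

def loss_grad(frame, delta):
    """Placeholder attack gradient; a real IAA-style attack would
    backpropagate through the video interpolation model."""
    return np.sign(frame + delta)

def attack_frame(frame, init_delta, steps=5, alpha=0.01, eps=0.05):
    """PGD-style inner loop, warm-started from init_delta."""
    delta = np.clip(init_delta, -eps, eps)
    for _ in range(steps):
        delta = np.clip(delta + alpha * loss_grad(frame, delta), -eps, eps)
    return delta

def iaa_like(frames, eps=0.05):
    """Attack frames in order, initializing each frame's perturbation
    with the previous frame's result (the warm start is the speedup)."""
    deltas, delta = [], np.zeros_like(frames[0])
    for frame in frames:
        delta = attack_frame(frame, init_delta=delta, eps=eps)
        deltas.append(delta)
    return deltas
```

The warm start pays off when adjacent frames are similar, since the inner loop then needs far fewer steps to reconverge.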
arXiv Detail & Related papers (2023-05-11T03:08:48Z)
- SAIF: Sparse Adversarial and Imperceptible Attack Framework [7.025774823899217]
We propose a novel attack technique called the Sparse Adversarial and Imperceptible Attack Framework (SAIF).
Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers.
SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
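The constraint SAIF targets — low-magnitude perturbations confined to a small number of pixels — can be illustrated with a simple top-k projection. This is only an illustration of the sparse-plus-imperceptible constraint; the gradient input and parameters here are hypothetical and this is not SAIF's actual optimization procedure.

```python
import numpy as np

def sparse_perturbation(image, grad, k=10, eps=0.03):
    """Perturb only the k pixels with the largest gradient magnitude,
    each by at most eps (sparse + low-magnitude, as in the constraint
    described above; not SAIF's actual optimizer)."""
    flat_grad = grad.ravel()
    idx = np.argpartition(np.abs(flat_grad), -k)[-k:]  # top-k pixels
    delta = np.zeros(image.size)
    delta[idx] = eps * np.sign(flat_grad[idx])
    return (image.ravel() + delta).reshape(image.shape)
```

The result differs from the input at exactly k coordinates, each by at most eps, which is what makes such attacks both sparse and hard to see.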
arXiv Detail & Related papers (2022-12-14T20:28:50Z)
- Attackar: Attack of the Evolutionary Adversary [0.0]
This paper introduces Attackar, an evolutionary, score-based, black-box attack.
Attackar is based on a novel objective function that can be used in gradient-free optimization problems.
Our results demonstrate the superior performance of Attackar, both in terms of accuracy score and query efficiency.
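A score-based, gradient-free query loop in the spirit described above can be sketched as a (1+1) evolution strategy. This is a generic ES on a clipped perturbation, not Attackar's actual objective function or evolutionary operators; `score_fn`, `eps`, and `sigma` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def evolve_attack(x, score_fn, eps=0.1, sigma=0.05, budget=300):
    """(1+1) evolution strategy: mutate the perturbation and keep the
    mutant only if the queried score improves. Uses score feedback
    alone, so no gradients of the victim model are needed."""
    delta = np.zeros_like(x)
    best = score_fn(x + delta)
    for _ in range(budget):
        cand = np.clip(delta + sigma * rng.standard_normal(x.shape),
                       -eps, eps)
        score = score_fn(x + cand)  # one query to the victim model
        if score < best:
            delta, best = cand, score
    return x + delta, best
```

Each loop iteration costs exactly one query, so `budget` directly bounds the query count — the quantity these score-based attacks compete on.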
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2020-09-15T12:49:03Z)
- Decision-based Universal Adversarial Attack [55.76371274622313]
In black-box setting, current universal adversarial attack methods utilize substitute models to generate the perturbation.
We propose an efficient Decision-based Universal Attack (DUAttack)
The effectiveness of DUAttack is validated through comparisons with other state-of-the-art attacks.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
- Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack against mainstream normally trained and defended models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.