Saliency Attack: Towards Imperceptible Black-box Adversarial Attack
- URL: http://arxiv.org/abs/2206.01898v1
- Date: Sat, 4 Jun 2022 03:56:07 GMT
- Title: Saliency Attack: Towards Imperceptible Black-box Adversarial Attack
- Authors: Zeyu Dai, Shengcai Liu, Ke Tang, Qing Li
- Abstract summary: We propose to restrict perturbations to a small salient region to generate adversarial examples that can hardly be perceived.
We also propose the Saliency Attack, a new black-box attack aiming to refine the perturbations in the salient region to achieve even better imperceptibility.
- Score: 35.897117965803666
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep neural networks are vulnerable to adversarial examples, even in the
black-box setting where the attacker has access only to the model output.
Recent studies have devised effective black-box attacks with high query
efficiency. However, such performance is often accompanied by compromises in
attack imperceptibility, hindering the practical use of these approaches. In
this paper, we propose to restrict the perturbations to a small salient region
to generate adversarial examples that can hardly be perceived. This approach is
readily compatible with many existing black-box attacks and can significantly
improve their imperceptibility with little degradation in attack success rate.
Further, we propose the Saliency Attack, a new black-box attack aiming to
refine the perturbations in the salient region to achieve even better
imperceptibility. Extensive experiments show that compared to the
state-of-the-art black-box attacks, our approach achieves much better
imperceptibility scores, including most apparent distortion (MAD), $L_0$ and
$L_2$ distances, and also obtains significantly higher success rates judged by
a human-like threshold on MAD. Importantly, the perturbations generated by our
approach are interpretable to some extent. Finally, it is also demonstrated to
be robust to different detection-based defenses.
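
The core idea in the abstract, restricting a perturbation to a small salient region, can be illustrated with a minimal NumPy sketch. This is not the authors' Saliency Attack: the binary mask, the uniform perturbation of 0.05, the 4x4 image, and the function names are all hypothetical choices made for illustration. Here $L_0$ is taken as the number of changed array elements, which is one common convention and an assumption on our part.

```python
import numpy as np


def restrict_to_salient(perturbation, saliency_mask):
    """Zero out the perturbation outside the salient region.

    perturbation: float array of shape (H, W, C).
    saliency_mask: array of shape (H, W), 1 inside the salient region, 0 outside.
    (Hypothetical helper; the paper's actual refinement procedure is not shown.)
    """
    return perturbation * saliency_mask[..., None]


def l0_distance(x, x_adv):
    """Number of array elements that differ between x and x_adv."""
    return int(np.count_nonzero(x_adv - x))


def l2_distance(x, x_adv):
    """Euclidean norm of the difference between x and x_adv."""
    return float(np.linalg.norm((x_adv - x).ravel()))


# Toy example: a flat gray 4x4 RGB image with a 2x2 salient patch in the center.
image = np.full((4, 4, 3), 0.5)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0

full_perturbation = np.full((4, 4, 3), 0.05)  # assumed perturbation, pixel range [0, 1]
masked_perturbation = restrict_to_salient(full_perturbation, mask)

adv_full = np.clip(image + full_perturbation, 0.0, 1.0)
adv_masked = np.clip(image + masked_perturbation, 0.0, 1.0)

print("L0 full / masked:", l0_distance(image, adv_full), l0_distance(image, adv_masked))
print("L2 full / masked:", l2_distance(image, adv_full), l2_distance(image, adv_masked))
```

In this toy case the mask cuts the number of modified elements from 48 to 12, which is the kind of $L_0$ and $L_2$ reduction the abstract attributes to perturbing only the salient region. Whether the masked example still fools a model depends on the classifier and is not modeled here.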
Related papers
- STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario [50.37501379058119]
We propose the Spatial Transform Black-box Attack (STBA) to craft formidable adversarial examples in the query-limited scenario.
We show that STBA could effectively improve the imperceptibility of the adversarial examples and remarkably boost the attack success rate under query-limited settings.
arXiv Detail & Related papers (2024-03-30T13:28:53Z)
- Boosting Black-Box Adversarial Attacks with Meta Learning [0.0]
We propose a hybrid attack method which trains meta adversarial perturbations (MAPs) on surrogate models and performs black-box attacks by estimating gradients of the models.
Our method not only improves the attack success rate but also reduces the number of queries compared to other methods.
arXiv Detail & Related papers (2022-03-28T09:32:48Z)
- Adversarial training may be a double-edged sword [50.09831237090801]
We show that some geometric consequences of adversarial training on the decision boundary of deep networks give an edge to certain types of black-box attacks.
In particular, we define a metric called robustness gain to show that while adversarial training is an effective method to dramatically improve the robustness in white-box scenarios, it may not provide such a good robustness gain against the more realistic decision-based black-box attacks.
arXiv Detail & Related papers (2021-07-24T19:09:16Z)
- Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z)
- Adversarial example generation with AdaBelief Optimizer and Crop Invariance [8.404340557720436]
Adversarial attacks can be an important method to evaluate and select robust models in safety-critical applications.
We propose AdaBelief Iterative Fast Gradient Method (ABI-FGM) and Crop-Invariant attack Method (CIM) to improve the transferability of adversarial examples.
Our method has higher success rates than state-of-the-art gradient-based attack methods.
arXiv Detail & Related papers (2021-02-07T06:00:36Z)
- Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z)
- Perception Improvement for Free: Exploring Imperceptible Black-box Adversarial Attacks on Image Classification [27.23874129994179]
White-box adversarial attacks can fool neural networks with small perturbations, especially for large images.
Keeping successful adversarial perturbations imperceptible is especially challenging for transfer-based black-box adversarial attacks.
We propose structure-aware adversarial attacks by generating adversarial images based on psychological perceptual models.
arXiv Detail & Related papers (2020-10-30T07:17:12Z)
- AdvMind: Inferring Adversary Intent of Black-Box Attacks [66.19339307119232]
We present AdvMind, a new class of estimation models that infer the adversary intent of black-box adversarial attacks in a robust manner.
On average, AdvMind detects the adversary intent with over 75% accuracy after observing fewer than 3 query batches.
arXiv Detail & Related papers (2020-06-16T22:04:31Z)
- Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent [92.4348499398224]
Black-box adversarial attack methods have received special attention owing to their practicality and simplicity.
We propose a zeroth-order natural gradient descent (ZO-NGD) method to design the adversarial attacks.
ZO-NGD can achieve significantly lower model query complexity than state-of-the-art attack methods.
arXiv Detail & Related papers (2020-02-18T21:48:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.