SAIF: Sparse Adversarial and Imperceptible Attack Framework
- URL: http://arxiv.org/abs/2212.07495v2
- Date: Wed, 6 Dec 2023 10:55:40 GMT
- Title: SAIF: Sparse Adversarial and Imperceptible Attack Framework
- Authors: Tooba Imtiaz, Morgan Kohler, Jared Miller, Zifeng Wang, Mario Sznaier,
Octavia Camps, Jennifer Dy
- Abstract summary: We propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF)
Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers.
SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
- Score: 7.025774823899217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks hamper the decision-making ability of neural networks by
perturbing the input signal. The addition of calculated small distortion to
images, for instance, can deceive a well-trained image classification network.
In this work, we propose a novel attack technique called Sparse Adversarial and
Interpretable Attack Framework (SAIF). Specifically, we design imperceptible
attacks that contain low-magnitude perturbations at a small number of pixels
and leverage these sparse attacks to reveal the vulnerability of classifiers.
We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously
optimize the attack perturbations for bounded magnitude and sparsity with
$O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly
imperceptible and interpretable adversarial examples, and outperforms
state-of-the-art sparse attack methods on the ImageNet dataset.
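The optimization described above lends itself to a compact conditional-gradient loop. Below is a minimal, hypothetical PyTorch sketch of one such Frank-Wolfe attack for a single image: the linear minimization oracle places $\pm\epsilon$ on the $k$ pixels with the largest gradient magnitude, keeping every iterate inside the $\ell_\infty$ ball and a sparsity budget. This is a simplification of SAIF (the paper factorizes the perturbation into a magnitude term and a sparsity mask and optimizes both); the model, eps, k, and step schedule are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def frank_wolfe_sparse_attack(model, x, y, eps=0.1, k=100, steps=50):
    """Hypothetical Frank-Wolfe (conditional gradient) attack sketch.

    x: single image of shape (1, C, H, W) in [0, 1]; y: true label.
    Each LMO step is k-sparse and l_inf-bounded by eps; note that the
    support can still accumulate across iterations, unlike SAIF's
    explicit sparsity-mask formulation.
    """
    model.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    for t in range(steps):
        loss = F.cross_entropy(model(x + delta), y)   # untargeted: maximize the loss
        grad, = torch.autograd.grad(loss, delta)

        # Linear minimization oracle over {d : ||d||_inf <= eps, at most k nonzeros}:
        # put +/- eps on the k coordinates with the largest |gradient|.
        flat = grad.flatten()
        idx = flat.abs().topk(k).indices
        s = torch.zeros_like(flat)
        s[idx] = eps * flat[idx].sign()
        s = s.view_as(delta)

        gamma = 2.0 / (t + 2.0)                       # standard Frank-Wolfe step size
        delta = ((1 - gamma) * delta + gamma * s).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```

Unlike projected-gradient attacks, each update is projection-free: it is a convex combination of the current perturbation and an extreme point of the constraint set.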
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99\%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
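As a rough illustration of this detection setting, here is a hypothetical stateful-detection sketch: query-based attacks issue many near-duplicate queries, so a new query whose embedding is very close to a recently seen one is flagged. ACPT obtains the embeddings from an adversarially prompt-tuned CLIP image encoder; the generic `encoder`, history length, and threshold below are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

class QueryAttackDetector:
    """Sketch of stateful detection of query-based attacks.

    `encoder` is any frozen image encoder returning one embedding per image
    (ACPT fine-tunes a CLIP image encoder; here it is just a placeholder).
    """
    def __init__(self, encoder, history=200, threshold=0.95):
        self.encoder = encoder
        self.history = history        # number of past query embeddings to keep
        self.threshold = threshold    # cosine-similarity alarm threshold (assumed value)
        self.bank = []

    @torch.no_grad()
    def check(self, image):
        z = F.normalize(self.encoder(image).flatten(1), dim=1)   # (1, d)
        flagged = False
        if self.bank:
            sims = torch.cat(self.bank) @ z.t()                  # cosine similarities to past queries
            flagged = bool(sims.max() > self.threshold)          # near-duplicate query => likely attack
        self.bank.append(z)
        self.bank = self.bank[-self.history:]
        return flagged
```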
- IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks [16.577595936609665]
We introduce a novel approach to counter adversarial attacks, namely, image resampling.
Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation.
We show that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
arXiv Detail & Related papers (2023-10-18T11:19:32Z)
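A minimal sketch of the resampling idea, assuming a simple randomly jittered bilinear sampling grid; IRAD itself reconstructs the image from an implicit continuous representation and predicts the per-pixel offsets, which is not reproduced here.

```python
import torch
import torch.nn.functional as F

def resample(image, jitter=0.01):
    """Sketch of an image-resampling transform: re-render the image by
    bilinear sampling on a slightly perturbed coordinate grid.
    `image` has shape (N, C, H, W); `jitter` is an assumed offset scale.
    """
    n, c, h, w = image.shape
    # identity sampling grid in normalized [-1, 1] coordinates
    theta = torch.eye(2, 3).unsqueeze(0).repeat(n, 1, 1)
    grid = F.affine_grid(theta, size=(n, c, h, w), align_corners=False)
    grid = grid + jitter * torch.randn_like(grid)     # geometric perturbation of sample points
    return F.grid_sample(image, grid, mode="bilinear",
                         padding_mode="border", align_corners=False)
```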
- Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP).
MAP causes natural images to be misclassified with high probability after being updated through only a one-step gradient ascent update.
We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
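A short sketch, under assumed step sizes, of how such a meta perturbation could be used at test time: the pre-trained perturbation is adapted to the new image with a single gradient-ascent step before checking whether the prediction flips.

```python
import torch
import torch.nn.functional as F

def one_step_adapted_attack(model, x, y, v, alpha=2/255, eps=8/255):
    """Adapt a pre-trained meta perturbation `v` (image- and model-agnostic)
    with one FGSM-style step, as described in the abstract above.
    alpha and eps are assumed values for illustration.
    """
    model.eval()
    delta = v.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)   # one-step update + l_inf clipping
    return (x + delta).clamp(0, 1).detach()
```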
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
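A toy sketch of the generative-attack idea: a feed-forward encoder-decoder maps the input image directly to a bounded perturbation, so crafting an adversarial example needs only one forward pass and no discriminator. The architecture and bound are illustrative assumptions, not the SSAE design (which adds a symmetric saliency branch and tailored losses).

```python
import torch
import torch.nn as nn

class PerturbationAutoEncoder(nn.Module):
    """Toy stand-in for an auto-encoder that emits a bounded adversarial
    perturbation for an input image in [0, 1] (hypothetical architecture)."""
    def __init__(self, eps=8/255):
        super().__init__()
        self.eps = eps
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        delta = self.eps * self.decoder(self.encoder(x))   # perturbation bounded by eps
        return (x + delta).clamp(0, 1)
```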
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by only perturbing a few pixels.
Recent efforts combine the sparsity constraint with an additional $\ell_\infty$ bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Transferable Sparse Adversarial Attack [62.134905824604104]
We introduce a generator architecture to alleviate the overfitting issue and thus efficiently craft transferable sparse adversarial examples.
Our method achieves superior inference speed, 700$\times$ faster than other optimization-based methods.
arXiv Detail & Related papers (2021-05-31T06:44:58Z)
- Error Diffusion Halftoning Against Adversarial Examples [85.11649974840758]
Adversarial examples contain carefully crafted perturbations that can fool deep neural networks into making wrong predictions.
We propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples.
arXiv Detail & Related papers (2021-01-23T07:55:02Z)
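A minimal NumPy sketch of error-diffusion halftoning using the classic Floyd-Steinberg kernel as the input transformation; the paper's specific halftoning scheme and its combination with adversarial training are not reproduced.

```python
import numpy as np

def floyd_steinberg_halftone(gray):
    """Floyd-Steinberg error-diffusion halftoning of a 2-D float array in [0, 1]."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0           # binarize the pixel
            img[y, x] = new
            err = old - new                            # diffuse the quantization error
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return img
```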
- Blurring Fools the Network -- Adversarial Attacks by Feature Peak Suppression and Gaussian Blurring [7.540176446791261]
We propose an adversarial attack named peak suppression (PS), which suppresses the values of peak elements in the features of the data.
Experiment results show that PS and well-designed Gaussian blurring can form adversarial attacks that completely change the classification results of a well-trained target network.
arXiv Detail & Related papers (2020-12-21T15:47:14Z)
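A small sketch of the Gaussian-blurring half of this attack, assuming a black-box `predict` callable that returns a class id: blur with increasing strength until the label flips. The peak-suppression component is omitted and the sigma schedule is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_until_misclassified(predict, image, label, sigmas=(0.5, 1.0, 1.5, 2.0, 3.0)):
    """Progressively blur an HxWx3 image and stop when the classifier's
    prediction no longer matches `label`."""
    for sigma in sigmas:
        blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))  # blur spatial dims only
        if predict(blurred) != label:
            return blurred, sigma                                   # attack succeeded
    return None, None
```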
- Learning to Attack with Fewer Pixels: A Probabilistic Post-hoc Framework for Refining Arbitrary Dense Adversarial Attacks [21.349059923635515]
Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks.
We propose a probabilistic post-hoc framework that refines given dense attacks by significantly reducing the number of perturbed pixels.
Our framework performs adversarial attacks much faster than existing sparse attacks.
arXiv Detail & Related papers (2020-10-13T02:51:10Z)
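A sketch of post-hoc sparsification with a simple magnitude heuristic: keep only the k most strongly perturbed pixels of the dense adversarial example and verify that the attack still succeeds. The paper instead learns a probabilistic distribution over which pixels to keep, so the heuristic, `predict` callable, and budgets below are assumptions.

```python
import numpy as np

def refine_dense_attack(predict, x, x_adv, label, budgets=(4096, 1024, 256, 64)):
    """Keep only the k largest-magnitude perturbed pixels of a dense
    adversarial example x_adv (HxWx3) and check the attack still flips
    the classifier away from `label`."""
    delta = x_adv - x
    mag = np.abs(delta).sum(axis=-1)                     # per-pixel perturbation magnitude (H, W)
    order = np.argsort(mag, axis=None)[::-1]             # pixel indices sorted by importance
    best = x_adv
    for k in budgets:                                    # decreasing pixel budgets
        mask = np.zeros(mag.size, dtype=bool)
        mask[order[:k]] = True
        mask = mask.reshape(mag.shape)[..., None]         # broadcast over channels
        candidate = np.where(mask, x_adv, x)
        if predict(candidate) != label:                   # still adversarial with fewer pixels
            best = candidate
        else:
            break
    return best
```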
- TensorShield: Tensor-based Defense Against Adversarial Attacks on Images [7.080154188969453]
Recent studies have demonstrated that machine learning approaches like deep neural networks (DNNs) are easily fooled by adversarial attacks.
In this paper, we utilize tensor decomposition techniques as a preprocessing step to find a low-rank approximation of images which can significantly discard high-frequency perturbations.
arXiv Detail & Related papers (2020-02-18T00:39:49Z)
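A minimal sketch of low-rank preprocessing via per-channel truncated SVD; TensorShield uses tensor decompositions (e.g., Tucker or CP) over the whole image tensor, so the per-channel SVD here is only an illustrative stand-in.

```python
import numpy as np

def low_rank_approx(image, rank=32):
    """Rank-truncated reconstruction of each color channel of an HxWx3
    float image in [0, 1]; high-frequency components, which typically
    carry the adversarial perturbation, are discarded."""
    out = np.empty_like(image)
    for c in range(image.shape[-1]):
        u, s, vt = np.linalg.svd(image[..., c], full_matrices=False)
        out[..., c] = (u[:, :rank] * s[:rank]) @ vt[:rank]   # keep the top `rank` singular values
    return np.clip(out, 0.0, 1.0)
```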
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.