SAIF: Sparse Adversarial and Imperceptible Attack Framework
- URL: http://arxiv.org/abs/2212.07495v2
- Date: Wed, 6 Dec 2023 10:55:40 GMT
- Title: SAIF: Sparse Adversarial and Imperceptible Attack Framework
- Authors: Tooba Imtiaz, Morgan Kohler, Jared Miller, Zifeng Wang, Mario Sznaier,
Octavia Camps, Jennifer Dy
- Abstract summary: We propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF)
Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers.
SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
- Score: 7.025774823899217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks hamper the decision-making ability of neural networks by
perturbing the input signal. The addition of calculated small distortion to
images, for instance, can deceive a well-trained image classification network.
In this work, we propose a novel attack technique called Sparse Adversarial and
Interpretable Attack Framework (SAIF). Specifically, we design imperceptible
attacks that contain low-magnitude perturbations at a small number of pixels
and leverage these sparse attacks to reveal the vulnerability of classifiers.
We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously
optimize the attack perturbations for bounded magnitude and sparsity with
$O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly
imperceptible and interpretable adversarial examples, and outperforms
state-of-the-art sparse attack methods on the ImageNet dataset.
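The optimization described above lends itself to a compact conditional-gradient loop. Below is a minimal, hypothetical PyTorch sketch of one such Frank-Wolfe attack for a single image: the linear minimization oracle places $\pm\epsilon$ on the $k$ pixels with the largest gradient magnitude, keeping every iterate inside the $\ell_\infty$ ball and a sparsity budget. This is a simplification of SAIF (the paper factorizes the perturbation into a magnitude term and a sparsity mask and optimizes both); the model, eps, k, and step schedule are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def frank_wolfe_sparse_attack(model, x, y, eps=0.1, k=100, steps=50):
    """Hypothetical Frank-Wolfe (conditional gradient) attack sketch.

    x: single image of shape (1, C, H, W) in [0, 1]; y: true label.
    Each LMO step is k-sparse and l_inf-bounded by eps; note that the
    support can still accumulate across iterations, unlike SAIF's
    explicit sparsity-mask formulation.
    """
    model.eval()
    delta = torch.zeros_like(x, requires_grad=True)
    for t in range(steps):
        loss = F.cross_entropy(model(x + delta), y)   # untargeted: maximize the loss
        grad, = torch.autograd.grad(loss, delta)

        # Linear minimization oracle over {d : ||d||_inf <= eps, at most k nonzeros}:
        # put +/- eps on the k coordinates with the largest |gradient|.
        flat = grad.flatten()
        idx = flat.abs().topk(k).indices
        s = torch.zeros_like(flat)
        s[idx] = eps * flat[idx].sign()
        s = s.view_as(delta)

        gamma = 2.0 / (t + 2.0)                       # standard Frank-Wolfe step size
        delta = ((1 - gamma) * delta + gamma * s).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```

Unlike projected-gradient attacks, each update is projection-free: it is a convex combination of the current perturbation and an extreme point of the constraint set.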
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with $>99\%$ detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
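As a rough illustration of this detection setting, here is a hypothetical stateful-detection sketch: query-based attacks issue many near-duplicate queries, so a new query whose embedding is very close to a recently seen one is flagged. ACPT obtains the embeddings from an adversarially prompt-tuned CLIP image encoder; the generic `encoder`, history length, and threshold below are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

class QueryAttackDetector:
    """Sketch of stateful detection of query-based attacks.

    `encoder` is any frozen image encoder returning one embedding per image
    (ACPT fine-tunes a CLIP image encoder; here it is just a placeholder).
    """
    def __init__(self, encoder, history=200, threshold=0.95):
        self.encoder = encoder
        self.history = history        # number of past query embeddings to keep
        self.threshold = threshold    # cosine-similarity alarm threshold (assumed value)
        self.bank = []

    @torch.no_grad()
    def check(self, image):
        z = F.normalize(self.encoder(image).flatten(1), dim=1)   # (1, d)
        flagged = False
        if self.bank:
            sims = torch.cat(self.bank) @ z.t()                  # cosine similarities to past queries
            flagged = bool(sims.max() > self.threshold)          # near-duplicate query => likely attack
        self.bank.append(z)
        self.bank = self.bank[-self.history:]
        return flagged
```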
- IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks [16.577595936609665]
We introduce a novel approach to counter adversarial attacks, namely, image resampling.
Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation.
We show that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
arXiv Detail & Related papers (2023-10-18T11:19:32Z)
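A minimal sketch of the resampling idea, assuming a simple randomly jittered bilinear sampling grid; IRAD itself reconstructs the image from an implicit continuous representation and predicts the per-pixel offsets, which is not reproduced here.

```python
import torch
import torch.nn.functional as F

def resample(image, jitter=0.01):
    """Sketch of an image-resampling transform: re-render the image by
    bilinear sampling on a slightly perturbed coordinate grid.
    `image` has shape (N, C, H, W); `jitter` is an assumed offset scale.
    """
    n, c, h, w = image.shape
    # identity sampling grid in normalized [-1, 1] coordinates
    theta = torch.eye(2, 3).unsqueeze(0).repeat(n, 1, 1)
    grid = F.affine_grid(theta, size=(n, c, h, w), align_corners=False)
    grid = grid + jitter * torch.randn_like(grid)     # geometric perturbation of sample points
    return F.grid_sample(image, grid, mode="bilinear",
                         padding_mode="border", align_corners=False)
```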
- Meta Adversarial Perturbations [66.43754467275967]
We show the existence of a meta adversarial perturbation (MAP).
MAP causes natural images to be misclassified with high probability after being updated through only a one-step gradient ascent update.
We show that these perturbations are not only image-agnostic, but also model-agnostic, as a single perturbation generalizes well across unseen data points and different neural network architectures.
arXiv Detail & Related papers (2021-11-19T16:01:45Z)
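A short sketch, under assumed step sizes, of how such a meta perturbation could be used at test time: the pre-trained perturbation is adapted to the new image with a single gradient-ascent step before checking whether the prediction flips.

```python
import torch
import torch.nn.functional as F

def one_step_adapted_attack(model, x, y, v, alpha=2/255, eps=8/255):
    """Adapt a pre-trained meta perturbation `v` (image- and model-agnostic)
    with one FGSM-style step, as described in the abstract above.
    alpha and eps are assumed values for illustration.
    """
    model.eval()
    delta = v.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps)   # one-step update + l_inf clipping
    return (x + delta).clamp(0, 1).detach()
```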
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
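A toy sketch of the generative-attack idea: a feed-forward encoder-decoder maps the input image directly to a bounded perturbation, so crafting an adversarial example needs only one forward pass and no discriminator. The architecture and bound are illustrative assumptions, not the SSAE design (which adds a symmetric saliency branch and tailored losses).

```python
import torch
import torch.nn as nn

class PerturbationAutoEncoder(nn.Module):
    """Toy stand-in for an auto-encoder that emits a bounded adversarial
    perturbation for an input image in [0, 1] (hypothetical architecture)."""
    def __init__(self, eps=8/255):
        super().__init__()
        self.eps = eps
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        delta = self.eps * self.decoder(self.encoder(x))   # perturbation bounded by eps
        return (x + delta).clamp(0, 1)
```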
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by only perturbing a few pixels.
Recent efforts combine the sparsity constraint with an additional $\ell_\infty$ bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Transferable Sparse Adversarial Attack [62.134905824604104]
We introduce a generator architecture to alleviate the overfitting issue and thus efficiently craft transferable sparse adversarial examples.
Our method achieves superior inference speed, 700$\times$ faster than other optimization-based methods.
arXiv Detail & Related papers (2021-05-31T06:44:58Z)
- Error Diffusion Halftoning Against Adversarial Examples [85.11649974840758]
Adversarial examples contain carefully crafted perturbations that can fool deep neural networks into making wrong predictions.
We propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples.
arXiv Detail & Related papers (2021-01-23T07:55:02Z)
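A minimal NumPy sketch of error-diffusion halftoning using the classic Floyd-Steinberg kernel as the input transformation; the paper's specific halftoning scheme and its combination with adversarial training are not reproduced.

```python
import numpy as np

def floyd_steinberg_halftone(gray):
    """Floyd-Steinberg error-diffusion halftoning of a 2-D float array in [0, 1]."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0           # binarize the pixel
            img[y, x] = new
            err = old - new                            # diffuse the quantization error
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return img
```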
- Blurring Fools the Network -- Adversarial Attacks by Feature Peak Suppression and Gaussian Blurring [7.540176446791261]
We propose an adversarial attack named peak suppression (PS), which suppresses the values of peak elements in the features of the data.
Experiment results show that PS and well-designed Gaussian blurring can form adversarial attacks that completely change the classification results of a well-trained target network.
arXiv Detail & Related papers (2020-12-21T15:47:14Z)
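A small sketch of the Gaussian-blurring half of this attack, assuming a black-box `predict` callable that returns a class id: blur with increasing strength until the label flips. The peak-suppression component is omitted and the sigma schedule is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_until_misclassified(predict, image, label, sigmas=(0.5, 1.0, 1.5, 2.0, 3.0)):
    """Progressively blur an HxWx3 image and stop when the classifier's
    prediction no longer matches `label`."""
    for sigma in sigmas:
        blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))  # blur spatial dims only
        if predict(blurred) != label:
            return blurred, sigma                                   # attack succeeded
    return None, None
```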
- Learning to Attack with Fewer Pixels: A Probabilistic Post-hoc Framework for Refining Arbitrary Dense Adversarial Attacks [21.349059923635515]
Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks.
We propose a probabilistic post-hoc framework that refines given dense attacks by significantly reducing the number of perturbed pixels.
Our framework performs adversarial attacks much faster than existing sparse attacks.
arXiv Detail & Related papers (2020-10-13T02:51:10Z)
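A sketch of post-hoc sparsification with a simple magnitude heuristic: keep only the k most strongly perturbed pixels of the dense adversarial example and verify that the attack still succeeds. The paper instead learns a probabilistic distribution over which pixels to keep, so the heuristic, `predict` callable, and budgets below are assumptions.

```python
import numpy as np

def refine_dense_attack(predict, x, x_adv, label, budgets=(4096, 1024, 256, 64)):
    """Keep only the k largest-magnitude perturbed pixels of a dense
    adversarial example x_adv (HxWx3) and check the attack still flips
    the classifier away from `label`."""
    delta = x_adv - x
    mag = np.abs(delta).sum(axis=-1)                     # per-pixel perturbation magnitude (H, W)
    order = np.argsort(mag, axis=None)[::-1]             # pixel indices sorted by importance
    best = x_adv
    for k in budgets:                                    # decreasing pixel budgets
        mask = np.zeros(mag.size, dtype=bool)
        mask[order[:k]] = True
        mask = mask.reshape(mag.shape)[..., None]         # broadcast over channels
        candidate = np.where(mask, x_adv, x)
        if predict(candidate) != label:                   # still adversarial with fewer pixels
            best = candidate
        else:
            break
    return best
```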
- TensorShield: Tensor-based Defense Against Adversarial Attacks on Images [7.080154188969453]
Recent studies have demonstrated that machine learning approaches like deep neural networks (DNNs) are easily fooled by adversarial attacks.
In this paper, we utilize tensor decomposition techniques as a preprocessing step to find a low-rank approximation of images which can significantly discard high-frequency perturbations.
arXiv Detail & Related papers (2020-02-18T00:39:49Z)
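A minimal sketch of low-rank preprocessing via per-channel truncated SVD; TensorShield uses tensor decompositions (e.g., Tucker or CP) over the whole image tensor, so the per-channel SVD here is only an illustrative stand-in.

```python
import numpy as np

def low_rank_approx(image, rank=32):
    """Rank-truncated reconstruction of each color channel of an HxWx3
    float image in [0, 1]; high-frequency components, which typically
    carry the adversarial perturbation, are discarded."""
    out = np.empty_like(image)
    for c in range(image.shape[-1]):
        u, s, vt = np.linalg.svd(image[..., c], full_matrices=False)
        out[..., c] = (u[:, :rank] * s[:rank]) @ vt[:rank]   # keep the top `rank` singular values
    return np.clip(out, 0.0, 1.0)
```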
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.