On the Limitations of Stochastic Pre-processing Defenses
- URL: http://arxiv.org/abs/2206.09491v1
- Date: Sun, 19 Jun 2022 21:54:42 GMT
- Title: On the Limitations of Stochastic Pre-processing Defenses
- Authors: Yue Gao, Ilia Shumailov, Kassem Fawaz, Nicolas Papernot
- Abstract summary: Defending against adversarial examples remains an open problem.
A common belief is that randomness at inference increases the cost of finding adversarial inputs.
In this paper, we investigate such pre-processing defenses and demonstrate that they are flawed.
- Score: 42.80542472276451
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Defending against adversarial examples remains an open problem. A common
belief is that randomness at inference increases the cost of finding
adversarial inputs. An example of such a defense is to apply a random
transformation to inputs prior to feeding them to the model. In this paper, we
empirically and theoretically investigate such stochastic pre-processing
defenses and demonstrate that they are flawed. First, we show that most
stochastic defenses are weaker than previously thought; they lack sufficient
randomness to withstand even standard attacks like projected gradient descent.
This casts doubt on a long-held assumption that stochastic defenses invalidate
attacks designed to evade deterministic defenses and force attackers to
integrate the Expectation over Transformation (EOT) concept. Second, we show
that stochastic defenses confront a trade-off between adversarial robustness
and model invariance; they become less effective as the defended model acquires
more invariance to their randomization. Future work will need to decouple these
two effects. Our code is available in the supplementary material.
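For concreteness, below is a minimal sketch of the attack setting the abstract refers to: projected gradient descent (PGD) combined with Expectation over Transformation (EOT), run against a toy stochastic pre-processing defense. The classifier, the random-shift pre-processor, and all hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
# Hedged sketch (not the authors' code): EOT-PGD against a stochastic pre-processor.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Placeholder model standing in for a defended classifier.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

def random_preprocess(x):
    """Stochastic pre-processing: a random spatial shift via reflect-pad + crop."""
    padded = F.pad(x, (2, 2, 2, 2), mode="reflect")
    i, j = torch.randint(0, 5, (2,)).tolist()
    return padded[..., i:i + 32, j:j + 32]

def eot_pgd(x, y, steps=10, eps=8 / 255, alpha=2 / 255, eot_samples=8):
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # EOT: average the loss over several draws of the random transformation.
        # With eot_samples=1 this reduces to standard PGD, which the paper finds
        # is often already enough to break these defenses.
        loss = torch.stack([
            F.cross_entropy(model(random_preprocess(x_adv)), y)
            for _ in range(eot_samples)
        ]).mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)  # project to the L-inf ball
    return x_adv.detach()

x, y = torch.rand(1, 3, 32, 32), torch.tensor([3])
x_adv = eot_pgd(x, y)
```

Averaging the loss over several draws of the transformation reduces the variance of the gradient estimate, which is the usual motivation for EOT.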
Related papers
- Distributional Adversarial Loss [15.258476329309044]
A major challenge in defending against adversarial attacks is the enormous space of possible attacks that even a simple adversary might perform.
These include randomized smoothing methods, which add noise to the input to reduce the adversary's impact.
Another approach is input discretization, which limits the number of actions available to the adversary.
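As a rough illustration of the randomized-smoothing idea mentioned above, the hedged sketch below classifies many noisy copies of one input and takes a majority vote; the base classifier, noise level, and sample count are placeholder assumptions, not this paper's construction.

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n_samples=100):
    """Classify many noisy copies of a single input and take a majority vote."""
    with torch.no_grad():
        noisy = x.unsqueeze(0) + sigma * torch.randn(n_samples, *x.shape)
        votes = model(noisy).argmax(dim=1)
    return torch.bincount(votes).argmax().item()   # most frequent predicted class

# Usage with a placeholder classifier:
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
print(smoothed_predict(model, torch.rand(3, 32, 32)))
```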
arXiv Detail & Related papers (2024-06-05T17:03:47Z) - Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
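A hedged sketch of that sampling step: rank training samples by the model's top-class confidence and return the least-confident ones as poisoning candidates. The model and inputs are placeholders rather than the paper's setup.

```python
import torch

def select_low_confidence(model, xs, k=100):
    """Return indices of the k samples the model is least confident about."""
    with torch.no_grad():
        probs = torch.softmax(model(xs), dim=1)
        confidence = probs.max(dim=1).values   # top-class probability per sample
    return torch.argsort(confidence)[:k]       # candidate samples to poison
```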
arXiv Detail & Related papers (2023-10-08T18:57:36Z) - Hindering Adversarial Attacks with Multiple Encrypted Patch Embeddings [13.604830818397629]
We propose a new key-based defense focusing on both efficiency and robustness.
We build upon the previous defense with two major improvements: (1) efficient training and (2) optional randomization.
Experiments were carried out on the ImageNet dataset, and the proposed defense was evaluated against an arsenal of state-of-the-art attacks.
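The mechanism is not spelled out above, so the following is only an illustrative guess at what a key-based input transformation can look like: block-wise shuffling of image patches with a permutation derived from a secret key. It is not necessarily this paper's scheme.

```python
import torch

def keyed_patch_shuffle(x, key=1234, patch=8):
    """Shuffle non-overlapping patches of an image (C, H, W) with a secret-key permutation."""
    c, h, w = x.shape
    patches = x.unfold(1, patch, patch).unfold(2, patch, patch)    # (c, h/p, w/p, p, p)
    patches = patches.reshape(c, -1, patch, patch)
    gen = torch.Generator().manual_seed(key)                       # the "key" seeds the permutation
    perm = torch.randperm(patches.shape[1], generator=gen)
    shuffled = patches[:, perm].reshape(c, h // patch, w // patch, patch, patch)
    return shuffled.permute(0, 1, 3, 2, 4).reshape(c, h, w)        # reassemble into an image
```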
arXiv Detail & Related papers (2023-09-04T14:08:34Z) - Defending against Insertion-based Textual Backdoor Attacks via Attribution [18.935041122443675]
We propose AttDef, an efficient attribution-based pipeline to defend against two insertion-based poisoning attacks.
Specifically, we regard tokens with larger attribution scores as potential triggers, since such tokens contribute more to the false prediction.
We show that our proposed method can generalize sufficiently well in two common attack scenarios.
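A simplified, hedged sketch of that attribution step: score each token by a leave-one-out attribution (the drop in the predicted-class probability when the token is removed) and mask the highest-scoring tokens as suspected triggers. AttDef's actual attribution method may differ, and the `classify` interface below is assumed for illustration.

```python
import torch

def mask_suspected_triggers(classify, tokens, top_k=2, mask_token="[MASK]"):
    """classify(list_of_tokens) -> probability vector over classes (assumed interface)."""
    with torch.no_grad():
        base = classify(tokens)
        pred = base.argmax().item()
        scores = []
        for i in range(len(tokens)):
            reduced = tokens[:i] + tokens[i + 1:]
            scores.append(base[pred] - classify(reduced)[pred])   # attribution of token i
    suspects = torch.stack(scores).topk(top_k).indices.tolist()
    return [mask_token if i in suspects else t for i, t in enumerate(tokens)]
```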
arXiv Detail & Related papers (2023-05-03T19:29:26Z) - Rethinking Textual Adversarial Defense for Pre-trained Language Models [79.18455635071817]
A literature review shows that pre-trained language models (PrLMs) are vulnerable to adversarial attacks.
We propose a novel metric (Degree of Anomaly) to enable current adversarial attack approaches to generate more natural and imperceptible adversarial examples.
We show that our universal defense framework achieves comparable or even higher after-attack accuracy than other attack-specific defenses.
arXiv Detail & Related papers (2022-07-21T07:51:45Z) - Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models [8.189696720657247]
Adversarial examples pose a threat to deep neural network models in a variety of scenarios.
We explore the use of output randomization as a defense against attacks in both the black box and white box models.
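A minimal sketch of output randomization in this spirit: perturb the model's output probabilities at inference so that the scores an attacker observes are noisy. The noise scale and re-normalization are assumptions for illustration.

```python
import torch

def randomized_output(model, x, sigma=0.05):
    """Return noisy, re-normalized class probabilities for input batch x."""
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
        noisy = (probs + sigma * torch.randn_like(probs)).clamp_min(0)
        return noisy / noisy.sum(dim=1, keepdim=True)   # keep a valid probability distribution
```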
arXiv Detail & Related papers (2021-07-08T12:27:19Z) - A Game Theoretic Analysis of Additive Adversarial Attacks and Defenses [4.94950858749529]
We propose a game-theoretic framework for studying attacks and defenses which exist in equilibrium.
We show how this equilibrium defense can be approximated given finitely many samples from a data-generating distribution.
arXiv Detail & Related papers (2020-09-14T15:51:15Z) - Adversarial Example Games [51.92698856933169]
Adversarial Example Games (AEG) is a framework that models the crafting of adversarial examples.
AEG provides a new way to design adversarial examples by adversarially training a generator and a classifier from a given hypothesis class.
We demonstrate the efficacy of AEG on the MNIST and CIFAR-10 datasets.
arXiv Detail & Related papers (2020-07-01T19:47:23Z) - Reliable evaluation of adversarial robustness with an ensemble of
diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
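A hedged sketch of this evaluation protocol: an input counts as robust only if it withstands every attack in the ensemble, so the reported robustness reflects the worst case over all attacks. The attack callables below are placeholders for the paper's parameter-free attacks.

```python
import torch

def robust_accuracy(model, attacks, xs, ys):
    """Fraction of inputs that survive every attack in the ensemble."""
    robust = torch.ones(len(xs), dtype=torch.bool)
    for attack in attacks:                               # placeholder attack callables
        x_adv = attack(model, xs, ys)
        with torch.no_grad():
            robust &= model(x_adv).argmax(dim=1) == ys   # must survive this attack too
    return robust.float().mean().item()
```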
arXiv Detail & Related papers (2020-03-03T18:15:55Z) - Defensive Few-shot Learning [77.82113573388133]
This paper investigates a new challenging problem called defensive few-shot learning.
It aims to learn a few-shot model that is robust against adversarial attacks.
The proposed framework can effectively make the existing few-shot models robust against adversarial attacks.
arXiv Detail & Related papers (2019-11-16T05:57:16Z)