Randomness in ML Defenses Helps Persistent Attackers and Hinders
Evaluators
- URL: http://arxiv.org/abs/2302.13464v1
- Date: Mon, 27 Feb 2023 01:33:31 GMT
- Title: Randomness in ML Defenses Helps Persistent Attackers and Hinders
Evaluators
- Authors: Keane Lucas, Matthew Jagielski, Florian Tramèr, Lujo Bauer, Nicholas Carlini
- Abstract summary: It is becoming increasingly imperative to design robust ML defenses.
Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary.
We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
- Score: 49.52538232104449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is becoming increasingly imperative to design robust ML defenses. However,
recent work has found that many defenses that initially resist state-of-the-art
attacks can be broken by an adaptive adversary. In this work we take steps to
simplify the design of defenses and argue that white-box defenses should eschew
randomness when possible. We begin by illustrating a new issue with the
deployment of randomized defenses that reduces their security compared to their
deterministic counterparts. We then provide evidence that making defenses
deterministic simplifies robustness evaluation, without reducing the
effectiveness of a truly robust defense. Finally, we introduce a new defense
evaluation framework that leverages a defense's deterministic nature to better
evaluate its adversarial robustness.
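To make the issue concrete: a defense that randomizes its decision per query can report a respectable detection rate under a one-shot evaluation, yet an attacker who simply resubmits the same input eventually gets through. The sketch below is illustrative only and assumes a hypothetical noise-based detector (the threshold, noise scale, and query budget are not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_detector(x, threshold=1.2):
    """Hypothetical randomized defense: add Gaussian noise before a
    norm-based anomaly score, so the same input is sometimes flagged
    and sometimes not."""
    noisy = x + rng.normal(scale=0.5, size=x.shape)
    return np.linalg.norm(noisy) > threshold        # True = reject the query

# A borderline adversarial input that sits near the score's threshold.
x_adv = np.full(4, 0.5)

# Detection rate as seen by a one-shot (non-persistent) evaluation.
detect_rate = np.mean([randomized_detector(x_adv) for _ in range(10_000)])

# A persistent attacker just resubmits the same input until it gets through.
k = 20
print(f"single-query bypass probability: {1 - detect_rate:.3f}")
print(f"bypass probability within {k} resubmissions: {1 - detect_rate ** k:.3f}")
```

Because the per-query bypass probability compounds across resubmissions, the effective security against a persistent attacker is far lower than the single-query detection rate suggests; a deterministic defense would either always reject this input or always accept it.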
Related papers
- Position: Towards Resilience Against Adversarial Examples [42.09231029292568]
We provide a definition of adversarial resilience and outline considerations of designing an adversarially resilient defense.
We then introduce a subproblem of adversarial resilience which we call continual adaptive robustness.
We demonstrate the connection between continual adaptive robustness and previously studied problems of multiattack robustness and unforeseen attack robustness.
arXiv Detail & Related papers (2024-05-02T14:58:44Z) - Hindering Adversarial Attacks with Multiple Encrypted Patch Embeddings [13.604830818397629]
We propose a new key-based defense focusing on both efficiency and robustness.
We build upon the previous defense with two major improvements: (1) efficient training and (2) optional randomization.
Experiments were carried out on the ImageNet dataset, and the proposed defense was evaluated against an arsenal of state-of-the-art attacks.
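To illustrate what a key-based defense of this kind can look like, here is a minimal sketch assuming a block-wise patch permutation seeded by a secret key; the function name, patch size, and permutation scheme are hypothetical and not the paper's exact method. Keeping the key fixed gives a deterministic transform, while re-drawing it per query corresponds to the optional randomization mentioned above.

```python
import numpy as np

def key_based_patch_shuffle(img, key, patch=16):
    """Hypothetical key-based transform: split the image into non-overlapping
    patches and permute them with a permutation derived from a secret key.
    A model trained on shuffled images forces an attacker without the key to
    optimize against the wrong view of the input."""
    rng = np.random.default_rng(key)            # the secret key seeds the permutation
    h, w, c = img.shape
    gh, gw = h // patch, w // patch
    patches = (img[:gh * patch, :gw * patch]
               .reshape(gh, patch, gw, patch, c)
               .transpose(0, 2, 1, 3, 4)
               .reshape(gh * gw, patch, patch, c))
    perm = rng.permutation(gh * gw)
    shuffled = patches[perm].reshape(gh, gw, patch, patch, c)
    return shuffled.transpose(0, 2, 1, 3, 4).reshape(gh * patch, gw * patch, c)

img = np.random.rand(224, 224, 3).astype(np.float32)   # ImageNet-sized dummy input
protected = key_based_patch_shuffle(img, key=1234)      # fixed key = deterministic
print(protected.shape)                                   # (224, 224, 3)
```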
arXiv Detail & Related papers (2023-09-04T14:08:34Z) - Increasing Confidence in Adversarial Robustness Evaluations [53.2174171468716]
We propose a test to identify weak attacks and thus weak defense evaluations.
Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample.
For eleven out of thirteen previously published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it.
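The spirit of such a test can be illustrated with a linear model, for which the existence of an adversarial example inside the budget is provable in closed form: any sound attack must find it, and an attack that does not is flagged as weak. The sketch below uses this simplified linear construction, not the paper's actual network modification:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear binary classifier: predict sign(w @ x).
w = rng.normal(size=50)
eps = 0.1

def margin(x):
    return w @ x

# Within an L_inf ball of radius eps the margin can move by at most
# eps * ||w||_1, so an adversarial example provably exists whenever
# |margin(x)| < eps * ||w||_1. We construct such a point deliberately.
x = 0.5 * eps * np.sign(w)                  # margin(x) = 0.5 * eps * ||w||_1
assert abs(margin(x)) < eps * np.abs(w).sum()

def weak_attack(x):
    # Deliberately weak: correct direction, but only 10% of the budget.
    return x - 0.1 * eps * np.sign(w) * np.sign(margin(x))

def strong_attack(x):
    # Optimal for a linear model: full-budget step against the margin.
    return x - eps * np.sign(w) * np.sign(margin(x))

for name, attack in [("weak", weak_attack), ("strong", strong_attack)]:
    flipped = np.sign(margin(attack(x))) != np.sign(margin(x))
    print(f"{name:>6} attack finds the guaranteed adversarial example: {flipped}")
```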
arXiv Detail & Related papers (2022-06-28T13:28:13Z) - Evaluating the Adversarial Robustness of Adaptive Test-time Defenses [60.55448652445904]
We categorize such adaptive test-time defenses and explain their potential benefits and drawbacks.
Unfortunately, none significantly improve upon static models when evaluated appropriately.
Some even weaken the underlying static model while simultaneously increasing inference cost.
arXiv Detail & Related papers (2022-02-28T12:11:40Z) - Can Adversarial Training Be Manipulated By Non-Robust Features? [64.73107315313251]
Adversarial training, originally designed to resist test-time adversarial examples, has been shown to be promising in mitigating training-time availability attacks.
We identify a novel threat model named stability attacks, which aims to hinder robust availability by slightly perturbing the training data.
Under this threat model, we find that adversarial training using a conventional defense budget $\epsilon$ provably fails to provide test robustness in a simple statistical setting.
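For context, "adversarial training using a conventional defense budget" refers to min-max training in which the inner perturbations are confined to an L_inf ball of radius $\epsilon$. The sketch below shows that baseline procedure on a toy logistic model with analytic gradients (the FGSM inner step is optimal for linear models); it illustrates the defense being attacked, not the stability attack itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary task with labels in {-1, +1} and a linear (logistic) model,
# so both the weight and per-example input gradients are analytic.
d, n = 20, 256
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n))

eps, lr, steps = 0.1, 0.1, 200           # eps: the conventional defense budget
w = np.zeros(d)

def grads(w, X, y):
    """Gradients of the logistic loss log(1 + exp(-y * Xw)) w.r.t. w and X."""
    s = -y / (1.0 + np.exp(y * (X @ w)))           # per-example dloss/d(Xw)
    return (s[:, None] * X).mean(axis=0), s[:, None] * w[None, :]

for _ in range(steps):
    # Inner maximization: worst-case L_inf perturbation within the budget.
    # For a linear model a single signed-gradient (FGSM) step is optimal.
    _, gx = grads(w, X, y)
    X_adv = X + eps * np.sign(gx)
    # Outer minimization: gradient step on the adversarial examples.
    gw, _ = grads(w, X_adv, y)
    w -= lr * gw

# Robust accuracy on worst-case perturbations of the training set.
_, gx = grads(w, X, y)
acc = np.mean(np.sign((X + eps * np.sign(gx)) @ w) == y)
print(f"robust training accuracy at eps={eps}: {acc:.2f}")
```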
arXiv Detail & Related papers (2022-01-31T16:25:25Z) - MAD-VAE: Manifold Awareness Defense Variational Autoencoder [0.0]
We introduce several methods to improve the robustness of defense models.
Extensive experiments on the MNIST dataset demonstrate the effectiveness of our algorithms.
We also discuss the applicability of existing adversarial latent-space attacks, which may have a significant flaw.
arXiv Detail & Related papers (2020-10-31T09:04:25Z) - A Game Theoretic Analysis of Additive Adversarial Attacks and Defenses [4.94950858749529]
We propose a game-theoretic framework for studying attacks and defenses which exist in equilibrium.
We show how this equilibrium defense can be approximated given finitely many samples from a data-generating distribution.
arXiv Detail & Related papers (2020-09-14T15:51:15Z) - Harnessing adversarial examples with a surprisingly simple defense [47.64219291655723]
I introduce a very simple method to defend against adversarial examples.
The basic idea is to raise the slope of the ReLU function at test time.
Experiments over MNIST and CIFAR-10 datasets demonstrate the effectiveness of the proposed defense.
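A minimal sketch of the idea, assuming a tiny fully connected network (the architecture and slope value are illustrative, not from the paper): training uses the standard ReLU, and the defense only swaps in a steeper activation slope * max(x, 0) with slope > 1 at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x, slope=1.0):
    """Standard ReLU when slope=1.0; the defense uses slope > 1 at test time."""
    return slope * np.maximum(x, 0.0)

def mlp_forward(x, params, slope=1.0):
    """Tiny two-layer network; only the hidden activation's slope is changed."""
    W1, b1, W2, b2 = params
    return relu(x @ W1 + b1, slope=slope) @ W2 + b2

params = (rng.normal(size=(784, 64)), rng.normal(size=64),
          rng.normal(size=(64, 10)), rng.normal(size=10))
x = rng.normal(size=784)

logits_train = mlp_forward(x, params, slope=1.0)   # slope used during training
logits_test = mlp_forward(x, params, slope=5.0)    # steeper slope at test time
print(np.argmax(logits_train), np.argmax(logits_test))
```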
arXiv Detail & Related papers (2020-04-26T03:09:42Z) - Reliable evaluation of adversarial robustness with an ensemble of
diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack that overcome failures due to a suboptimal step size and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
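A simplified sketch of the step-size fix in isolation (the halving schedule and the toy logistic objective are assumptions, not the paper's exact algorithm): the attack shrinks its step whenever the best loss stops improving and always returns the best iterate found, which avoids the failure mode of a fixed, poorly chosen step size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy differentiable target: logistic loss of a fixed linear model.
w = rng.normal(size=30)
x0, y = rng.normal(size=30), 1.0
eps = 0.25

def loss_and_grad(x):
    z = y * (w @ x)
    return np.log1p(np.exp(-z)), -y * w / (1.0 + np.exp(z))

def pgd_adaptive(x0, steps=50):
    """L_inf PGD that halves its step size whenever the loss stops improving,
    a simplified stand-in for the step-size fix described above."""
    x, alpha = x0.copy(), eps / 4
    best_loss, best_x = -np.inf, x0.copy()
    for _ in range(steps):
        loss, g = loss_and_grad(x)
        if loss > best_loss:
            best_loss, best_x = loss, x.copy()
        else:
            alpha *= 0.5                         # shrink the step when stuck
        # Ascent step followed by projection back onto the L_inf ball.
        x = np.clip(x + alpha * np.sign(g), x0 - eps, x0 + eps)
    return best_x, best_loss

x_adv, adv_loss = pgd_adaptive(x0)
print(f"clean loss {loss_and_grad(x0)[0]:.3f} -> adversarial loss {adv_loss:.3f}")
```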
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.