Fighting Gradients with Gradients: Dynamic Defenses against Adversarial
Attacks
- URL: http://arxiv.org/abs/2105.08714v1
- Date: Tue, 18 May 2021 17:55:07 GMT
- Title: Fighting Gradients with Gradients: Dynamic Defenses against Adversarial
Attacks
- Authors: Dequan Wang, An Ju, Evan Shelhamer, David Wagner, Trevor Darrell
- Abstract summary: We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent).
Dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet.
- Score: 72.59081183040682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks optimize against models to defeat defenses. Existing
defenses are static, and stay the same once trained, even while attacks change.
We argue that models should fight back, and optimize their defenses against
attacks at test time. We propose dynamic defenses, to adapt the model and input
during testing, by defensive entropy minimization (dent). Dent alters testing,
but not training, for compatibility with existing models and train-time
defenses. Dent improves the robustness of adversarially-trained defenses and
nominally-trained models against white-box, black-box, and adaptive attacks on
CIFAR-10/100 and ImageNet. In particular, dent boosts state-of-the-art defenses
by 20+ points absolute against AutoAttack on CIFAR-10 at $\epsilon_\infty$ =
8/255.
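The core idea in the abstract, minimizing prediction entropy at test time, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a hypothetical fixed linear softmax classifier (`weights`, `bias`) and adapts only an additive input offset `delta` by gradient descent on the Shannon entropy of the prediction. All function names, the step size, and the step count are illustrative assumptions; the abstract also mentions adapting the model itself, which this sketch leaves out.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy H(p) = -sum_k p_k log p_k (natural log).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def logits_for(weights, bias, x, delta):
    # Hypothetical toy model: z = W (x + delta) + b.
    return [sum(w * (xi + di) for w, xi, di in zip(row, x, delta)) + b
            for row, b in zip(weights, bias)]

def entropy_grad_wrt_delta(weights, bias, x, delta):
    probs = softmax(logits_for(weights, bias, x, delta))
    h = entropy(probs)
    # Gradient of entropy w.r.t. logits: dH/dz_k = -p_k * (log p_k + H).
    dz = [-p * (math.log(max(p, 1e-12)) + h) for p in probs]
    # Chain rule through the linear layer: dH/ddelta_j = sum_k dz_k * W[k][j].
    return [sum(dz[k] * weights[k][j] for k in range(len(weights)))
            for j in range(len(x))]

def adapt_input(weights, bias, x, steps=100, lr=0.2):
    # Test-time update of the input offset only; the model stays frozen.
    # steps and lr are assumed values chosen for this toy example.
    delta = [0.0] * len(x)
    for _ in range(steps):
        grad = entropy_grad_wrt_delta(weights, bias, x, delta)
        delta = [d - lr * g for d, g in zip(delta, grad)]
    return delta

if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    bias = [0.0, 0.0, 0.0]
    x = [0.3, 0.2]
    before = entropy(softmax(logits_for(weights, bias, x, [0.0, 0.0])))
    delta = adapt_input(weights, bias, x)
    after = entropy(softmax(logits_for(weights, bias, x, delta)))
    print("entropy before:", before, "after:", after)
```

Because the update runs per test input and needs no labels, a scheme of this kind changes only inference and leaves training untouched, which matches the abstract's claim of compatibility with existing models.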
Related papers
- Gradient Masking All-at-Once: Ensemble Everything Everywhere Is Not Robust [65.95797963483729]
Ensemble everything everywhere is a defense to adversarial examples.
We show that this defense is not robust to adversarial attack.
We then use standard adaptive attack techniques to reduce the defense's robust accuracy.
arXiv Detail & Related papers (2024-11-22T10:17:32Z)
- Versatile Defense Against Adversarial Attacks on Image Recognition [2.9980620769521513]
Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
arXiv Detail & Related papers (2024-03-13T01:48:01Z)
- PubDef: Defending Against Transfer Attacks From Public Models [6.0012551318569285]
We propose a new practical threat model where the adversary relies on transfer attacks through publicly available surrogate models.
We evaluate the transfer attacks in this setting and propose a specialized defense method based on a game-theoretic perspective.
Under this threat model, our defense, PubDef, outperforms the state-of-the-art white-box adversarial training by a large margin with almost no loss in the normal accuracy.
arXiv Detail & Related papers (2023-10-26T17:58:08Z)
- Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks [0.548924822963045]
Model stealing attacks can lead to intellectual property theft and other security and privacy risks.
Current state-of-the-art defenses against model stealing attacks suggest adding perturbations to the prediction probabilities.
We propose a simple yet effective and efficient defense alternative.
arXiv Detail & Related papers (2023-09-04T22:25:49Z)
- The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks [91.56314751983133]
$A5$ is a framework to craft a defensive perturbation to guarantee that any attack towards the input in hand will fail.
We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label.
We also show how to apply $A5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z)
- Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators [49.52538232104449]
It is becoming increasingly imperative to design robust ML defenses.
Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary.
We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
arXiv Detail & Related papers (2023-02-27T01:33:31Z)
- Critical Checkpoints for Evaluating Defence Models Against Adversarial Attack and Robustness [0.0]
Some common flaws have been noticed in past defence models that were broken in a very short time.
In this paper, we suggest a few checkpoints that should be taken into consideration while building and evaluating the soundness of defence models.
arXiv Detail & Related papers (2022-02-18T06:15:49Z)
- What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors [57.040948169155925]
We extend the adversarial training framework to defend against (training-time) poisoning and backdoor attacks.
Our method desensitizes networks to the effects of poisoning by creating poisons during training and injecting them into training batches.
We show that this defense withstands adaptive attacks, generalizes to diverse threat models, and incurs a better performance trade-off than previous defenses.
arXiv Detail & Related papers (2021-02-26T17:54:36Z)
- Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples [11.117775891953018]
We expand upon the analysis of these defenses to include adaptive blackbox attacks.
Our investigation is done using two blackbox adversarial models and six widely studied adversarial attacks for the CIFAR-10 and Fashion-MNIST datasets.
Our results paint a clear picture: defenses need both thorough white-box and blackbox analyses to be considered secure.
arXiv Detail & Related papers (2020-06-18T22:29:12Z)
- Certified Defenses for Adversarial Patches [72.65524549598126]
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
This paper studies certified and empirical defenses against patch attacks.
arXiv Detail & Related papers (2020-03-14T19:57:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.