Beware the Black-Box: on the Robustness of Recent Defenses to
Adversarial Examples
- URL: http://arxiv.org/abs/2006.10876v2
- Date: Thu, 20 May 2021 19:55:01 GMT
- Title: Beware the Black-Box: on the Robustness of Recent Defenses to
Adversarial Examples
- Authors: Kaleel Mahmood, Deniz Gurevin, Marten van Dijk, Phuong Ha Nguyen
- Abstract summary: We expand upon the analysis of these defenses to include adaptive black-box attacks.
Our investigation is done using two black-box adversarial models and six widely studied adversarial attacks for CIFAR-10 and Fashion-MNIST datasets.
Our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure.
- Score: 11.117775891953018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many defenses have recently been proposed at venues like NIPS, ICML, ICLR and
CVPR. These defenses are mainly focused on mitigating white-box attacks. They
do not properly examine black-box attacks. In this paper, we expand upon the
analysis of these defenses to include adaptive black-box adversaries. Our
evaluation is done on nine defenses including Barrage of Random Transforms,
ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error
Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer
Zones. Our investigation is done using two black-box adversarial models and six
widely studied adversarial attacks for CIFAR-10 and Fashion-MNIST datasets. Our
analyses show most recent defenses (7 out of 9) provide only marginal
improvements in security ($<25\%$), as compared to undefended networks. For
every defense, we also show the relationship between the amount of data the
adversary has at their disposal, and the effectiveness of adaptive black-box
attacks. Overall, our results paint a clear picture: defenses need both
thorough white-box and black-box analyses to be considered secure. We provide
this large scale study and analyses to motivate the field to move towards the
development of more robust black-box defenses.
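As a rough illustration of the adaptive black-box setting described in the abstract, the sketch below shows one way such an attack could be structured: the adversary labels its own data by querying the target, trains a surrogate on those labels, runs a white-box attack (FGSM here) on the surrogate, and checks whether the perturbed inputs transfer. Everything in it is a toy assumption rather than the paper's setup: the 2-D linear `target_label`, the logistic-regression surrogate, and the helper names are hypothetical, and the paper's experiments use deep networks on image data.

```python
import math
import random


def target_label(x, w=(1.5, -2.0), b=0.25):
    """Hypothetical black-box target: exposes only a hard label for input x."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_surrogate(data, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression surrogate on labels obtained from the target."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(data, labels):
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            gw[0] += (p - y) * x[0]
            gw[1] += (p - y) * x[1]
            gb += p - y
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """One FGSM step on the surrogate: move x along the sign of the loss gradient."""
    p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
    # Gradient of the cross-entropy loss with respect to x is (p - y) * w.
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def transfer_rate(n_queries, eps=0.3, seed=0):
    """Fraction of test points whose target label flips under surrogate-crafted FGSM."""
    rng = random.Random(seed)
    sample = lambda: [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data = [sample() for _ in range(n_queries)]
    labels = [target_label(x) for x in data]  # the adversary's only access to the target
    w, b = train_surrogate(data, labels)
    test = [sample() for _ in range(200)]
    fooled = 0
    for x in test:
        y = target_label(x)
        fooled += target_label(fgsm(x, y, w, b, eps)) != y
    return fooled / len(test)


if __name__ == "__main__":
    # Vary the amount of data available to the adversary.
    for n in (5, 50, 500):
        print(n, transfer_rate(n))
```

Varying `n_queries` mimics, in miniature, the abstract's question of how the adversary's data budget relates to attack effectiveness; the numbers this toy produces say nothing about the nine defenses evaluated in the paper.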
Related papers
- Privacy-preserving Universal Adversarial Defense for Black-box Models [20.968518031455503]
We introduce DUCD, a universal black-box defense method that does not require access to the target model's parameters or architecture.
Our approach involves querying the target model with data, creating a white-box surrogate while preserving data privacy.
Experiments on multiple image classification datasets show that DUCD not only outperforms existing black-box defenses but also matches the accuracy of white-box defenses.
arXiv Detail & Related papers (2024-08-20T08:40:39Z)
- Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks [2.9815109163161204]
Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples.
Unlike traditional preprocessing defences that rely on sanitizing input samples, our strategy counters the attack process itself.
We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences for both the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2024-03-14T10:59:54Z)
- The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks [91.56314751983133]
$A5$ is a framework to craft a defensive perturbation to guarantee that any attack towards the input in hand will fail.
We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label.
We also show how to apply $A5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z)
- Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks [28.93464970650329]
We show that stateful defense models (SDMs) are highly vulnerable to a new class of adaptive black-box attacks.
We propose a novel adaptive black-box attack strategy called Oracle-guided Adaptive Rejection Sampling (OARS).
We show how to apply the strategy to enhance six common black-box attacks, making them more effective against the current class of SDMs.
arXiv Detail & Related papers (2023-03-11T02:10:21Z)
- Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators [49.52538232104449]
It is becoming increasingly imperative to design robust ML defenses.
Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary.
We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
arXiv Detail & Related papers (2023-02-27T01:33:31Z)
- Are Defenses for Graph Neural Networks Robust? [72.1389952286628]
We show that most Graph Neural Network (GNN) defenses show no or only marginal improvement compared to an undefended baseline.
We advocate using custom adaptive attacks as a gold standard and we outline the lessons we learned from successfully designing such attacks.
Our diverse collection of perturbed graphs forms a (black-box) unit test offering a first glance at a model's robustness.
arXiv Detail & Related papers (2023-01-31T15:11:48Z)
- Adversarial Defense via Image Denoising with Chaotic Encryption [65.48888274263756]
We propose a novel defense that assumes everything but a private key will be made available to the attacker.
Our framework uses an image denoising procedure coupled with encryption via a discretized Baker map.
arXiv Detail & Related papers (2022-03-19T10:25:02Z)
- Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models [8.189696720657247]
Adversarial examples pose a threat to deep neural network models in a variety of scenarios.
We explore the use of output randomization as a defense against attacks in both the black box and white box models.
arXiv Detail & Related papers (2021-07-08T12:27:19Z)
- Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks [72.59081183040682]
We propose dynamic defenses that adapt the model and input during testing by defensive entropy minimization (dent).
dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet.
arXiv Detail & Related papers (2021-05-18T17:55:07Z)
- Theoretical Study of Random Noise Defense against Query-Based Black-Box Attacks [72.8152874114382]
In this work, we study a simple but promising defense technique, dubbed Random Noise Defense (RND), against query-based black-box attacks.
It is lightweight and can be directly combined with any off-the-shelf models and other defense strategies.
In this work, we present solid theoretical analyses to demonstrate that the defense effect of RND against the query-based black-box attack and the corresponding adaptive attack heavily depends on the magnitude ratio.
arXiv Detail & Related papers (2021-04-23T08:39:41Z)
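
The Random Noise Defense entry above describes a lightweight wrapper that can be combined with any off-the-shelf model. A minimal sketch of that idea, assuming the defense adds Gaussian noise of scale `nu` to every incoming query before inference: `make_rnd_model`, `toy_model`, and the parameter names are hypothetical, and the sketch does not reproduce the paper's theoretical analysis.

```python
import random


def make_rnd_model(model, nu, seed=None):
    """Wrap `model` so that every query is perturbed by Gaussian noise of scale nu."""
    rng = random.Random(seed)

    def defended(x):
        noisy = [xi + nu * rng.gauss(0.0, 1.0) for xi in x]
        return model(noisy)

    return defended


def toy_model(x):
    """Hypothetical stand-in for a model's score on one class."""
    return 2.0 * x[0] - 1.0 * x[1]


def finite_difference(query, x, i, mu):
    """Attacker-side estimate of d(score)/d(x_i) from two queries with step mu."""
    x_plus = list(x)
    x_plus[i] += mu
    return (query(x_plus) - query(x)) / mu


if __name__ == "__main__":
    x = [0.5, -0.5]
    defended = make_rnd_model(toy_model, nu=0.05, seed=0)
    print("undefended estimate:", finite_difference(toy_model, x, 0, mu=0.01))
    print("defended estimate:  ", finite_difference(defended, x, 0, mu=0.01))
```

Because each of the attacker's two queries receives independent noise, the finite-difference estimate against the defended model is corrupted whenever `nu` is large relative to the attacker's step `mu`. Reading the summary's "magnitude ratio" as this ratio of noise scale to query step is an interpretation made here, not something stated in the listing.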