Related papers: Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples

URL: http://arxiv.org/abs/2006.10876v2
Date: Thu, 20 May 2021 19:55:01 GMT
Title: Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples
Authors: Kaleel Mahmood, Deniz Gurevin, Marten van Dijk, Phuong Ha Nguyen
Abstract summary: We expand upon the analysis of these defenses to include adaptive black-box attacks. Our investigation is done using two black-box adversarial models and six widely studied adversarial attacks for the CIFAR-10 and Fashion-MNIST datasets. Our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure.
Score: 11.117775891953018
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many defenses have recently been proposed at venues like NIPS, ICML, ICLR and CVPR. These defenses are mainly focused on mitigating white-box attacks. They do not properly examine black-box attacks. In this paper, we expand upon the analysis of these defenses to include adaptive black-box adversaries. Our evaluation is done on nine defenses including Barrage of Random Transforms, ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer Zones. Our investigation is done using two black-box adversarial models and six widely studied adversarial attacks for CIFAR-10 and Fashion-MNIST datasets. Our analyses show most recent defenses (7 out of 9) provide only marginal improvements in security ($<25\%$), as compared to undefended networks. For every defense, we also show the relationship between the amount of data the adversary has at their disposal, and the effectiveness of adaptive black-box attacks. Overall, our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure. We provide this large scale study and analyses to motivate the field to move towards the development of more robust black-box defenses.

Related papers

Privacy-preserving Universal Adversarial Defense for Black-box Models [20.968518031455503]
We introduce DUCD, a universal black-box defense method that does not require access to the target model's parameters or architecture. Our approach queries the target model with data, creating a white-box surrogate while preserving data privacy. Experiments on multiple image classification datasets show that DUCD not only outperforms existing black-box defenses but also matches the accuracy of white-box defenses.
arXiv Detail & Related papers (2024-08-20T08:40:39Z)
Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks [2.9815109163161204]
Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our strategy counters the attack process itself. We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences for both the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2024-03-14T10:59:54Z)
The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks [91.56314751983133]
$A5$ is a framework to craft a defensive perturbation to guarantee that any attack towards the input at hand will fail. We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label. We also show how to apply $A5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z)
Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks [28.93464970650329]
We show that stateful defense models (SDMs) are highly vulnerable to a new class of adaptive black-box attacks. We propose a novel adaptive black-box attack strategy called Oracle-guided Adaptive Rejection Sampling (OARS). We show how to apply the strategy to enhance six common black-box attacks to be more effective against the current class of SDMs.
arXiv Detail & Related papers (2023-03-11T02:10:21Z)
Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators [49.52538232104449]
It is becoming increasingly imperative to design robust ML defenses. Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary. We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
arXiv Detail & Related papers (2023-02-27T01:33:31Z)
Are Defenses for Graph Neural Networks Robust? [72.1389952286628]
We show that most Graph Neural Network (GNN) defenses show no or only marginal improvement compared to an undefended baseline. We advocate using custom adaptive attacks as a gold standard and we outline the lessons we learned from successfully designing such attacks. Our diverse collection of perturbed graphs forms a (black-box) unit test offering a first glance at a model's robustness.
arXiv Detail & Related papers (2023-01-31T15:11:48Z)
Adversarial Defense via Image Denoising with Chaotic Encryption [65.48888274263756]
We propose a novel defense that assumes everything but a private key will be made available to the attacker. Our framework uses an image denoising procedure coupled with encryption via a discretized Baker map.
arXiv Detail & Related papers (2022-03-19T10:25:02Z)
Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models [8.189696720657247]
Adversarial examples pose a threat to deep neural network models in a variety of scenarios. We explore the use of output randomization as a defense against attacks in both the black box and white box models.
arXiv Detail & Related papers (2021-07-08T12:27:19Z)
Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks [72.59081183040682]
We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent). Dent improves the robustness of adversarially-trained defenses and nominally-trained models against white-box, black-box, and adaptive attacks on CIFAR-10/100 and ImageNet.
arXiv Detail & Related papers (2021-05-18T17:55:07Z)
Theoretical Study of Random Noise Defense against Query-Based Black-Box Attacks [72.8152874114382]
In this work, we study a simple but promising defense technique, dubbed Random Noise Defense (RND), against query-based black-box attacks. It is lightweight and can be directly combined with any off-the-shelf models and other defense strategies. We present solid theoretical analyses to demonstrate that the defense effect of RND against the query-based black-box attack and the corresponding adaptive attack heavily depends on the magnitude ratio.
arXiv Detail & Related papers (2021-04-23T08:39:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.