Robustness Out of the Box: Compositional Representations Naturally
Defend Against Black-Box Patch Attacks
- URL: http://arxiv.org/abs/2012.00558v1
- Date: Tue, 1 Dec 2020 15:04:23 GMT
- Title: Robustness Out of the Box: Compositional Representations Naturally
Defend Against Black-Box Patch Attacks
- Authors: Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille
- Abstract summary: Patch-based adversarial attacks introduce a perceptible but localized change to the input that induces misclassification.
In this work, we study two different approaches for defending against black-box patch attacks.
We find that adversarial training has limited effectiveness against state-of-the-art location-optimized patch attacks.
- Score: 11.429509031463892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Patch-based adversarial attacks introduce a perceptible but localized change
to the input that induces misclassification. While progress has been made in
defending against imperceptible attacks, it remains unclear how patch-based
attacks can be resisted. In this work, we study two different approaches for
defending against black-box patch attacks. First, we show that adversarial
training, which is successful against imperceptible attacks, has limited
effectiveness against state-of-the-art location-optimized patch attacks.
Second, we find that compositional deep networks, which have part-based
representations that lead to innate robustness to natural occlusion, are robust
to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition
Benchmark (GTSRB), without adversarial training. Moreover, the robustness of
compositional models outperforms that of adversarially trained standard models
by a large margin. However, on GTSRB, we observe that they have problems
discriminating between similar traffic signs with fine-grained differences. We
overcome this limitation by introducing part-based finetuning, which improves
fine-grained recognition. By leveraging compositional representations, this is
the first work that defends against black-box patch attacks without expensive
adversarial training. This defense is more robust than adversarial training and
more interpretable because it can locate and ignore adversarial patches.
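As a concrete illustration of the threat model in the abstract, below is a minimal sketch of a black-box patch attack that searches over patch contents and locations using only model queries (no gradients). The function and parameter names (`model`, `patch_size`, `queries`) are illustrative assumptions, not the specific attack evaluated in the paper.

```python
import torch

def blackbox_patch_attack(model, image, label, patch_size=16, queries=500):
    """Random-search patch attack: uses only the model's output scores.

    Tries random patch contents and locations, keeping the candidate that
    most reduces the score of the true class. Illustrative sketch only.
    """
    _, H, W = image.shape
    best = image.clone()
    with torch.no_grad():
        best_score = model(best.unsqueeze(0)).softmax(-1)[0, label].item()
    for _ in range(queries):
        x = torch.randint(0, W - patch_size + 1, (1,)).item()
        y = torch.randint(0, H - patch_size + 1, (1,)).item()
        candidate = image.clone()
        # Perceptible but localized change: overwrite a square region.
        candidate[:, y:y + patch_size, x:x + patch_size] = torch.rand(
            image.shape[0], patch_size, patch_size)
        with torch.no_grad():
            score = model(candidate.unsqueeze(0)).softmax(-1)[0, label].item()
        if score < best_score:  # keep the most damaging patch so far
            best, best_score = candidate, score
    return best
```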
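The abstract's claim that the compositional defense "can locate and ignore adversarial patches" can be pictured with the following loose sketch, assuming a model that exposes per-location part-model and occluder-model scores. The scoring interface here is a hypothetical stand-in, not the paper's actual compositional-model implementation.

```python
import torch

def mask_and_classify(part_scores, occlusion_scores, class_priors):
    """Loose sketch of 'locate and ignore': down-weight image regions that
    the occluder model explains better than any object part model.

    part_scores:      (C, H, W) per-class part-model log-likelihoods
    occlusion_scores: (H, W)    occluder/outlier model log-likelihoods
    class_priors:     (C,)      log class priors
    All inputs are hypothetical stand-ins for a compositional model's outputs.
    """
    # A location is treated as occluded (e.g., by an adversarial patch)
    # when the occluder model explains it better than the part models.
    occluded = occlusion_scores > part_scores.max(dim=0).values
    # Ignore occluded locations: only visible evidence votes for a class.
    visible_evidence = torch.where(occluded, torch.zeros(()), part_scores)
    class_scores = class_priors + visible_evidence.sum(dim=(1, 2))
    return class_scores.argmax(), occluded  # prediction + located patch mask
```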
Related papers
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods fail to address both of these issues well.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
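For background on the G-PGA entry above: a minimal sketch of standard L-infinity projected gradient descent (PGD), the attack G-PGA extends. The guided-surrogate mechanism itself is not described in enough detail here to reproduce, so this shows only the well-known baseline; the hyperparameters are typical defaults, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD; the baseline that G-PGA builds on.
    G-PGA's guidance replaces the random restarts and step-size search
    that plain PGD often needs."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()          # ascend the loss
            delta.clamp_(-eps, eps)                     # project to eps-ball
            delta.copy_((x + delta).clamp(0, 1) - x)    # keep pixels in [0,1]
        delta.grad.zero_()
    return (x + delta).detach()
```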
- Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning [10.368343314144553]
We provide a game-theoretic framework for ensemble adversarial attacks and defenses.
We propose three new attack algorithms, specifically designed to target defenses with randomized transformations, multi-model voting schemes, and adversarial detector architectures.
arXiv Detail & Related papers (2022-11-26T21:35:01Z)
- Generative Dynamic Patch Attack [6.1863763890100065]
We propose an end-to-end patch attack algorithm, Generative Dynamic Patch Attack (GDPA).
GDPA generates both patch pattern and patch location adversarially for each input image.
Experiments on VGGFace, Traffic Sign and ImageNet show that GDPA achieves higher attack success rates than state-of-the-art patch attacks.
arXiv Detail & Related papers (2021-11-08T04:15:34Z)
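The GDPA entry above says that patch pattern and location are generated adversarially per input, end to end. One way to make the location differentiable is a soft placement mask, sketched below; this is our simplified stand-in, not GDPA's actual generator or placement module.

```python
import torch

def place_patch_soft(image, pattern, cx, cy, size=16.0, sharpness=2.0):
    """Differentiable patch placement: blend a generated full-resolution
    pattern into the image through a soft square mask centered at (cx, cy),
    so both the pattern and the location receive gradients end to end.
    Simplified illustration; all names here are our assumptions."""
    _, H, W = image.shape
    ys = torch.arange(H, dtype=torch.float32).view(H, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, W)
    # Soft indicator of a size x size square around (cx, cy); the sigmoid
    # keeps the mask differentiable with respect to the patch center.
    mask = torch.sigmoid(sharpness * (size / 2 - (xs - cx).abs())) * \
           torch.sigmoid(sharpness * (size / 2 - (ys - cy).abs()))
    return (1 - mask) * image + mask * pattern
```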
- Adversarial training may be a double-edged sword [50.09831237090801]
We show that some geometric consequences of adversarial training on the decision boundary of deep networks give an edge to certain types of black-box attacks.
In particular, we define a metric called robustness gain to show that while adversarial training is an effective method to dramatically improve the robustness in white-box scenarios, it may not provide such a good robustness gain against the more realistic decision-based black-box attacks.
arXiv Detail & Related papers (2021-07-24T19:09:16Z)
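The "robustness gain" metric in the entry above is not defined in the summary. One plausible reading, shown below purely for illustration, is the accuracy improvement an adversarially trained model achieves over a standard model under the same attack; the paper's exact definition may differ.

```python
def robustness_gain(robust_acc_at, robust_acc_std):
    """One plausible formalization of 'robustness gain': the improvement in
    accuracy under a fixed attack when switching from a standard model to an
    adversarially trained (AT) one. Illustrative only."""
    return robust_acc_at - robust_acc_std

# Illustrative numbers only: a large gain under white-box PGD can coexist
# with a small gain under a decision-based black-box attack.
print(robustness_gain(0.45, 0.00))  # white-box scenario
print(robustness_gain(0.30, 0.25))  # decision-based black-box scenario
```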
- Grey-box Adversarial Attack And Defence For Sentiment Classification [19.466940655682727]
We introduce a grey-box adversarial attack and defence framework for sentiment classification.
We address the issues of differentiability, label preservation and input reconstruction for adversarial attack and defence in one unified framework.
arXiv Detail & Related papers (2021-03-22T04:05:17Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
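The AFD entry above borrows from domain adaptation. A standard mechanism from that field is the gradient reversal layer, which trains features so that a discriminator cannot tell the clean "domain" from the adversarial one; the sketch below assumes that setup and may differ from the paper's exact architecture.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (Ganin & Lempitsky): identity on the forward
    pass, negated gradient on the backward pass. A common domain-adaptation
    building block for learning domain-invariant features."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def afd_style_loss(features_clean, features_adv, task_loss, discriminator):
    """Sketch: a discriminator tries to tell clean from adversarial features;
    the reversed gradients push the encoder toward perturbation-invariant
    features. `discriminator` maps features to two-way clean-vs-adv logits;
    all names here are our assumptions, not the paper's API."""
    feats = torch.cat([features_clean, features_adv], dim=0)
    domain = torch.cat([torch.zeros(len(features_clean)),
                        torch.ones(len(features_adv))]).long()
    logits = discriminator(GradReverse.apply(feats))
    domain_loss = torch.nn.functional.cross_entropy(logits, domain)
    return task_loss + domain_loss
```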
- Adversarial Training against Location-Optimized Adversarial Patches [84.96938953835249]
We consider adversarial patches: clearly visible but adversarially crafted rectangular patches in images.
We first devise a practical approach to obtain adversarial patches while actively optimizing their location within the image.
We apply adversarial training on these location-optimized adversarial patches and demonstrate significantly improved robustness on CIFAR10 and GTSRB.
arXiv Detail & Related papers (2020-05-05T16:17:00Z)
- Certified Defenses for Adversarial Patches [72.65524549598126]
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
This paper studies certified and empirical defenses against patch attacks.
arXiv Detail & Related papers (2020-03-14T19:57:31Z)