Robustness Out of the Box: Compositional Representations Naturally
Defend Against Black-Box Patch Attacks
- URL: http://arxiv.org/abs/2012.00558v1
- Date: Tue, 1 Dec 2020 15:04:23 GMT
- Title: Robustness Out of the Box: Compositional Representations Naturally
Defend Against Black-Box Patch Attacks
- Authors: Christian Cosgrove, Adam Kortylewski, Chenglin Yang, Alan Yuille
- Abstract summary: Patch-based adversarial attacks introduce a perceptible but localized change to the input that induces misclassification.
In this work, we study two different approaches for defending against black-box patch attacks.
We find that adversarial training has limited effectiveness against state-of-the-art location-optimized patch attacks.
- Score: 11.429509031463892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Patch-based adversarial attacks introduce a perceptible but localized change
to the input that induces misclassification. While progress has been made in
defending against imperceptible attacks, it remains unclear how patch-based
attacks can be resisted. In this work, we study two different approaches for
defending against black-box patch attacks. First, we show that adversarial
training, which is successful against imperceptible attacks, has limited
effectiveness against state-of-the-art location-optimized patch attacks.
Second, we find that compositional deep networks, which have part-based
representations that lead to innate robustness to natural occlusion, are robust
to patch attacks on PASCAL3D+ and the German Traffic Sign Recognition
Benchmark (GTSRB), without adversarial training. Moreover, the robustness of
compositional models outperforms that of adversarially trained standard models
by a large margin. However, on GTSRB, we observe that they have problems
discriminating between similar traffic signs with fine-grained differences. We
overcome this limitation by introducing part-based finetuning, which improves
fine-grained recognition. By leveraging compositional representations, this is
the first work that defends against black-box patch attacks without expensive
adversarial training. This defense is more robust than adversarial training and
more interpretable because it can locate and ignore adversarial patches.
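As a concrete illustration of the threat model in the abstract, below is a minimal sketch of a black-box patch attack that searches over patch contents and locations using only model queries (no gradients). The function and parameter names (`model`, `patch_size`, `queries`) are illustrative assumptions, not the specific attack evaluated in the paper.

```python
import torch

def blackbox_patch_attack(model, image, label, patch_size=16, queries=500):
    """Random-search patch attack: uses only the model's output scores.

    Tries random patch contents and locations, keeping the candidate that
    most reduces the score of the true class. Illustrative sketch only.
    """
    _, H, W = image.shape
    best = image.clone()
    with torch.no_grad():
        best_score = model(best.unsqueeze(0)).softmax(-1)[0, label].item()
    for _ in range(queries):
        x = torch.randint(0, W - patch_size + 1, (1,)).item()
        y = torch.randint(0, H - patch_size + 1, (1,)).item()
        candidate = image.clone()
        # Perceptible but localized change: overwrite a square region.
        candidate[:, y:y + patch_size, x:x + patch_size] = torch.rand(
            image.shape[0], patch_size, patch_size)
        with torch.no_grad():
            score = model(candidate.unsqueeze(0)).softmax(-1)[0, label].item()
        if score < best_score:  # keep the most damaging patch so far
            best, best_score = candidate, score
    return best
```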
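The abstract's claim that the compositional defense "can locate and ignore adversarial patches" can be pictured with the following loose sketch, assuming a model that exposes per-location part-model and occluder-model scores. The scoring interface here is a hypothetical stand-in, not the paper's actual compositional-model implementation.

```python
import torch

def mask_and_classify(part_scores, occlusion_scores, class_priors):
    """Loose sketch of 'locate and ignore': down-weight image regions that
    the occluder model explains better than any object part model.

    part_scores:      (C, H, W) per-class part-model log-likelihoods
    occlusion_scores: (H, W)    occluder/outlier model log-likelihoods
    class_priors:     (C,)      log class priors
    All inputs are hypothetical stand-ins for a compositional model's outputs.
    """
    # A location is treated as occluded (e.g., by an adversarial patch)
    # when the occluder model explains it better than the part models.
    occluded = occlusion_scores > part_scores.max(dim=0).values
    # Ignore occluded locations: only visible evidence votes for a class.
    visible_evidence = torch.where(occluded, torch.zeros(()), part_scores)
    class_scores = class_priors + visible_evidence.sum(dim=(1, 2))
    return class_scores.argmax(), occluded  # prediction + located patch mask
```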
Related papers
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods fail to address both of these issues well.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
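For background on the G-PGA entry above: a minimal sketch of standard L-infinity projected gradient descent (PGD), the attack G-PGA extends. The guided-surrogate mechanism itself is not described in enough detail here to reproduce, so this shows only the well-known baseline; the hyperparameters are typical defaults, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD; the baseline that G-PGA builds on.
    G-PGA's guidance replaces the random restarts and step-size search
    that plain PGD often needs."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()          # ascend the loss
            delta.clamp_(-eps, eps)                     # project to eps-ball
            delta.copy_((x + delta).clamp(0, 1) - x)    # keep pixels in [0,1]
        delta.grad.zero_()
    return (x + delta).detach()
```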
- Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning [10.368343314144553]
We provide a game-theoretic framework for ensemble adversarial attacks and defenses.
We propose three new attack algorithms, specifically designed to target defenses with randomized transformations, multi-model voting schemes, and adversarial detector architectures.
arXiv Detail & Related papers (2022-11-26T21:35:01Z)
- Generative Dynamic Patch Attack [6.1863763890100065]
We propose an end-to-end patch attack algorithm, Generative Dynamic Patch Attack (GDPA).
GDPA generates both patch pattern and patch location adversarially for each input image.
Experiments on VGGFace, Traffic Sign and ImageNet show that GDPA achieves higher attack success rates than state-of-the-art patch attacks.
arXiv Detail & Related papers (2021-11-08T04:15:34Z)
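The GDPA entry above says that patch pattern and location are generated adversarially per input, end to end. One way to make the location differentiable is a soft placement mask, sketched below; this is our simplified stand-in, not GDPA's actual generator or placement module.

```python
import torch

def place_patch_soft(image, pattern, cx, cy, size=16.0, sharpness=2.0):
    """Differentiable patch placement: blend a generated full-resolution
    pattern into the image through a soft square mask centered at (cx, cy),
    so both the pattern and the location receive gradients end to end.
    Simplified illustration; all names here are our assumptions."""
    _, H, W = image.shape
    ys = torch.arange(H, dtype=torch.float32).view(H, 1)
    xs = torch.arange(W, dtype=torch.float32).view(1, W)
    # Soft indicator of a size x size square around (cx, cy); the sigmoid
    # keeps the mask differentiable with respect to the patch center.
    mask = torch.sigmoid(sharpness * (size / 2 - (xs - cx).abs())) * \
           torch.sigmoid(sharpness * (size / 2 - (ys - cy).abs()))
    return (1 - mask) * image + mask * pattern
```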
- Adversarial training may be a double-edged sword [50.09831237090801]
We show that some geometric consequences of adversarial training on the decision boundary of deep networks give an edge to certain types of black-box attacks.
In particular, we define a metric called robustness gain to show that while adversarial training is an effective method to dramatically improve the robustness in white-box scenarios, it may not provide such a good robustness gain against the more realistic decision-based black-box attacks.
arXiv Detail & Related papers (2021-07-24T19:09:16Z)
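The "robustness gain" metric in the entry above is not defined in the summary. One plausible reading, shown below purely for illustration, is the accuracy improvement an adversarially trained model achieves over a standard model under the same attack; the paper's exact definition may differ.

```python
def robustness_gain(robust_acc_at, robust_acc_std):
    """One plausible formalization of 'robustness gain': the improvement in
    accuracy under a fixed attack when switching from a standard model to an
    adversarially trained (AT) one. Illustrative only."""
    return robust_acc_at - robust_acc_std

# Illustrative numbers only: a large gain under white-box PGD can coexist
# with a small gain under a decision-based black-box attack.
print(robustness_gain(0.45, 0.00))  # white-box scenario
print(robustness_gain(0.30, 0.25))  # decision-based black-box scenario
```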
- Grey-box Adversarial Attack And Defence For Sentiment Classification [19.466940655682727]
We introduce a grey-box adversarial attack and defence framework for sentiment classification.
We address the issues of differentiability, label preservation and input reconstruction for adversarial attack and defence in one unified framework.
arXiv Detail & Related papers (2021-03-22T04:05:17Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
- Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
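The AFD entry above borrows from domain adaptation. A standard mechanism from that field is the gradient reversal layer, which trains features so that a discriminator cannot tell the clean "domain" from the adversarial one; the sketch below assumes that setup and may differ from the paper's exact architecture.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (Ganin & Lempitsky): identity on the forward
    pass, negated gradient on the backward pass. A common domain-adaptation
    building block for learning domain-invariant features."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def afd_style_loss(features_clean, features_adv, task_loss, discriminator):
    """Sketch: a discriminator tries to tell clean from adversarial features;
    the reversed gradients push the encoder toward perturbation-invariant
    features. `discriminator` maps features to two-way clean-vs-adv logits;
    all names here are our assumptions, not the paper's API."""
    feats = torch.cat([features_clean, features_adv], dim=0)
    domain = torch.cat([torch.zeros(len(features_clean)),
                        torch.ones(len(features_adv))]).long()
    logits = discriminator(GradReverse.apply(feats))
    domain_loss = torch.nn.functional.cross_entropy(logits, domain)
    return task_loss + domain_loss
```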
- Adversarial Training against Location-Optimized Adversarial Patches [84.96938953835249]
We consider adversarial patches: clearly visible but adversarially crafted rectangular patches in images.
We first devise a practical approach to obtain adversarial patches while actively optimizing their location within the image.
We apply adversarial training on these location-optimized adversarial patches and demonstrate significantly improved robustness on CIFAR10 and GTSRB.
arXiv Detail & Related papers (2020-05-05T16:17:00Z)
- Certified Defenses for Adversarial Patches [72.65524549598126]
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
This paper studies certified and empirical defenses against patch attacks.
arXiv Detail & Related papers (2020-03-14T19:57:31Z)