PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier
- URL: http://arxiv.org/abs/2108.09135v1
- Date: Fri, 20 Aug 2021 12:09:33 GMT
- Title: PatchCleanser: Certifiably Robust Defense against Adversarial Patches for Any Image Classifier
- Authors: Chong Xiang, Saeed Mahloujifar, Prateek Mittal
- Abstract summary: The adversarial patch attack against image classification models aims to inject adversarially crafted pixels within a localized, restricted image region (i.e., a patch) to induce misclassification.
We propose PatchCleanser as a robust defense against adversarial patches that is compatible with any image classification model.
We extensively evaluate our defense on the ImageNet, ImageNette, CIFAR-10, CIFAR-100, SVHN, and Flowers-102 datasets.
- Score: 30.559585856170216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The adversarial patch attack against image classification models aims to
inject adversarially crafted pixels within a localized restricted image region
(i.e., a patch) for inducing model misclassification. This attack can be
realized in the physical world by printing and attaching the patch to the
victim object and thus poses a real-world threat to computer vision systems.
To counter this threat, we propose PatchCleanser as a certifiably robust
defense against adversarial patches that is compatible with any image
classifier. In PatchCleanser, we perform two rounds of pixel masking on the
input image to neutralize the effect of the adversarial patch. In the first
round of masking, we apply a set of carefully generated masks to the input
image and evaluate the model prediction on every masked image. If model
predictions on all one-masked images reach a unanimous agreement, we output the
agreed prediction label. Otherwise, we perform a second round of masking to
settle the disagreement, in which we evaluate model predictions on two-masked
images to robustly recover the correct prediction label. Notably, we can prove
that our defense will always make correct predictions on certain images against
any adaptive white-box attacker within our threat model, achieving certified
robustness. We extensively evaluate our defense on the ImageNet, ImageNette,
CIFAR-10, CIFAR-100, SVHN, and Flowers-102 datasets and demonstrate that our
defense achieves clean accuracy comparable to that of state-of-the-art
classification models and significantly improves certified robustness over prior works.
Notably, our defense can achieve 83.8% top-1 clean accuracy and 60.4% top-1
certified robust accuracy against a 2%-pixel square patch anywhere on the
1000-class ImageNet dataset.
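The two-round procedure above maps directly to code. Below is a minimal Python sketch, not the authors' implementation: the `classify` callback, zero-valued masking, and the covering mask-set construction (mask size = estimated patch size + stride - 1, so that every possible patch location falls entirely inside at least one mask) are assumptions made for illustration.

```python
import numpy as np

def mask_positions_1d(img_len, patch_len, stride):
    """1-D mask starts; mask_len = patch_len + stride - 1 guarantees any
    patch of patch_len pixels lies fully inside at least one mask."""
    mask_len = patch_len + stride - 1
    starts = list(range(0, img_len - mask_len + 1, stride))
    if starts[-1] != img_len - mask_len:        # clamp the last mask to the edge
        starts.append(img_len - mask_len)
    return starts, mask_len

def generate_masks(img_size, patch_size, stride):
    """2-D mask set as the product of row and column 1-D positions."""
    ys, mh = mask_positions_1d(img_size[0], patch_size[0], stride)
    xs, mw = mask_positions_1d(img_size[1], patch_size[1], stride)
    masks = []
    for y in ys:
        for x in xs:
            keep = np.ones(img_size, dtype=bool)    # True = pixel kept
            keep[y:y + mh, x:x + mw] = False        # False = masked out
            masks.append(keep)
    return masks

def double_masking(image, masks, classify):
    """Two-round masking: a unanimous first round is returned directly;
    otherwise each disagreeing mask is checked with a second round."""
    def predict(img, *ms):
        out = img.copy()
        for m in ms:
            out[~m] = 0                             # zero out masked pixels
        return classify(out)

    first = [predict(image, m) for m in masks]
    labels, counts = np.unique(first, return_counts=True)
    majority = labels[np.argmax(counts)]
    if len(labels) == 1:                            # round 1: unanimous agreement
        return majority
    for m_i, y_i in zip(masks, first):              # round 2: settle the dispute
        if y_i == majority:
            continue
        if all(predict(image, m_i, m_j) == y_i for m_j in masks):
            return y_i                              # consistent disagreer wins
    return majority
```

If the mask set covers every possible patch location, at least one first-round mask removes the patch entirely; this coverage property is what the certification argument builds on.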
Related papers
- Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks [66.98297584796391]
Image stitching seamlessly integrates images captured from varying perspectives into a single wide field-of-view image.
Given a pair of captured images, subtle perturbations and distortions that go unnoticed by the human visual system can disrupt correspondence matching.
This paper presents the first attempt to improve the robustness of image stitching against adversarial attacks.
arXiv Detail & Related papers (2024-02-25T02:36:33Z) - Revisiting Image Classifier Training for Improved Certified Robust
Defense against Adversarial Patches [7.90470727433401]
We propose a two-round greedy masking strategy (Greedy Cutout) which finds an approximate worst-case mask location with much less compute.
We show that models trained with our Greedy Cutout improve certified robust accuracy over Random Cutout in PatchCleanser across a range of datasets.
arXiv Detail & Related papers (2023-06-22T00:13:44Z) - Task-agnostic Defense against Adversarial Patch Attacks [25.15948648034204]
Adversarial patch attacks mislead neural networks by injecting adversarial pixels within a designated local region.
We present PatchZero, a task-agnostic defense against white-box adversarial patches.
Our method achieves state-of-the-art robust accuracy without any degradation in benign performance.
arXiv Detail & Related papers (2022-07-05T03:49:08Z) - Towards Practical Certifiable Patch Defense with Vision Transformer [34.00374565048962]
We introduce the Vision Transformer (ViT) into the framework of Derandomized Smoothing (DS).
For efficient inference and real-world deployment, we reconstruct the global self-attention structure of the original ViT into isolated band-unit self-attention.
arXiv Detail & Related papers (2022-03-16T10:39:18Z) - Segment and Complete: Defending Object Detectors against Adversarial
Patch Attacks with Robust Patch Detection [142.24869736769432]
Adversarial patch attacks pose a serious threat to state-of-the-art object detectors.
We propose Segment and Complete defense (SAC), a framework for defending object detectors against patch attacks.
We show SAC can significantly reduce the targeted attack success rate of physical patch attacks.
arXiv Detail & Related papers (2021-12-08T19:18:48Z) - PatchGuard++: Efficient Provable Attack Detection against Adversarial
Patches [28.94435153159868]
An adversarial patch can arbitrarily manipulate image pixels within a restricted region to induce model misclassification.
Recent provably robust defenses generally follow the PatchGuard framework by using CNNs with small receptive fields.
We extend PatchGuard to PatchGuard++, which provably detects adversarial patch attacks, boosting both provable robust accuracy and clean accuracy.
arXiv Detail & Related papers (2021-04-26T14:22:33Z) - FaceGuard: A Self-Supervised Defense Against Adversarial Face Images [59.656264895721215]
We propose a new self-supervised adversarial defense framework, namely FaceGuard, that can automatically detect, localize, and purify a wide variety of adversarial faces.
During training, FaceGuard automatically synthesizes challenging and diverse adversarial attacks, enabling a classifier to learn to distinguish them from real faces.
Experimental results on the LFW dataset show that FaceGuard achieves 99.81% detection accuracy on six unseen adversarial attack types.
arXiv Detail & Related papers (2020-11-28T21:18:46Z) - PatchGuard: A Provably Robust Defense against Adversarial Patches via
Small Receptive Fields and Masking [46.03749650789915]
Localized adversarial patches aim to induce misclassification in machine learning models by arbitrarily modifying pixels within a restricted region of an image.
We propose a general defense framework called PatchGuard that can achieve high provable robustness while maintaining high clean accuracy against localized adversarial patches.
arXiv Detail & Related papers (2020-05-17T03:38:34Z) - Certified Defenses for Adversarial Patches [72.65524549598126]
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
This paper studies certified and empirical defenses against patch attacks.
arXiv Detail & Related papers (2020-03-14T19:57:31Z) - (De)Randomized Smoothing for Certifiable Defense against Patch Attacks [136.79415677706612]
We introduce a certifiable defense against patch attacks that provides guarantees for a given image and patch attack size.
Our method is related to the broad class of randomized smoothing robustness schemes.
Our results effectively establish a new state of the art in certifiable defense against patch attacks on CIFAR-10 and ImageNet (see the certification sketch after this list).
arXiv Detail & Related papers (2020-02-25T08:39:46Z)