Adversarial Attacks are Reversible with Natural Supervision
- URL: http://arxiv.org/abs/2103.14222v2
- Date: Mon, 29 Mar 2021 02:34:39 GMT
- Title: Adversarial Attacks are Reversible with Natural Supervision
- Authors: Chengzhi Mao, Mia Chiquier, Hao Wang, Junfeng Yang, Carl Vondrick
- Abstract summary: Images contain intrinsic structure that enables the reversal of many adversarial attacks.
We demonstrate that modifying the attacked image to restore the natural structure will reverse many types of attacks.
Our results suggest deep networks are vulnerable to adversarial examples partly because their representations do not enforce the natural structure of images.
- Score: 28.61536318614705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We find that images contain intrinsic structure that enables the reversal of many adversarial attacks. Attack vectors cause not only image classifiers to fail, but also collaterally disrupt incidental structure in the image. We demonstrate that modifying the attacked image to restore the natural structure will reverse many types of attacks, providing a defense. Experiments demonstrate significantly improved robustness for several state-of-the-art models across the CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets. Our results show that our defense is still effective even if the attacker is aware of the defense mechanism. Since our defense is deployed during inference instead of training, it is compatible with pre-trained networks as well as most other defenses. Our results suggest deep networks are vulnerable to adversarial examples partly because their representations do not enforce the natural structure of images.
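To make the defense concrete, here is a minimal PyTorch-style sketch of the inference-time reversal the abstract describes: a small corrective perturbation is optimized so that a self-supervised measure of natural image structure is restored before classification. The `natural_structure_loss` callable, the step schedule, and the budget `epsilon` are illustrative assumptions standing in for the paper's exact objective.

```python
import torch

def reverse_attack(x, classifier, natural_structure_loss,
                   steps=10, step_size=2 / 255, epsilon=8 / 255):
    """Sketch of an inference-time reversal defense: optimize a small
    corrective perturbation r so the image better satisfies a
    self-supervised natural-structure objective, then classify x + r.

    `natural_structure_loss` is a hypothetical callable scoring how
    strongly a batch of images violates natural image structure
    (lower is better); it stands in for the paper's natural supervision.
    """
    r = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = natural_structure_loss(x + r)
        grad, = torch.autograd.grad(loss, r)
        with torch.no_grad():
            r -= step_size * grad.sign()   # descend the natural-structure loss
            r.clamp_(-epsilon, epsilon)    # keep the correction small
    with torch.no_grad():
        return classifier((x + r).clamp(0, 1))
```

Because the correction runs entirely at test time, a loop like this can wrap any pre-trained classifier, which matches the abstract's compatibility claim.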
Related papers
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z)
- The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks [91.56314751983133]
$A^5$ is a framework for crafting a defensive perturbation that guarantees any attack on the input at hand will fail.
We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground-truth label, as sketched below.
We also show how to apply $A^5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z)
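A minimal sketch of such label-free defensive augmentation, assuming a hypothetical `robustifier` network that maps an image to a bounded protective perturbation (the budget and the tanh bounding are illustrative choices, not the paper's exact design):

```python
import torch

def defended_predict(x, robustifier, classifier, budget=8 / 255):
    """On-the-fly defensive augmentation sketch: the robustifier
    proposes a protective perturbation for this specific input,
    with no access to the ground-truth label."""
    with torch.no_grad():
        delta = budget * torch.tanh(robustifier(x))  # bounded defensive perturbation
        return classifier((x + delta).clamp(0, 1))
```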
- Increasing Confidence in Adversarial Robustness Evaluations [53.2174171468716]
We propose a test to identify weak attacks and thus weak defense evaluations.
Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample.
For eleven out of thirteen previously-published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it.
arXiv Detail & Related papers (2022-06-28T13:28:13Z)
- Towards Robust Stacked Capsule Autoencoder with Hybrid Adversarial Training [0.0]
Capsule networks (CapsNets) are new neural networks that classify images based on the spatial relationships of features.
The stacked capsule autoencoder (SCAE) is a state-of-the-art CapsNet and was the first to achieve unsupervised classification with CapsNets.
We propose an evasion attack against SCAE in which the attacker generates adversarial perturbations that reduce the contribution of the object capsules, as sketched below.
We also evaluate the hybrid adversarial training defense; the experimental results show that the refined SCAE model achieves 82.14% classification accuracy under the evasion attack.
arXiv Detail & Related papers (2022-02-28T13:17:21Z)
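A minimal sketch of a capsule-contribution evasion attack in this spirit, assuming a hypothetical `object_capsule_presence` callable that returns the presence probabilities of the object capsules (budgets and step counts are illustrative):

```python
import torch

def capsule_evasion_attack(x, object_capsule_presence, steps=50,
                           step_size=1 / 255, epsilon=8 / 255):
    """Evasion-attack sketch: perturb the image to suppress the total
    contribution of the object capsules, degrading the unsupervised
    classification that relies on them."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        contribution = object_capsule_presence(x + delta).sum()
        grad, = torch.autograd.grad(contribution, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()  # push capsule contributions down
            delta.clamp_(-epsilon, epsilon)
    return (x + delta).clamp(0, 1).detach()
```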
- Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks [16.017328736786922]
A Man-in-the-Middle adversary maliciously intercepts and perturbs images web users upload online.
This type of attack can raise severe ethical concerns on top of simple performance degradation.
We devise a novel bi-level optimization algorithm that finds points in the vicinity of natural images that are robust to adversarial perturbations; a sketch follows.
arXiv Detail & Related papers (2021-12-10T16:06:03Z)
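A minimal sketch of such a bi-level scheme, assuming a PGD-style inner adversary and that the image owner knows the label `y`; all loop counts, budgets, and step sizes are illustrative, not the paper's algorithm:

```python
import torch
import torch.nn.functional as F

def preemptively_robustify(x, y, model, outer_steps=5, inner_steps=5,
                           outer_lr=1 / 255, atk_eps=8 / 255, atk_lr=2 / 255):
    """Bi-level sketch: before release, shift the image to a nearby
    point whose worst-case adversarial loss is small.
    Inner loop: PGD adversary around the current point.
    Outer loop: update the robustifying shift r against that adversary."""
    r = torch.zeros_like(x)
    for _ in range(outer_steps):
        # Inner maximization: strongest perturbation within the budget.
        delta = torch.zeros_like(x, requires_grad=True)
        for _ in range(inner_steps):
            loss = F.cross_entropy(model(x + r + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta += atk_lr * grad.sign()
                delta.clamp_(-atk_eps, atk_eps)
        # Outer minimization: reduce the worst-case loss found above.
        r = r.detach().requires_grad_(True)
        worst = F.cross_entropy(model(x + r + delta.detach()), y)
        grad_r, = torch.autograd.grad(worst, r)
        r = (r - outer_lr * grad_r.sign()).detach()
    return (x + r).clamp(0, 1)
```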
- Adversarial Purification through Representation Disentanglement [21.862799765511976]
Deep learning models are vulnerable to adversarial examples and make incomprehensible mistakes.
Current defense methods, especially purification, tend to remove "noise" by learning and recovering the natural images.
In this work, we propose a novel adversarial purification scheme that disentangles natural images from adversarial perturbations as a preprocessing defense.
arXiv Detail & Related papers (2021-10-15T01:45:31Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples, which are synthesized by adding quasi-imperceptible noise to real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by synthesizing another image from scratch online for each input image, instead of removing or destroying adversarial noise; a sketch of the idea follows.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
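The sketch below illustrates the idea under a deep-image-prior-style assumption: a freshly initialized generator is optimized online to re-synthesize the input, and the reconstruction (which tends not to reproduce high-frequency adversarial noise) is what the untouched target network classifies. `make_generator` is a hypothetical factory; the loss and step count are illustrative.

```python
import torch
import torch.nn.functional as F

def purify_by_online_generation(x, make_generator, target_model,
                                steps=200, lr=1e-2):
    """Online-generator sketch: synthesize a substitute image from
    scratch for each input instead of denoising the input itself;
    the target network's parameters are never accessed or modified."""
    gen = make_generator()                   # fresh generator per input
    opt = torch.optim.Adam(gen.parameters(), lr=lr)
    z = torch.randn_like(x)                  # fixed random seed image
    for _ in range(steps):
        recon = gen(z)
        loss = F.mse_loss(recon, x)          # match the input's content
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return target_model(gen(z).clamp(0, 1))
```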
- AdvFoolGen: Creating Persistent Troubles for Deep Classifiers [17.709146615433458]
We present a new black-box attack termed AdvFoolGen, which can generate attacking images from the same feature space as that of the natural images.
We demonstrate the effectiveness and robustness of our attack in the face of state-of-the-art defense techniques.
arXiv Detail & Related papers (2020-07-20T21:27:41Z)
- GraCIAS: Grassmannian of Corrupted Images for Adversarial Security [4.259219671110274]
In this work, we propose a defense strategy that applies random image corruptions to the input image alone.
We develop proximity relationships between the projection operator of a clean image and of its adversarially perturbed version, via bounds relating geodesic distance on the Grassmannian to matrix Frobenius norms.
Unlike state-of-the-art approaches, even without any retraining, the proposed strategy achieves an absolute improvement of 4.5% in defense accuracy on ImageNet; a simplified sketch follows.
arXiv Detail & Related papers (2020-05-06T16:17:12Z)
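A heavily simplified sketch of an input-corruption defense in this spirit: several randomly corrupted copies of the input are classified and their softmax outputs averaged. The Gaussian blur corruption and ensemble size are illustrative stand-ins for the paper's Grassmannian projection construction.

```python
import torch
import torchvision.transforms.functional as TF

def corrupted_ensemble_predict(x, model, n_draws=8):
    """Input-corruption defense sketch: classify several randomly
    corrupted copies of the input and average the predictions."""
    probs = 0.0
    for _ in range(n_draws):
        sigma = float(torch.empty(1).uniform_(0.5, 2.0))  # random blur strength
        x_corrupt = TF.gaussian_blur(x, kernel_size=5, sigma=sigma)
        with torch.no_grad():
            probs = probs + model(x_corrupt).softmax(dim=-1)
    return probs / n_draws
```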
- Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending the cycle of attacks and defenses: we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)