Generalizing Universal Adversarial Attacks Beyond Additive Perturbations
- URL: http://arxiv.org/abs/2010.07788v2
- Date: Thu, 29 Oct 2020 18:20:21 GMT
- Title: Generalizing Universal Adversarial Attacks Beyond Additive Perturbations
- Authors: Yanghao Zhang, Wenjie Ruan, Fu Wang, Xiaowei Huang
- Abstract summary: We show that a universal adversarial attack can also be achieved via non-additive perturbation.
We propose a novel unified yet flexible framework for universal adversarial attacks, called GUAP.
Experiments are conducted on CIFAR-10 and ImageNet datasets with six deep neural network models.
- Score: 8.72462752199025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous studies have shown that universal adversarial attacks can fool deep
neural networks over a large set of input images with a single human-imperceptible
perturbation. However, current methods for universal adversarial attacks are
based on additive perturbations, which cause misclassification when directly
added to the input images. In this paper, for the
first time, we show that a universal adversarial attack can also be achieved
via non-additive perturbation (e.g., spatial transformation). More importantly,
to unify both additive and non-additive perturbations, we propose a novel
unified yet flexible framework for universal adversarial attacks, called GUAP,
which is able to initiate attacks by additive perturbation, non-additive
perturbation, or the combination of both. Extensive experiments are conducted
on CIFAR-10 and ImageNet datasets with six deep neural network models including
GoogLeNet, VGG16/19, ResNet101/152, and DenseNet121. The empirical
results demonstrate that GUAP achieves attack success rates of up to 90.9% and
99.24% on the CIFAR-10 and ImageNet datasets, improvements of over 15% and 19%,
respectively, over current state-of-the-art universal adversarial
attacks. The code for reproducing the experiments in this paper is available at
https://github.com/TrustAI/GUAP.
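To make the combined attack concrete, the sketch below shows one plausible way to compose a universal spatial (flow-field) warp with a universal additive perturbation in PyTorch. This is an illustrative sketch only: the function name, the composition order (warp first, then add), and the L_inf clipping are assumptions, not the authors' GUAP implementation; in the paper the universal flow field and the additive noise are produced by a trained generator rather than supplied by hand.

```python
# Illustrative sketch, not the authors' implementation: compose a universal
# spatial (flow-field) warp with a universal additive perturbation.
import torch
import torch.nn.functional as F

def apply_combined_perturbation(images, flow, delta, eps=8 / 255):
    """images: (N, C, H, W) in [0, 1]; flow: (1, H, W, 2) universal flow field
    (non-additive, spatial perturbation); delta: (1, C, H, W) universal additive
    perturbation, kept inside an L_inf ball of radius eps (assumed budget)."""
    n, _, h, w = images.shape

    # Identity sampling grid in normalized [-1, 1] coordinates (x, y order).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    base_grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)  # (1, H, W, 2)

    # Non-additive step: warp every image with the same universal flow field.
    grid = (base_grid + flow).clamp(-1, 1).expand(n, -1, -1, -1)
    warped = F.grid_sample(images, grid, align_corners=True)

    # Additive step: add the same universal perturbation, bounded in L_inf.
    adv = warped + delta.clamp(-eps, eps)
    return adv.clamp(0, 1)
```

Setting `flow` to zero recovers a purely additive universal attack, and setting `delta` to zero gives the purely spatial (non-additive) variant, which is how the framework unifies both families of perturbation.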
Related papers
- Imperceptible Adversarial Attack via Invertible Neural Networks [9.190559753030001]
We introduce a novel Adversarial Attack via Invertible Neural Networks (AdvINN) method to produce robust and imperceptible adversarial examples.
Experiments on CIFAR-10, CIFAR-100, and ImageNet-1K demonstrate that the proposed AdvINN method can produce less perceptible adversarial images.
arXiv Detail & Related papers (2022-11-28T03:29:39Z)
- Enhancing the Self-Universality for Transferable Targeted Attacks [88.6081640779354]
Our new attack method is based on the observation that highly universal adversarial perturbations tend to be more transferable for targeted attacks.
Instead of optimizing perturbations across different images, optimizing across different regions of a single image to achieve self-universality removes the need for extra data.
With the feature similarity loss, our method makes the features from adversarial perturbations more dominant than those of benign images.
arXiv Detail & Related papers (2022-09-08T11:21:26Z)
- Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as the guided diffusion model for purification (GDMP).
In comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations introduced by adversarial attacks to a negligible level.
arXiv Detail & Related papers (2022-05-30T10:11:15Z)
- TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep Neural Network Systems [15.982408142401072]
Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, to Trojans that misguide or hijack the model's decisions.
A TnT is universal because any input image captured with a TnT in the scene will: i) misguide a network (untargeted attack); or ii) force the network to make a malicious decision.
We show a generalization of the attack to create patches achieving higher attack success rates than existing state-of-the-art methods.
arXiv Detail & Related papers (2021-11-19T01:35:10Z)
- Real-time Detection of Practical Universal Adversarial Perturbations [3.806971160251168]
Universal Adversarial Perturbations (UAPs) enable physically realizable and robust attacks against Deep Neural Networks (DNNs).
In this paper we propose HyperNeuron, an efficient and scalable algorithm that allows for the real-time detection of UAPs.
arXiv Detail & Related papers (2021-05-16T03:01:29Z)
- Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z)
- A Survey On Universal Adversarial Attack [68.1815935074054]
Deep neural networks (DNNs) have demonstrated remarkable performance for various applications.
They are widely known to be vulnerable to adversarial perturbations.
Universal adversarial perturbations (UAPs) fool the target DNN for most images.
arXiv Detail & Related papers (2021-03-02T06:35:09Z)
- Double Targeted Universal Adversarial Perturbations [83.60161052867534]
We introduce double targeted universal adversarial perturbations (DT-UAPs) to bridge the gap between instance-discriminative, image-dependent perturbations and generic universal perturbations.
We show the effectiveness of the proposed DTA algorithm on a wide range of datasets and also demonstrate its potential as a physical attack.
arXiv Detail & Related papers (2020-10-07T09:08:51Z)
- Evading Deepfake-Image Detectors with White- and Black-Box Attacks [75.13740810603686]
A popular forensic approach trains a neural network to distinguish real from synthetic content.
We develop five attack case studies on a state-of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators.
We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22.
arXiv Detail & Related papers (2020-04-01T17:59:59Z)
- Adversarial Attacks on Convolutional Neural Networks in Facial Recognition Domain [2.4704085162861693]
Adversarial attacks that render Deep Neural Network (DNN) classifiers vulnerable in real life represent a serious threat in autonomous vehicles, malware filters, or biometric authentication systems.
We apply the Fast Gradient Sign Method (FGSM) to introduce perturbations to a facial image dataset and then test the output on a different classifier (a minimal FGSM sketch appears after this list).
We craft a variety of different black-box attack algorithms on a facial image dataset assuming minimal adversarial knowledge.
arXiv Detail & Related papers (2020-01-30T00:25:05Z)
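For readers unfamiliar with the FGSM attack referenced above, the sketch below shows the standard single-step formulation, x_adv = x + eps * sign(grad_x L(f(x), y)). The model interface, eps budget, and [0, 1] clipping range are illustrative assumptions rather than the setup used in that paper.

```python
# Minimal FGSM sketch (standard formulation; eps and clipping are assumptions,
# not the configuration used in the cited paper).
import torch
import torch.nn.functional as F

def fgsm(model, images, labels, eps=8 / 255):
    """Return single-step FGSM adversarial examples for a batch in [0, 1]."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)   # untargeted loss
    grad, = torch.autograd.grad(loss, images)       # gradient w.r.t. the input
    adv = images + eps * grad.sign()                # step in the sign direction
    return adv.clamp(0, 1).detach()                 # keep pixels in valid range
```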