D-square-B: Deep Distribution Bound for Natural-looking Adversarial
Attack
- URL: http://arxiv.org/abs/2006.07258v2
- Date: Sat, 16 Jan 2021 23:46:39 GMT
- Title: D-square-B: Deep Distribution Bound for Natural-looking Adversarial
Attack
- Authors: Qiuling Xu, Guanhong Tao and Xiangyu Zhang
- Abstract summary: We propose a novel technique that can generate natural-looking adversarial examples by bounding the variations induced for internal activation values in some deep layer(s).
By bounding model internals instead of individual pixels, our attack admits perturbations closely coupled with existing features of the original input.
Our attack can achieve the same attack success/confidence level while having much more natural-looking adversarial perturbations.
- Score: 19.368450129985423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel technique that can generate natural-looking adversarial
examples by bounding the variations induced for internal activation values in
some deep layer(s), through a distribution quantile bound and a polynomial
barrier loss function. By bounding model internals instead of individual
pixels, our attack admits perturbations closely coupled with the existing
features of the original input, allowing the generated examples to be
natural-looking while having diverse and often substantial pixel distances from
the original input. Enforcing per-neuron distribution quantile bounds allows
addressing the non-uniformity of internal activation values. Our evaluation on
ImageNet and five different model architectures demonstrates that our attack is
quite effective. Compared to the state-of-the-art pixel space attack, semantic
attack, and feature space attack, our attack can achieve the same attack
success/confidence level while having much more natural-looking adversarial
perturbations. These perturbations piggy-back on existing local features and do
not have any fixed pixel bounds.
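The two ingredients named in the abstract, per-neuron distribution quantile bounds and a polynomial barrier loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the quantile level, polynomial degree, and barrier form below are illustrative assumptions, and the activations are stand-ins for the internal values of some deep layer.

```python
import numpy as np

def quantile_bounds(acts, q=0.05):
    """Per-neuron bounds from the q and 1-q quantiles of clean
    internal activations (acts: samples x neurons). Per-neuron
    quantiles account for the non-uniformity of activation values."""
    lo = np.quantile(acts, q, axis=0)
    hi = np.quantile(acts, 1 - q, axis=0)
    return lo, hi

def polynomial_barrier(a, lo, hi, degree=4):
    """Polynomial barrier loss: zero while each neuron's activation
    stays inside [lo, hi], and growing polynomially with the amount
    by which it leaves its bound. Degree 4 is an assumed choice."""
    over = np.maximum(a - hi, 0.0)
    under = np.maximum(lo - a, 0.0)
    return np.sum(over ** degree + under ** degree)
```

During attack optimization the barrier would be added to the adversarial objective (e.g. `total = attack_loss + lam * polynomial_barrier(layer_acts, lo, hi)`), so pixel perturbations are penalized only insofar as they push internal activations outside their clean per-neuron quantile range, with no fixed per-pixel bound.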
Related papers
- IRAD: Implicit Representation-driven Image Resampling against Adversarial Attacks [16.577595936609665]
We introduce a novel approach to counter adversarial attacks, namely, image resampling.
Image resampling transforms a discrete image into a new one, simulating the process of scene recapturing or rerendering as specified by a geometrical transformation.
We show that our method significantly enhances the adversarial robustness of diverse deep models against various attacks while maintaining high accuracy on clean images.
arXiv Detail & Related papers (2023-10-18T11:19:32Z) - LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via
Latent Ensemble Attack [11.764601181046496]
Deepfakes, malicious visual contents created by generative models, pose an increasingly harmful threat to society.
To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs.
We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
arXiv Detail & Related papers (2023-07-04T07:00:37Z) - Content-based Unrestricted Adversarial Attack [53.181920529225906]
We propose a novel unrestricted attack framework called Content-based Unrestricted Adversarial Attack.
By leveraging a low-dimensional manifold that represents natural images, we map the images onto the manifold and optimize them along its adversarial direction.
arXiv Detail & Related papers (2023-05-18T02:57:43Z) - AdvART: Adversarial Art for Camouflaged Object Detection Attacks [7.7889972735711925]
We propose a novel approach to generate naturalistic and inconspicuous adversarial patches.
Our technique is based on directly manipulating the pixel values in the patch, which gives higher flexibility and a larger optimization space.
Our attack achieves superior success rates of up to 91.19% in the digital world and 72% when deployed in smart cameras at the edge.
arXiv Detail & Related papers (2023-03-03T06:28:05Z) - Leveraging Local Patch Differences in Multi-Object Scenes for Generative
Adversarial Attacks [48.66027897216473]
We tackle a more practical problem of generating adversarial perturbations using multi-object (i.e., multiple dominant objects) images.
We propose a novel generative attack (called Local Patch Difference or LPD-Attack) where a novel contrastive loss function uses the aforesaid local differences in feature space of multi-object scenes.
Our approach outperforms baseline generative attacks with highly transferable perturbations when evaluated under different white-box and black-box settings.
arXiv Detail & Related papers (2022-09-20T17:36:32Z) - Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z) - Attribute-Guided Adversarial Training for Robustness to Natural
Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z) - Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack against mainstream normally trained and defended models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z) - Towards Feature Space Adversarial Attack [18.874224858723494]
We propose a new adversarial attack on Deep Neural Networks for image classification.
Our attack focuses on perturbing abstract features, more specifically, features that denote styles.
We show that our attack can generate adversarial samples that are more natural-looking than the state-of-the-art attacks.
arXiv Detail & Related papers (2020-04-26T13:56:31Z) - Learning to Manipulate Individual Objects in an Image [71.55005356240761]
We describe a method to train a generative model with latent factors that are independent and localized.
This means that perturbing the latent variables affects only local regions of the synthesized image, corresponding to objects.
Unlike other unsupervised generative models, ours enables object-centric manipulation, without requiring object-level annotations.
arXiv Detail & Related papers (2020-04-11T21:50:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.