Likelihood Landscapes: A Unifying Principle Behind Many Adversarial
Defenses
- URL: http://arxiv.org/abs/2008.11300v1
- Date: Tue, 25 Aug 2020 22:51:51 GMT
- Title: Likelihood Landscapes: A Unifying Principle Behind Many Adversarial
Defenses
- Authors: Fu Lin, Rohit Mittapalli, Prithvijit Chattopadhyay, Daniel Bolya, Judy
Hoffman
- Abstract summary: We investigate the potential effect defense techniques have on the geometry of the likelihood landscape.
A subset of adversarial defense techniques results in a similar effect of flattening the likelihood landscape.
- Score: 15.629921195632857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks have been shown to be vulnerable to adversarial
examples, which are known to lie in subspaces close to the normal data but are
not naturally occurring and are of low probability. In this work, we
investigate the potential effect defense techniques have on the geometry of the
likelihood landscape, i.e., the likelihood of the input images under the trained model.
We first propose a way to visualize the likelihood landscape leveraging an
energy-based model interpretation of discriminative classifiers. Then we
introduce a measure to quantify the flatness of the likelihood landscape. We
observe that a subset of adversarial defense techniques results in a similar
effect of flattening the likelihood landscape. We further explore directly
regularizing towards a flat landscape for adversarial robustness.
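The abstract's key ingredients admit a compact sketch. Under the energy-based-model reading of a discriminative classifier, the logits f(x) define an unnormalized input log-likelihood, log p(x) = logsumexp_y f(x)[y] - log Z, so relative likelihoods in a neighborhood of an image can be read off the classifier directly. The snippet below is a minimal PyTorch sketch of this idea, together with one plausible flatness proxy and a flatness-regularized training loss; the function names, the perturbation radius eps, the sample counts, and the exact forms of the flatness score and penalty are illustrative assumptions, not the paper's definitions.

```python
# Minimal, illustrative sketch (not the authors' released code) of:
# (1) an unnormalized input log-likelihood obtained from classifier logits via
#     the energy-based-model interpretation,
# (2) a 2-D likelihood-landscape slice around an image for visualization,
# (3) a simple flatness proxy and a hypothetical flatness-regularized loss.
# eps, span, steps, n_samples, and lam are illustrative choices.
import torch
import torch.nn.functional as F


def log_likelihood(model, x):
    """log p(x) up to the x-independent partition function:
    log p(x) = logsumexp_y f(x)[y] - log Z."""
    logits = model(x)                      # (batch, num_classes)
    return torch.logsumexp(logits, dim=1)  # (batch,)


@torch.no_grad()
def landscape_slice(model, x, span=1.0, steps=21):
    """Evaluate log p on a grid spanned by two random directions around a
    single image x of shape (1, C, H, W), for visualizing the landscape."""
    d1, d2 = torch.randn_like(x), torch.randn_like(x)
    d1, d2 = d1 / d1.norm(), d2 / d2.norm()
    coords = torch.linspace(-span, span, steps)
    grid = torch.empty(steps, steps)
    for i, a in enumerate(coords):
        for j, b in enumerate(coords):
            grid[i, j] = log_likelihood(model, x + a * d1 + b * d2).item()
    return grid  # e.g. plot with matplotlib contourf / plot_surface


def flatness(model, x, eps=0.1, n_samples=16):
    """A plausible flatness proxy: mean absolute change of log p(x) under
    random perturbations inside an eps-ball (smaller = flatter)."""
    base = log_likelihood(model, x)
    diffs = [
        (log_likelihood(model, x + eps * torch.randn_like(x)) - base).abs()
        for _ in range(n_samples)
    ]
    return torch.stack(diffs).mean(dim=0)  # (batch,)


def training_loss(model, x, y, lam=0.1):
    """Hypothetical combined objective: cross-entropy plus a flatness penalty,
    in the spirit of directly regularizing towards a flat landscape."""
    ce = F.cross_entropy(model(x), y)
    return ce + lam * flatness(model, x).mean()
```

A flatter landscape corresponds to smaller values of the proxy above; comparing these values across models trained with different defenses is one way to probe the flattening effect the abstract reports.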
Related papers
- Hide in Thicket: Generating Imperceptible and Rational Adversarial
Perturbations on 3D Point Clouds [62.94859179323329]
Adversarial attack methods based on point manipulation for 3D point cloud classification have revealed the fragility of 3D models.
We propose a novel shape-based adversarial attack method, HiT-ADV, which conducts a two-stage search for attack regions based on saliency and imperceptibility perturbation scores.
We propose that by employing benign resampling and benign rigid transformations, we can further enhance physical adversarial strength with little sacrifice to imperceptibility.
arXiv Detail & Related papers (2024-03-08T12:08:06Z)
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples across deep neural networks.
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
- LatentForensics: Towards frugal deepfake detection in the StyleGAN latent space [2.629091178090276]
We propose a deepfake detection method that operates in the latent space of a state-of-the-art generative adversarial network (GAN) trained on high-quality face images.
Experimental results on standard datasets reveal that the proposed approach outperforms other state-of-the-art deepfake classification methods.
arXiv Detail & Related papers (2023-03-30T08:36:48Z)
- Transferable Physical Attack against Object Detection with Separable Attention [14.805375472459728]
Transferable adversarial attacks have remained in the spotlight since deep learning models were shown to be vulnerable to adversarial samples.
In this paper, we put forward a novel method of generating physically realizable adversarial camouflage to achieve transferable attack against detection models.
arXiv Detail & Related papers (2022-05-19T14:34:55Z)
- Attack to Fool and Explain Deep Networks [59.97135687719244]
Countering the view that adversarial perturbations are mere noise, we provide evidence of human-meaningful patterns in adversarial perturbations.
Our major contribution is a novel pragmatic adversarial attack that is subsequently transformed into a tool to interpret the visual models.
arXiv Detail & Related papers (2021-06-20T03:07:36Z)
- Where and What? Examining Interpretable Disentangled Representations [96.32813624341833]
Capturing interpretable variations has long been one of the goals in disentanglement learning.
Unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting.
In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to interpret and what to interpret.
arXiv Detail & Related papers (2021-04-07T11:22:02Z)
- Spatially Correlated Patterns in Adversarial Images [5.069312274160184]
Adversarial attacks have proved to be a major impediment to progress in research on reliable machine learning solutions.
We propose a framework for segregating and isolating regions within an input image that are critical to classification (during inference), to adversarial vulnerability, or to both.
arXiv Detail & Related papers (2020-11-21T14:06:59Z)
- Adversarial Patch Attacks on Monocular Depth Estimation Networks [7.089737454146505]
We propose a method of adversarial patch attack on monocular depth estimation.
We generate artificial patterns that can fool the target methods into estimating an incorrect depth for the regions where the patterns are placed.
Our method can be implemented in the real world by physically placing the printed patterns in real scenes.
arXiv Detail & Related papers (2020-10-06T22:56:22Z)
- Face Anti-Spoofing Via Disentangled Representation Learning [90.90512800361742]
Face anti-spoofing is crucial to the security of face recognition systems.
We propose a novel perspective on face anti-spoofing that disentangles liveness features and content features from images.
arXiv Detail & Related papers (2020-08-19T03:54:23Z)
- Targeted Adversarial Perturbations for Monocular Depth Prediction [74.61708733460927]
We study the effect of adversarial perturbations on the task of monocular depth prediction.
Specifically, we explore the ability of small, imperceptible additive perturbations to selectively alter the perceived geometry of the scene.
We show that such perturbations can not only globally re-scale the predicted distances from the camera, but also alter the prediction to match a different target scene.
arXiv Detail & Related papers (2020-06-12T19:29:43Z)
- Towards Feature Space Adversarial Attack [18.874224858723494]
We propose a new adversarial attack on Deep Neural Networks for image classification.
Our attack focuses on perturbing abstract features, more specifically, features that denote styles.
We show that our attack can generate adversarial samples that are more natural-looking than the state-of-the-art attacks.
arXiv Detail & Related papers (2020-04-26T13:56:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.