Dual Manifold Adversarial Robustness: Defense against Lp and non-Lp
Adversarial Attacks
- URL: http://arxiv.org/abs/2009.02470v1
- Date: Sat, 5 Sep 2020 06:00:28 GMT
- Title: Dual Manifold Adversarial Robustness: Defense against Lp and non-Lp
Adversarial Attacks
- Authors: Wei-An Lin, Chun Pong Lau, Alexander Levine, Rama Chellappa, Soheil
Feizi
- Abstract summary: Adversarial training is a popular defense strategy against attack threat models with bounded Lp norms.
We propose Dual Manifold Adversarial Training (DMAT) where adversarial perturbations in both latent and image spaces are used in robustifying the model.
Our DMAT improves performance on normal images, and achieves comparable robustness to the standard adversarial training against Lp attacks.
- Score: 154.31827097264264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is a popular defense strategy against attack threat
models with bounded Lp norms. However, it often degrades the model performance
on normal images and the defense does not generalize well to novel attacks.
Given the success of deep generative models such as GANs and VAEs in
characterizing the underlying manifold of images, we investigate whether or not
the aforementioned problems can be remedied by exploiting the underlying
manifold information. To this end, we construct an "On-Manifold ImageNet"
(OM-ImageNet) dataset by projecting the ImageNet samples onto the manifold
learned by StyleGAN. For this dataset, the underlying manifold information is
exact. Using OM-ImageNet, we first show that adversarial training in the latent
space of images improves both standard accuracy and robustness to on-manifold
attacks. However, since no out-of-manifold perturbations are realized, the
defense can be broken by Lp adversarial attacks. We further propose Dual
Manifold Adversarial Training (DMAT) where adversarial perturbations in both
latent and image spaces are used in robustifying the model. Our DMAT improves
performance on normal images, and achieves comparable robustness to the
standard adversarial training against Lp attacks. In addition, we observe that
models defended by DMAT achieve improved robustness against novel attacks which
manipulate images by global color shifts or various types of image filtering.
Interestingly, similar improvements are also achieved when the defended models
are tested on out-of-manifold natural images. These results demonstrate the
potential benefits of using manifold information in enhancing robustness of
deep learning models against various types of novel adversarial attacks.
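To make the training procedure concrete, the following is a minimal PyTorch-style sketch of one DMAT step, assuming a pretrained StyleGAN generator (generator) that maps latent codes to on-manifold images and a classifier under training (classifier); the function names, PGD step sizes, perturbation budgets, and the equal weighting of the two loss terms are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn.functional as F


def pgd_image(classifier, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Standard L-infinity PGD in image space (off-manifold attack).
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(classifier(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()


def pgd_latent(classifier, generator, z, y, eps=0.02, alpha=0.005, steps=10):
    # PGD in the latent space of the generator; G(z + delta) stays on the image manifold.
    delta = torch.zeros_like(z)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(classifier(generator(z + delta)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta.detach() + alpha * grad.sign()).clamp(-eps, eps)
    return generator(z + delta).detach()


def dmat_step(classifier, generator, optimizer, z, y):
    # One DMAT update: train on both on-manifold and image-space adversarial examples.
    with torch.no_grad():
        x = generator(z)                                # on-manifold (OM-ImageNet) sample
    x_on = pgd_latent(classifier, generator, z, y)      # on-manifold adversary
    x_off = pgd_image(classifier, x, y)                 # Lp (off-manifold) adversary
    loss = F.cross_entropy(classifier(x_on), y) + F.cross_entropy(classifier(x_off), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

The essential point of this sketch is that each batch contributes two adversarial views of the same on-manifold sample: the latent-space attack G(z + delta) remains on the image manifold, while the image-space PGD example covers the Lp threat model that latent-only adversarial training leaves undefended.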
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z) - AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models [7.406040859734522]
Unrestricted adversarial attacks present a serious threat to deep learning models and adversarial defense techniques.
Previous attack methods often directly inject Projected Gradient Descent (PGD) gradients into the sampling of generative models.
We propose a new method, called AdvDiff, to generate unrestricted adversarial examples with diffusion models.
arXiv Detail & Related papers (2023-07-24T03:10:02Z) - Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuned model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z) - Deep Image Destruction: A Comprehensive Study on Vulnerability of Deep
Image-to-Image Models against Adversarial Attacks [104.8737334237993]
We present comprehensive investigations into the vulnerability of deep image-to-image models to adversarial attacks.
For five popular image-to-image tasks, 16 deep models are analyzed from various standpoints.
We show that unlike in image classification tasks, the performance degradation on image-to-image tasks can vary substantially depending on various factors.
arXiv Detail & Related papers (2021-04-30T14:20:33Z) - AdvHaze: Adversarial Haze Attack [19.744435173861785]
We introduce a novel adversarial attack method based on haze, which is a common phenomenon in real-world scenery.
Our method can synthesize potentially adversarial haze into an image, based on the atmospheric scattering model, with high realism.
We demonstrate that the proposed method achieves a high success rate, and holds better transferability across different classification models than the baselines.
arXiv Detail & Related papers (2021-04-28T09:52:25Z) - Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z) - Applying Tensor Decomposition to image for Robustness against
Adversarial Attack [3.347059384111439]
Deep learning models can easily be fooled by adding small perturbations to the input.
In this paper, we suggest applying tensor decomposition to defend the model against adversarial examples.
arXiv Detail & Related papers (2020-02-28T18:30:22Z)