Immune Defense: A Novel Adversarial Defense Mechanism for Preventing the
Generation of Adversarial Examples
- URL: http://arxiv.org/abs/2303.04502v1
- Date: Wed, 8 Mar 2023 10:47:17 GMT
- Title: Immune Defense: A Novel Adversarial Defense Mechanism for Preventing the
Generation of Adversarial Examples
- Authors: Jinwei Wang, Hao Wu, Haihua Wang, Jiawei Zhang, Xiangyang Luo, Bin Ma
- Abstract summary: The vulnerability of Deep Neural Networks (DNNs) to adversarial examples has been confirmed.
We propose a novel adversarial defense mechanism, which is referred to as immune defense.
This mechanism applies carefully designed quasi-imperceptible perturbations to the raw images to prevent the generation of adversarial examples.
- Score: 32.649613813876954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vulnerability of Deep Neural Networks (DNNs) to adversarial examples has
been confirmed. Existing adversarial defenses primarily aim at preventing
adversarial examples from attacking DNNs successfully, rather than preventing
their generation. If the generation of adversarial examples is unregulated,
images within reach are no longer secure and pose a threat to non-robust DNNs.
Although gradient obfuscation attempts to address this issue, it has been shown
to be circumventable. Therefore, we propose a novel adversarial defense
mechanism, referred to as immune defense, which is an example-based
pre-defense. This mechanism applies carefully designed quasi-imperceptible
perturbations to raw images to prevent the generation of adversarial examples
for those images, thereby protecting both the images and the DNNs. These
perturbed images are referred to as Immune Examples (IEs). For the white-box
immune defense, we provide a gradient-based approach and an optimization-based
approach.
respectively. Additionally, the more complex black-box immune defense is taken
into consideration. We propose Masked Gradient Sign Descent (MGSD), which
reduces approximation error and stabilizes the update, thereby improving the
transferability of IEs and ensuring their effectiveness against black-box
adversarial
attacks. The experimental results demonstrate that the optimization-based
approach has superior performance and better visual quality in white-box immune
defense. In contrast, the gradient-based approach has stronger transferability,
and the proposed MGSD significantly improves the transferability of the
baselines.
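The abstract does not spell out the update rules, but the white-box, gradient-based idea can be illustrated with a small, purely hypothetical sketch: treat immune example crafting as a min-max problem in which a one-step sign attack is simulated against the protected image and the perturbation is updated so that the simulated attack still leaves the prediction correct. The function name, loss, and hyperparameters below are assumptions, not the authors' algorithm.

```python
# Hypothetical sketch of a gradient-based immune example (IE) generator.
# NOT the paper's algorithm: it only illustrates the stated idea of perturbing
# a raw image so that a subsequent attack launched from it fails.
import torch
import torch.nn.functional as F

def make_immune_example(model, x, y, eps_immune=4/255, eps_attack=8/255,
                        alpha=1/255, steps=50):
    """Craft IEs for an image batch x (values in [0, 1]) with true labels y."""
    model.eval()
    delta = torch.zeros_like(x)                      # the IE perturbation
    for _ in range(steps):
        delta.requires_grad_(True)

        # Simulate the adversary: one FGSM-style step from the protected image.
        attack_loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        attack_dir = torch.autograd.grad(attack_loss, delta)[0].sign().detach()

        # Defender objective: the simulated attack should still be classified
        # correctly, so minimize its loss with respect to delta.
        x_attacked = (x + delta + eps_attack * attack_dir).clamp(0, 1)
        defense_loss = F.cross_entropy(model(x_attacked), y)
        grad = torch.autograd.grad(defense_loss, delta)[0]

        # Signed descent step, projected onto the quasi-imperceptible budget.
        with torch.no_grad():
            delta = (delta - alpha * grad.sign()).clamp(-eps_immune, eps_immune)
    return (x + delta).clamp(0, 1).detach()
```

The paper's optimization-based white-box variant and the black-box MGSD update (a masked sign descent aimed at reducing approximation error and stabilizing updates for better transferability) would replace the inner steps above; their exact forms are not given in the abstract and are not reproduced here.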
Related papers
- Privacy-preserving Universal Adversarial Defense for Black-box Models [20.968518031455503]
We introduce DUCD, a universal black-box defense method that does not require access to the target model's parameters or architecture.
Our approach queries the target model with data to create a white-box surrogate while preserving data privacy.
Experiments on multiple image classification datasets show that DUCD not only outperforms existing black-box defenses but also matches the accuracy of white-box defenses.
arXiv Detail & Related papers (2024-08-20T08:40:39Z)
- Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods aim to address both of these issues.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- Adversarial example generation with AdaBelief Optimizer and Crop Invariance [8.404340557720436]
Adversarial attacks can be an important method to evaluate and select robust models in safety-critical applications.
We propose AdaBelief Iterative Fast Gradient Method (ABI-FGM) and Crop-Invariant attack Method (CIM) to improve the transferability of adversarial examples.
Our method has higher success rates than state-of-the-art gradient-based attack methods.
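As a rough illustration of the ABI-FGM idea, the sketch below plugs AdaBelief-style first- and second-moment estimates into an iterative attack; the paper's exact update rule and the crop-invariance (CIM) component are not reproduced, and all names and defaults are assumptions.

```python
# Hypothetical AdaBelief-guided iterative attack in the spirit of ABI-FGM.
import torch
import torch.nn.functional as F

def adabelief_iterative_attack(model, x, y, eps=8/255, alpha=2/255, steps=10,
                               beta1=0.9, beta2=0.999, tol=1e-8):
    """Perturb x (N, C, H, W, values in [0, 1]) to raise the loss for labels y."""
    x_adv = x.clone().detach()
    m = torch.zeros_like(x)   # first moment: running mean of gradients
    s = torch.zeros_like(x)   # AdaBelief "belief": variance of (g - m)
    for t in range(1, steps + 1):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        g = torch.autograd.grad(loss, x_adv)[0]

        # AdaBelief-style moment updates with bias correction.
        m = beta1 * m + (1 - beta1) * g
        s = beta2 * s + (1 - beta2) * (g - m) ** 2
        direction = (m / (1 - beta1 ** t)) / ((s / (1 - beta2 ** t)).sqrt() + tol)

        with torch.no_grad():
            # Scale the step so its largest per-sample entry is alpha, then
            # project back into the eps-ball around the clean image.
            scale = direction.abs().amax(dim=(1, 2, 3), keepdim=True) + 1e-12
            x_adv = x_adv + alpha * direction / scale
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```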
arXiv Detail & Related papers (2021-02-07T06:00:36Z)
- Error Diffusion Halftoning Against Adversarial Examples [85.11649974840758]
Adversarial examples contain carefully crafted perturbations that can fool deep neural networks into making wrong predictions.
We propose a new image transformation defense based on error diffusion halftoning, and combine it with adversarial training to defend against adversarial examples.
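For context, error diffusion halftoning quantizes each pixel and pushes the quantization error onto neighbors that have not yet been visited. The sketch below implements the classic Floyd-Steinberg variant on a grayscale image; the paper's actual transformation pipeline and its combination with adversarial training are not reproduced here.

```python
import numpy as np

def floyd_steinberg_halftone(img):
    """Binarize a grayscale image (2-D array, values in [0, 1]) by error diffusion."""
    out = img.astype(np.float64).copy()
    h, w = out.shape
    for y in range(h):
        for x in range(w):
            old = out[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            # Classic Floyd-Steinberg weights: 7/16, 3/16, 5/16, 1/16.
            if x + 1 < w:
                out[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x - 1 >= 0:
                    out[y + 1, x - 1] += err * 3 / 16
                out[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    out[y + 1, x + 1] += err * 1 / 16
    return out
```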
arXiv Detail & Related papers (2021-01-23T07:55:02Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples, which are synthesized by adding quasi-imperceptible noise to real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- AdvFoolGen: Creating Persistent Troubles for Deep Classifiers [17.709146615433458]
We present a new black-box attack termed AdvFoolGen, which can generate attacking images from the same feature space as that of the natural images.
We demonstrate the effectiveness and robustness of our attack in the face of state-of-the-art defense techniques.
arXiv Detail & Related papers (2020-07-20T21:27:41Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.