Removing Adversarial Noise in Class Activation Feature Space
- URL: http://arxiv.org/abs/2104.09197v1
- Date: Mon, 19 Apr 2021 10:42:24 GMT
- Title: Removing Adversarial Noise in Class Activation Feature Space
- Authors: Dawei Zhou, Nannan Wang, Chunlei Peng, Xinbo Gao, Xiaoyu Wang, Jun Yu,
Tongliang Liu
- Abstract summary: We propose to remove adversarial noise by implementing a self-supervised adversarial training mechanism in a class activation feature space.
We train a denoising model to minimize the distances between the adversarial examples and the natural examples in the class activation feature space.
Empirical evaluations demonstrate that our method could significantly enhance adversarial robustness in comparison to previous state-of-the-art approaches.
- Score: 160.78488162713498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to adversarial noise.
Preprocessing-based defenses can largely remove adversarial noise by
processing inputs. However, they are typically affected by the error
amplification effect, especially in the face of continuously evolving attacks.
To solve this problem, in this paper, we propose to remove adversarial noise by
implementing a self-supervised adversarial training mechanism in a class
activation feature space. To be specific, we first maximize the disruptions to
class activation features of natural examples to craft adversarial examples.
Then, we train a denoising model to minimize the distances between the
adversarial examples and the natural examples in the class activation feature
space. Empirical evaluations demonstrate that our method could significantly
enhance adversarial robustness in comparison to previous state-of-the-art
approaches, especially against unseen adversarial attacks and adaptive attacks.
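To make the two-step idea above concrete, the following is a minimal PyTorch-style sketch, not the authors' released code. It assumes a frozen, pretrained classifier exposing its penultimate feature maps via `classifier.features(x)` and its final linear layer via `classifier.fc` (both assumptions), and it approximates "class activation features" by weighting those feature maps with the class-specific classifier weights; the attack maximizes the disruption to these features, and the denoiser is trained to minimize the same distance. Hyperparameters are illustrative.

```python
# Hedged sketch of the two-step idea: (1) craft adversarial examples by maximizing
# the disruption to class activation features, (2) train a denoiser to pull the
# adversarial examples back toward the natural ones in that feature space.
# `classifier.features` / `classifier.fc` are assumed interfaces; the classifier is frozen.
import torch
import torch.nn.functional as F

def class_activation_features(classifier, x, labels):
    """Weight penultimate feature maps by the classifier weights of the given class,
    a simple stand-in for 'class activation features'."""
    fmaps = classifier.features(x)          # N x C x H x W (assumed hook)
    w = classifier.fc.weight[labels]        # N x C (assumed final linear layer)
    return fmaps * w[:, :, None, None]

def craft_adversarial(classifier, x, labels, eps=8/255, steps=10, alpha=2/255):
    """Self-supervised PGD-style attack: maximize the disruption to the
    class activation features of the natural examples."""
    with torch.no_grad():
        nat_caf = class_activation_features(classifier, x, labels)
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        adv_caf = class_activation_features(classifier, x_adv, labels)
        loss = F.mse_loss(adv_caf, nat_caf)          # disruption to maximize
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def denoiser_loss(classifier, denoiser, x, x_adv, labels):
    """Train the denoiser (any image-to-image network, e.g. a U-Net) to minimize
    the class-activation-feature distance to the natural examples."""
    x_den = denoiser(x_adv)
    with torch.no_grad():
        nat_caf = class_activation_features(classifier, x, labels)
    den_caf = class_activation_features(classifier, x_den, labels)
    return F.mse_loss(den_caf, nat_caf)
```

In such a setup only the denoiser's parameters would be updated; the classifier stays fixed so that the feature space used for both the attack and the denoising objective does not drift.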
Related papers
- F$^2$AT: Feature-Focusing Adversarial Training via Disentanglement of Natural and Perturbed Patterns [74.03108122774098]
Deep neural networks (DNNs) are vulnerable to adversarial examples crafted by well-designed perturbations.
This could lead to disastrous results on critical applications such as self-driving cars, surveillance security, and medical diagnosis.
We propose Feature-Focusing Adversarial Training (F$^2$AT), which forces the model to focus on the core features from natural patterns.
arXiv Detail & Related papers (2023-10-23T04:31:42Z)
- Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria [61.048842737581865]
We show that Adversarial Training (AT) can fail to learn robust features, resulting in poor adversarial robustness.
We propose a generic AT framework that gains robust representations through asymmetric negative contrast and reverse attention.
Empirical evaluations on three benchmark datasets show our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-10-05T07:29:29Z)
- Robust Deep Learning Models Against Semantic-Preserving Adversarial Attack [3.7264705684737893]
Deep learning models can be fooled by small $l_p$-norm adversarial perturbations as well as natural, attribute-level perturbations.
We propose a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training.
arXiv Detail & Related papers (2023-04-08T08:28:36Z)
- Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attacks [15.761683760167777]
A powerful category of (invisible) data poisoning attacks modifies a subset of training examples by small adversarial perturbations to change the prediction of certain test-time data.
Here, we propose a highly effective approach that, unlike existing methods, breaks various types of invisible poisoning attacks with only a slight drop in generalization performance.
Our approach comprises two components: an optimized friendly noise that is generated to maximally perturb examples without degrading performance, and a randomly varying noise component (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-08-14T02:41:05Z)
- Perturbation Inactivation Based Adversarial Defense for Face Recognition [45.73745401760292]
Deep learning-based face recognition models are vulnerable to adversarial attacks.
A straightforward approach is to inactivate the adversarial perturbations so that they can be easily handled as general perturbations.
A plug-and-play adversarial defense method, named perturbation inactivation (PIN), is proposed to inactivate adversarial perturbations for adversarial defense.
arXiv Detail & Related papers (2022-07-13T08:33:15Z)
- On Procedural Adversarial Noise Attack And Defense [2.5388455804357952]
Adversarial examples can mislead neural networks into making prediction errors with small perturbations on the input images.
In this paper, we propose two universal adversarial perturbation (UAP) generation methods based on procedural noise functions (a sketch of this idea appears after this list).
Without changing the semantic representations, the adversarial examples generated by our methods achieve superior attack performance.
arXiv Detail & Related papers (2021-08-10T02:47:01Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Improving Adversarial Robustness via Channel-wise Activation Suppressing [65.72430571867149]
The study of adversarial examples and their activations has attracted significant attention for secure and robust learning with deep neural networks (DNNs).
In this paper, we highlight two new characteristics of adversarial examples from the channel-wise activation perspective.
We show that Channel-wise Activation Suppressing (CAS) can train a model that inherently suppresses adversarial activation, and that it can be easily applied to existing defense methods to further improve their robustness (a sketch of channel-wise gating appears after this list).
arXiv Detail & Related papers (2021-03-11T03:44:16Z)
- Learning to Generate Noise for Multi-Attack Robustness [126.23656251512762]
Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods against adversarial perturbations.
However, such defenses are typically effective only against the single type of attack they are trained on; in safety-critical applications this is limiting, as the attacker can adopt diverse adversaries to deceive the system.
We propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks.
arXiv Detail & Related papers (2020-06-22T10:44:05Z)
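Below is a minimal sketch of the "friendly noise" idea from the data poisoning defense entry above (illustrative only, not the authors' implementation): the noise is optimized to be as large as possible while leaving the model's predictions nearly unchanged, and a randomly varying component is added on top at training time. Function names and hyperparameters are assumptions.

```python
# Hedged sketch of "friendly noise": maximize the noise magnitude subject to the
# model's output staying close to its clean output, then add random noise at training time.
import torch
import torch.nn.functional as F

def generate_friendly_noise(model, x, eps=16/255, steps=30, lr=0.1, lam=1.0):
    with torch.no_grad():
        clean_logits = model(x)
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        noisy_logits = model((x + delta).clamp(0, 1))
        # Keep predictions unchanged (small KL) while rewarding large noise.
        kl = F.kl_div(F.log_softmax(noisy_logits, dim=1),
                      F.softmax(clean_logits, dim=1), reduction="batchmean")
        loss = kl - lam * delta.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)      # keep the friendly noise bounded
    return delta.detach()

def friendly_augment(x, friendly, sigma=16/255):
    """Training-time input: friendly noise plus a randomly varying component."""
    return (x + friendly + sigma * torch.randn_like(x)).clamp(0, 1)
```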
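Next, a minimal sketch of a procedural-noise universal perturbation in the spirit of the "On Procedural Adversarial Noise Attack And Defense" entry: an oriented, Gabor-like pattern scaled into an $l_\infty$ ball. The noise generator and its parameters are illustrative; the paper's exact procedural noise functions may differ.

```python
# Hedged sketch of a procedural-noise universal adversarial perturbation (UAP) candidate:
# a deterministic, image-agnostic pattern generated from a procedural noise function.
import numpy as np

def sine_gabor_noise(h, w, freq=0.06, angle=0.8, sigma=12.0, seed=0):
    """Oriented sinusoid modulated by randomly placed Gaussian kernels (Gabor-like)."""
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    carrier = np.sin(2 * np.pi * freq * (xs * np.cos(angle) + ys * np.sin(angle)))
    envelope = np.zeros((h, w), dtype=np.float32)
    for _ in range(16):                                  # a few random kernel centers
        cy, cx = rng.uniform(0, h), rng.uniform(0, w)
        envelope += np.exp(-(((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2)))
    noise = carrier * envelope
    return noise / (np.abs(noise).max() + 1e-12)         # normalize to [-1, 1]

def procedural_uap(h, w, eps=8/255, **kwargs):
    """Scale the pattern into an L-infinity ball of radius eps and broadcast over RGB."""
    pattern = sine_gabor_noise(h, w, **kwargs)
    return eps * np.repeat(pattern[None, :, :], 3, axis=0)   # 3 x H x W
```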