Learning Defense Transformers for Counterattacking Adversarial Examples
- URL: http://arxiv.org/abs/2103.07595v1
- Date: Sat, 13 Mar 2021 02:03:53 GMT
- Title: Learning Defense Transformers for Counterattacking Adversarial Examples
- Authors: Jincheng Li, Jiezhang Cao, Yifan Zhang, Jian Chen, Mingkui Tan
- Abstract summary: Deep neural networks (DNNs) are vulnerable to adversarial examples with small perturbations.
Existing defense methods focus on some specific types of adversarial examples and may fail to defend well in real-world applications.
We study adversarial examples from a new perspective: whether we can defend against them by pulling them back to the original clean distribution.
- Score: 43.59730044883175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples with small
perturbations. Adversarial defense has thus become an important means of
improving the robustness of DNNs against adversarial examples.
Existing defense methods focus on some specific types of adversarial examples
and may fail to defend well in real-world applications. In practice, we may
face many types of attacks, and the exact type of adversarial examples
encountered in real-world applications may even be unknown. In this paper,
motivated by the observation that adversarial examples are more likely to
appear near the classification boundary, we study adversarial examples from a
new perspective: whether we can defend against them by pulling them back to
the original clean distribution. We theoretically and empirically verify the existence of
defense affine transformations that restore adversarial examples. Relying on
this, we learn a defense transformer to counterattack the adversarial examples
by parameterizing the affine transformations and exploiting the boundary
information of DNNs. Extensive experiments on both toy and real-world datasets
demonstrate the effectiveness and generalization of our defense transformer.
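For intuition, here is a minimal sketch of the general idea in the abstract: a small network predicts per-image affine parameters and warps the input before a frozen classifier, trained so adversarial inputs are mapped back to points the classifier labels correctly. The module names, architecture, and loss below are illustrative assumptions, not the paper's exact parameterization or its boundary-aware objective; PyTorch is assumed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DefenseTransformer(nn.Module):
    """Hypothetical defense module: predicts a per-image 2x3 affine matrix
    and applies it to the input before a (frozen) classifier."""
    def __init__(self, in_channels=3):
        super().__init__()
        # Small CNN that regresses the 6 parameters of the affine matrix.
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 6)
        # Start at the identity transform so the untrained defense is a no-op.
        nn.init.zeros_(self.head.weight)
        self.head.bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.head(self.features(x)).view(-1, 2, 3)  # per-image affine parameters
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

def defense_loss(defense, classifier, x_adv, y_clean):
    # Classifier weights are assumed frozen (requires_grad=False); gradients
    # flow through it only to update the defense transformer.
    x_restored = defense(x_adv)
    logits = classifier(x_restored)
    return F.cross_entropy(logits, y_clean)
```

At test time such a module would simply be prepended to the classifier; because it is initialized to the identity transform, clean inputs can pass through largely unchanged while the learned warp acts mainly on adversarial inputs.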
Related papers
- Detecting Adversarial Examples [24.585379549997743]
We propose a novel method to detect adversarial examples by analyzing the layer outputs of Deep Neural Networks.
Our method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
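A minimal sketch of the layer-output idea, under the assumption that intermediate activations are summarized by simple statistics and fed to a small binary detector; the monitored layers, statistics, and detector here are illustrative, not the paper's method (PyTorch assumed):

```python
import torch
import torch.nn as nn

def collect_layer_stats(model, x, layer_names):
    """Return per-sample mean/std of activations from the named layers."""
    feats, hooks = {}, []
    def make_hook(name):
        def hook(module, inp, out):
            flat = out.detach().flatten(1)
            feats[name] = torch.cat(
                [flat.mean(1, keepdim=True), flat.std(1, keepdim=True)], dim=1)
        return hook
    for name, module in model.named_modules():
        if name in layer_names:
            hooks.append(module.register_forward_hook(make_hook(name)))
    model(x)                      # forward pass just to trigger the hooks
    for h in hooks:
        h.remove()
    return torch.cat([feats[n] for n in layer_names], dim=1)  # (batch, 2 * len(layer_names))

# Small detector trained on stats from clean vs. adversarial inputs
# (2 statistics x 3 monitored layers in this illustrative setup).
detector = nn.Sequential(nn.Linear(2 * 3, 16), nn.ReLU(), nn.Linear(16, 1))
```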
arXiv Detail & Related papers (2024-10-22T21:42:59Z) - The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for
Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
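A minimal sketch of the consistency idea, assuming one-step FGSM-style perturbations in both directions and a KL term between the two outputs; the paper's actual attack schedule and loss weighting may differ (PyTorch assumed):

```python
import torch
import torch.nn.functional as F

def fgsm_step(model, x, y, eps, ascend=True):
    """One signed-gradient step; ascend=True yields an adversarial example,
    ascend=False its 'inverse adversarial' counterpart."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    sign = 1.0 if ascend else -1.0
    return (x + sign * eps * grad.sign()).clamp(0, 1).detach()

def inverse_adversary_loss(model, x, y, eps=8 / 255, beta=1.0):
    x_adv = fgsm_step(model, x, y, eps, ascend=True)   # pushes toward the decision boundary
    x_inv = fgsm_step(model, x, y, eps, ascend=False)  # pulls toward the high-confidence region
    ce = F.cross_entropy(model(x_adv), y)
    consistency = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                           F.softmax(model(x_inv), dim=1),
                           reduction="batchmean")
    return ce + beta * consistency
```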
arXiv Detail & Related papers (2022-11-01T15:24:26Z) - TREATED:Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z) - Towards Defending against Adversarial Examples via Attack-Invariant
Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z) - Internal Wasserstein Distance for Adversarial Attack and Defense [40.27647699862274]
We propose an internal Wasserstein distance (IWD) to measure image similarity between a sample and its adversarial example.
We develop a novel attack method by capturing the distribution of patches in original samples.
We also build a new defense method that seeks to learn robust models to defend against unseen adversarial examples.
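A minimal sketch of the "distance between patch distributions" idea, approximated here with a sliced Wasserstein distance over raw patches; this is an illustration only, not the paper's internal Wasserstein distance (PyTorch assumed):

```python
import torch

def patch_distribution(x, patch=8, stride=4):
    """x: (C, H, W) -> (num_patches, C * patch * patch)."""
    c = x.size(0)
    p = x.unsqueeze(0).unfold(2, patch, stride).unfold(3, patch, stride)  # (1, C, nh, nw, patch, patch)
    p = p.permute(0, 2, 3, 1, 4, 5).contiguous()                          # group by patch location
    return p.view(-1, c * patch * patch)

def sliced_wasserstein(a, b, n_proj=64):
    """Approximate Wasserstein distance between two point clouds via
    sorted 1-D random projections."""
    proj = torch.randn(a.size(1), n_proj, device=a.device)
    proj = proj / proj.norm(dim=0, keepdim=True)
    pa, _ = torch.sort(a @ proj, dim=0)
    pb, _ = torch.sort(b @ proj, dim=0)
    return (pa - pb).abs().mean()

def patch_wasserstein_distance(img, img_adv, patch=8, stride=4):
    # Both images share the same shape, so they yield the same number of patches.
    return sliced_wasserstein(patch_distribution(img, patch, stride),
                              patch_distribution(img_adv, patch, stride))
```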
arXiv Detail & Related papers (2021-03-13T02:08:02Z) - Detecting Adversarial Examples by Input Transformations, Defense
Perturbations, and Voting [71.57324258813674]
Convolutional neural networks (CNNs) have been shown to reach super-human performance in visual recognition tasks.
CNNs can easily be fooled by adversarial examples, i.e., maliciously-crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology.
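A minimal sketch of the voting idea, assuming a few cheap, label-preserving transformations and an agreement threshold; the paper's specific transformations and defense perturbations are not reproduced (PyTorch and torchvision assumed):

```python
import torch
import torchvision.transforms.functional as TF

def transformed_copies(x):
    # x: (N, C, H, W); a handful of mild transformations
    return [
        x,
        TF.hflip(x),
        TF.gaussian_blur(x, kernel_size=3),
        TF.adjust_brightness(x, 1.1),
    ]

@torch.no_grad()
def detect_by_voting(model, x, min_agreement=0.75):
    """Flag inputs whose top-1 prediction is unstable across transformed copies."""
    preds = torch.stack([model(c).argmax(dim=1) for c in transformed_copies(x)])  # (T, N)
    base = preds[0]
    agreement = (preds == base).float().mean(dim=0)  # fraction agreeing with the untransformed input
    is_adversarial = agreement < min_agreement
    return is_adversarial, base
```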
arXiv Detail & Related papers (2021-01-27T14:50:41Z) - Advocating for Multiple Defense Strategies against Adversarial Examples [66.90877224665168]
It has been empirically observed that defense mechanisms designed to protect neural networks against $\ell_\infty$ adversarial examples offer poor performance.
In this paper we conduct a geometrical analysis that validates this observation.
Then, we provide a number of empirical insights to illustrate the effect of this phenomenon in practice.
arXiv Detail & Related papers (2020-12-04T14:42:46Z) - Can We Mitigate Backdoor Attack Using Adversarial Detection Methods? [26.8404758315088]
We conduct comprehensive studies on the connections between adversarial examples and backdoor examples of Deep Neural Networks.
Our insights are based on the observation that both adversarial examples and backdoor examples have anomalies during the inference process.
We revise four existing adversarial defense methods for detecting backdoor examples.
arXiv Detail & Related papers (2020-06-26T09:09:27Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
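A minimal sketch of the self-supervised idea, assuming perturbations are crafted by maximizing the feature distortion of a feature extractor and a purifier network is trained in input space to undo them; names and losses are illustrative, not the paper's exact formulation (PyTorch assumed):

```python
import torch
import torch.nn.functional as F

def self_supervised_attack(feature_extractor, x, eps=8 / 255, steps=5, alpha=2 / 255):
    """Craft a perturbation without labels by maximizing feature distortion."""
    x = x.detach()
    with torch.no_grad():
        clean_feats = feature_extractor(x)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.mse_loss(feature_extractor(x_adv), clean_feats)   # maximize this distortion
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()  # project to the eps-ball
    return x_adv

def purifier_loss(purifier, feature_extractor, x_clean, x_adv):
    # Train the purifier so purified adversarial inputs match clean inputs
    # both in pixel space and in feature space.
    x_pur = purifier(x_adv)
    return F.mse_loss(x_pur, x_clean) + F.mse_loss(
        feature_extractor(x_pur), feature_extractor(x_clean))
```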
arXiv Detail & Related papers (2020-06-08T20:42:39Z)