Internal Wasserstein Distance for Adversarial Attack and Defense
- URL: http://arxiv.org/abs/2103.07598v1
- Date: Sat, 13 Mar 2021 02:08:02 GMT
- Title: Internal Wasserstein Distance for Adversarial Attack and Defense
- Authors: Jincheng Li, Jiezhang Cao, Shuhai Zhang, Yanwu Xu, Jian Chen, Mingkui Tan
- Abstract summary: We propose an internal Wasserstein distance (IWD) to measure image similarity between a sample and its adversarial example.
We develop a novel attack method by capturing the distribution of patches in original samples.
We also build a new defense method that seeks to learn robust models to defend against unseen adversarial examples.
- Score: 40.27647699862274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples that can
trigger misclassification of DNNs but may be imperceptible to human perception.
Adversarial attack has been an important way to evaluate the robustness of
DNNs. Existing attack methods construct adversarial examples by using an $\ell_p$
distance as the similarity metric for perturbing samples. However, this
kind of metric is incompatible with the underlying real-world image formation
and human visual perception. In this paper, we first propose an internal
Wasserstein distance (IWD) to measure image similarity between a sample and its
adversarial example. We apply IWD to perform adversarial attack and defense.
Specifically, we develop a novel attack method by capturing the distribution of
patches in original samples. In this case, our approach is able to generate
semantically similar but diverse adversarial examples that are more difficult
for existing defense methods to defend against. Relying on IWD, we also build a new
defense method that seeks to learn robust models to defend against unseen
adversarial examples. We provide both thorough theoretical and empirical
evidence to support our methods.
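As a hedged illustration of the core idea, the sketch below computes an entropic-regularized Wasserstein (Sinkhorn) distance between the patch distributions of two images. The patch size, cost function, and regularization strength are illustrative assumptions, not the exact construction used in the paper.

```python
# Hedged sketch: an "internal" distance between two images, computed as an
# entropic-regularized Wasserstein (Sinkhorn) distance between their patch
# distributions. Patch size, cost, and regularization are illustrative choices.
import numpy as np

def extract_patches(img, size=8, stride=8):
    """Collect non-overlapping size x size patches, flattened to vectors."""
    H, W = img.shape[:2]
    patches = [img[i:i + size, j:j + size].ravel()
               for i in range(0, H - size + 1, stride)
               for j in range(0, W - size + 1, stride)]
    return np.stack(patches)

def sinkhorn_distance(X, Y, eps=0.05, n_iter=200):
    """Entropic OT between uniform empirical measures on patch sets X and Y."""
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** 2  # pairwise cost
    C = C / C.max()                                                  # scale for stability
    K = np.exp(-C / eps)
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    u = np.ones_like(a)
    for _ in range(n_iter):                                          # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                                  # transport plan
    return float((P * C).sum())

# Toy usage: compare a clean image with a slightly perturbed copy.
rng = np.random.default_rng(0)
clean = rng.random((32, 32)).astype(np.float32)
adv = np.clip(clean + 0.03 * rng.standard_normal(clean.shape), 0, 1)
print(sinkhorn_distance(extract_patches(clean), extract_patches(adv)))
```

Under this view, two images whose patches can be cheaply transported onto each other are treated as similar, even when their pixel-wise $\ell_p$ distance is large.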
Related papers
- Detecting Adversarial Examples [24.585379549997743]
We propose a novel method to detect adversarial examples by analyzing the layer outputs of Deep Neural Networks.
Our method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
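A minimal sketch of the general idea (not the paper's exact detector): collect per-layer activation statistics with forward hooks and flag inputs whose statistics deviate from a clean reference. The model, the chosen layers, and the z-score rule are illustrative assumptions.

```python
# Hedged sketch: layer-output based anomaly flagging, not the paper's method.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # torchvision >= 0.13 API
feats = {}

def hook(name):
    def fn(module, inputs, output):
        feats[name] = output.detach().flatten(1).mean(dim=1)  # per-sample mean activation
    return fn

for name in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(model, name).register_forward_hook(hook(name))

@torch.no_grad()
def layer_stats(x):
    model(x)
    return torch.stack([feats[n] for n in ["layer1", "layer2", "layer3", "layer4"]], dim=1)

# Reference statistics from (stand-in) clean data, then a simple z-score test.
clean = torch.randn(16, 3, 224, 224)
ref = layer_stats(clean)
mu, sigma = ref.mean(0), ref.std(0) + 1e-6

def is_suspicious(x, thresh=3.0):
    z = (layer_stats(x) - mu).abs() / sigma
    return z.max(dim=1).values > thresh  # True if any layer deviates strongly

print(is_suspicious(torch.randn(4, 3, 224, 224)))
```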
arXiv Detail & Related papers (2024-10-22T21:42:59Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard framework in adversarial robustness is to defend against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
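As a hedged stand-in for the optimal-transport formulation (the paper frames it via metric learning; the exact objective is not reproduced here), the sketch below adds a sliced-Wasserstein term between clean and adversarial feature batches to the usual cross-entropy loss.

```python
# Hedged sketch: a sliced-Wasserstein surrogate for an OT-style regularizer,
# purely illustrative of adding such a term to the training loss.
import torch
import torch.nn.functional as F

def sliced_wasserstein(a, b, n_proj=64):
    """Approximate 1-Wasserstein distance between two feature batches."""
    proj = torch.randn(a.shape[1], n_proj, device=a.device)
    proj = proj / proj.norm(dim=0, keepdim=True)            # random unit directions
    pa = (a @ proj).sort(dim=0).values
    pb = (b @ proj).sort(dim=0).values
    return (pa - pb).abs().mean()                            # compare sorted 1-D projections

def regularized_loss(model, x_clean, x_adv, y, lam=0.1):
    feats_c, feats_a = model(x_clean), model(x_adv)          # here: logits used as features
    return F.cross_entropy(feats_a, y) + lam * sliced_wasserstein(feats_c, feats_a)

# Toy usage with a stand-in classifier and random data.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(8, 3, 32, 32)
x_adv = (x + 0.03 * torch.randn_like(x)).clamp(0, 1)
y = torch.randint(0, 10, (8,))
print(regularized_loss(model, x, x_adv, y))
```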
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- TREATED:Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning [95.60856995067083]
This work is among the first to perform adversarial defense for ASV without knowing the specific attack algorithms.
We propose to perform adversarial defense from two perspectives: 1) adversarial perturbation purification and 2) adversarial perturbation detection.
Experimental results show that our detection module effectively shields the ASV by detecting adversarial samples with an accuracy of around 80%.
arXiv Detail & Related papers (2021-06-01T07:10:54Z)
- Learning Defense Transformers for Counterattacking Adversarial Examples [43.59730044883175]
Deep neural networks (DNNs) are vulnerable to adversarial examples with small perturbations.
Existing defense methods focus on some specific types of adversarial examples and may fail to defend well in real-world applications.
We study adversarial examples from a new perspective: whether we can defend against them by pulling them back to the original clean distribution.
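A minimal sketch of the "pull back to the clean distribution" idea, assuming a simple convolutional purifier trained with a reconstruction loss; this is not the defense transformer proposed in the paper.

```python
# Hedged sketch: a small purifier network trained to map perturbed images back
# to their clean counterparts before classification. Architecture and loss are
# illustrative assumptions.
import torch
import torch.nn as nn

purifier = nn.Sequential(          # lightweight image-to-image network
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(purifier.parameters(), lr=1e-3)

def purify_step(clean, adv):
    """One training step: reconstruct clean images from their perturbed copies."""
    opt.zero_grad()
    loss = nn.functional.mse_loss(purifier(adv), clean)
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random tensors standing in for (clean, adversarial) pairs.
clean = torch.rand(8, 3, 32, 32)
adv = (clean + 0.03 * torch.randn_like(clean)).clamp(0, 1)
print(purify_step(clean, adv))
# At test time, the classifier would be applied to purifier(x) instead of x.
```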
arXiv Detail & Related papers (2021-03-13T02:03:53Z)
- Advocating for Multiple Defense Strategies against Adversarial Examples [66.90877224665168]
It has been empirically observed that defense mechanisms designed to protect neural networks against $\ell_\infty$ adversarial examples offer poor performance.
In this paper we conduct a geometrical analysis that validates this observation.
Then, we provide a number of empirical insights to illustrate the effect of this phenomenon in practice.
arXiv Detail & Related papers (2020-12-04T14:42:46Z)
- TEAM: We Need More Powerful Adversarial Examples for DNNs [6.7943676146532885]
Adversarial examples can lead to misclassification of deep neural networks (DNNs).
We propose a novel method to generate more powerful adversarial examples than previous methods.
Our method can reliably produce adversarial examples with a 100% attack success rate (ASR) while requiring only smaller perturbations.
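For context, the sketch below is a standard PGD attack under an $\ell_\infty$ budget rather than TEAM itself; it illustrates how adversarial examples are typically generated under a small perturbation budget.

```python
# Hedged sketch: standard PGD under an l_inf constraint, not TEAM's method.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Iterative l_inf-bounded attack maximizing the cross-entropy loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()          # gradient ascent step
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)     # project onto l_inf ball
        x_adv = x_adv.clamp(0, 1)                             # keep valid pixel range
    return x_adv.detach()

# Toy usage with a stand-in classifier.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = pgd_attack(model, x, y)
print((x_adv - x).abs().max())   # should not exceed eps
```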
arXiv Detail & Related papers (2020-07-31T04:11:02Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.