Perceptual Adversarial Robustness: Defense Against Unseen Threat Models
- URL: http://arxiv.org/abs/2006.12655v4
- Date: Sun, 4 Jul 2021 19:34:05 GMT
- Title: Perceptual Adversarial Robustness: Defense Against Unseen Threat Models
- Authors: Cassidy Laidlaw and Sahil Singla and Soheil Feizi
- Abstract summary: A key challenge in adversarial robustness is the lack of a precise mathematical characterization of human perception.
Under the neural perceptual threat model, we develop novel perceptual adversarial attacks and defenses.
Because the NPTM is very broad, we find that Perceptual Adversarial Training (PAT) against a perceptual attack gives robustness against many other types of adversarial attacks.
- Score: 58.47179090632039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key challenge in adversarial robustness is the lack of a precise
mathematical characterization of human perception, used in the very definition
of adversarial attacks that are imperceptible to human eyes. Most current
attacks and defenses try to avoid this issue by considering restrictive
adversarial threat models such as those bounded by $L_2$ or $L_\infty$
distance, spatial perturbations, etc. However, models that are robust against
any of these restrictive threat models are still fragile against other threat
models. To resolve this issue, we propose adversarial training against the set
of all imperceptible adversarial examples, approximated using deep neural
networks. We call this threat model the neural perceptual threat model (NPTM);
it includes adversarial examples with a bounded neural perceptual distance (a
neural network-based approximation of the true perceptual distance) to natural
images. Through an extensive perceptual study, we show that the neural
perceptual distance correlates well with human judgements of perceptibility of
adversarial examples, validating our threat model.
Under the NPTM, we develop novel perceptual adversarial attacks and defenses.
Because the NPTM is very broad, we find that Perceptual Adversarial Training
(PAT) against a perceptual attack gives robustness against many other types of
adversarial attacks. We test PAT on CIFAR-10 and ImageNet-100 against five
diverse adversarial attacks. We find that PAT achieves state-of-the-art
robustness against the union of these five attacks, more than doubling the
accuracy over the next best model, without training against any of them. That
is, PAT generalizes well to unforeseen perturbation types. This is vital in
sensitive applications where a particular threat model cannot be assumed, and
to the best of our knowledge, PAT is the first adversarial training defense
with this property.
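The abstract describes the approach only in prose, so the following is a minimal sketch of the idea: an off-the-shelf LPIPS distance stands in for the neural perceptual distance, a Lagrangian-penalty attack searches inside the perceptual ball, and a training step fits the model on the resulting perceptual adversarial examples. The `lpips` package, the bound `eps`, the penalty weight `lam`, the sign-gradient update, and all hyperparameters are illustrative assumptions, not the authors' exact attack or training recipe.

```python
# Illustrative sketch only: a PAT-style training step built around a neural
# perceptual (LPIPS) distance. Hyperparameters, the sign-gradient update, and
# the ReLU penalty are assumptions for clarity, not the paper's exact method.
import torch
import torch.nn.functional as F
import lpips  # pip install lpips; AlexNet-based perceptual distance

perceptual_dist = lpips.LPIPS(net='alex')  # neural approximation of perceptual distance


def lagrangian_perceptual_attack(model, x, y, eps=0.5, lam=10.0,
                                 steps=10, step_size=0.02):
    """Search for x_adv that raises the classification loss while keeping
    the neural perceptual distance d(x, x_adv) roughly below eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        cls_loss = F.cross_entropy(model(x_adv), y)
        dist = perceptual_dist(x, x_adv, normalize=True).mean()  # inputs in [0, 1]
        # Lagrangian relaxation of the perceptual-ball constraint.
        objective = cls_loss - lam * F.relu(dist - eps)
        grad, = torch.autograd.grad(objective, x_adv)
        x_adv = (x_adv + step_size * grad.sign()).clamp(0, 1)
        x_adv = x_adv.detach().requires_grad_(True)
    return x_adv.detach()


def pat_training_step(model, optimizer, x, y):
    """One Perceptual Adversarial Training step: fit the model on
    perceptual adversarial examples instead of clean inputs."""
    model.eval()
    x_adv = lagrangian_perceptual_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The paper evaluates such training on CIFAR-10 and ImageNet-100 against several unseen attacks; the sketch above is only meant to make the bounded-perceptual-distance constraint concrete.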
Related papers
- Towards Unified Robustness Against Both Backdoor and Adversarial Attacks [31.846262387360767]
Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks.
This paper reveals that there is an intriguing connection between backdoor and adversarial attacks.
A novel Progressive Unified Defense algorithm is proposed to defend against backdoor and adversarial attacks simultaneously.
arXiv Detail & Related papers (2024-05-28T07:50:00Z) - F$^2$AT: Feature-Focusing Adversarial Training via Disentanglement of
Natural and Perturbed Patterns [74.03108122774098]
Deep neural networks (DNNs) are vulnerable to adversarial examples crafted by well-designed perturbations.
This could lead to disastrous results on critical applications such as self-driving cars, surveillance security, and medical diagnosis.
We propose Feature-Focusing Adversarial Training (F$^2$AT), which forces the model to focus on the core features from natural patterns.
arXiv Detail & Related papers (2023-10-23T04:31:42Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework for defending against adversarial samples crafted by minimally perturbing natural samples.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - TnT Attacks! Universal Naturalistic Adversarial Patches Against Deep
Neural Network Systems [15.982408142401072]
Deep neural networks are vulnerable to attacks from adversarial inputs and, more recently, Trojans to misguide or hijack the decision of the model.
A TnT is universal because any input image captured with a TnT in the scene will either: i) misguide a network (untargeted attack); or ii) force the network to make a malicious decision (targeted attack).
We show a generalization of the attack to create patches achieving higher attack success rates than existing state-of-the-art methods.
arXiv Detail & Related papers (2021-11-19T01:35:10Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
The method is trained to automatically align features across attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Towards Adversarial Patch Analysis and Certified Defense against Crowd
Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
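The Adversarial Feature Desensitization entry above frames robustness as learning features that adversarial perturbations cannot shift, borrowing from domain adaptation. As a rough sketch of that idea (not the cited authors' exact recipe), one can train a small discriminator to separate clean from adversarial features while the encoder learns to fool it; the FGSM attack, the discriminator head, and the weight `beta` below are assumptions for illustration.

```python
# Illustrative sketch of a domain-adaptation-style feature desensitization
# step: a discriminator separates clean vs. adversarial features, and the
# encoder is trained to fool it. The FGSM attack and loss weight `beta`
# are assumptions, not the cited paper's exact training procedure.
import torch
import torch.nn.functional as F


def fgsm(model_fn, x, y, eps=8 / 255):
    """One-step attack used here only to produce adversarial training inputs."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model_fn(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()


def desensitization_step(encoder, classifier, discriminator,
                         opt_main, opt_disc, x, y, beta=1.0):
    """opt_main is assumed to hold encoder+classifier parameters,
    opt_disc the discriminator's."""
    device = x.device
    x_adv = fgsm(lambda z: classifier(encoder(z)), x, y)
    f_clean, f_adv = encoder(x), encoder(x_adv)

    # 1) Discriminator learns to tell clean (label 0) from adversarial (label 1) features.
    d_logits = discriminator(torch.cat([f_clean.detach(), f_adv.detach()]))
    d_labels = torch.cat([torch.zeros(len(x)), torch.ones(len(x))]).long().to(device)
    d_loss = F.cross_entropy(d_logits, d_labels)
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # 2) Encoder + classifier: classify adversarial inputs correctly while
    #    making adversarial features look "clean" to the discriminator.
    task_loss = F.cross_entropy(classifier(f_adv), y)
    fool_loss = F.cross_entropy(discriminator(f_adv),
                                torch.zeros(len(x), dtype=torch.long, device=device))
    opt_main.zero_grad()
    (task_loss + beta * fool_loss).backward()
    opt_main.step()
    return task_loss.item(), d_loss.item()
```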