Towards Out-of-Distribution Adversarial Robustness
- URL: http://arxiv.org/abs/2210.03150v4
- Date: Mon, 26 Jun 2023 08:18:24 GMT
- Title: Towards Out-of-Distribution Adversarial Robustness
- Authors: Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish,
David Krueger, Pouya Bashivan
- Abstract summary: We show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach.
We treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks.
Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.
- Score: 18.019850207961465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial robustness continues to be a major challenge for deep learning. A
core issue is that robustness to one type of attack often fails to transfer to
other attacks. While prior work establishes a theoretical trade-off in
robustness against different $L_p$ norms, we show that there is potential for
improvement against many commonly used attacks by adopting a domain
generalisation approach. Concretely, we treat each type of attack as a domain,
and apply the Risk Extrapolation method (REx), which promotes similar levels of
robustness against all training attacks. Compared to existing methods, we
obtain similar or superior worst-case adversarial robustness on attacks seen
during training. Moreover, we achieve superior performance on families or
tunings of attacks only encountered at test time. On ensembles of attacks, our
approach improves the accuracy from 3.4% with the best existing baseline to
25.9% on MNIST, and from 16.9% to 23.5% on CIFAR10.
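The abstract's core idea, treating each attack as a domain and encouraging similar risk across them, can be illustrated with a small sketch. It assumes the variance-penalty form of REx (a sum of per-domain risks plus a weighted variance of those risks); the abstract does not say which REx variant or hyperparameters the authors use, so the function names, the `beta` value, and the choice of population variance are all illustrative assumptions. The second function shows one common reading of accuracy on an "ensemble of attacks": a sample counts as correct only if it is classified correctly under every attack.

```python
from statistics import pvariance


def vrex_objective(per_attack_risks, beta=10.0):
    """Sum of per-attack risks plus beta times their variance.

    per_attack_risks: one scalar loss per training attack (one per "domain").
    beta: hypothetical penalty weight; larger values push the risks
          towards being equal across attacks.
    """
    risks = list(per_attack_risks)
    if not risks:
        raise ValueError("need at least one per-attack risk")
    # Population variance is an assumption; an unbiased estimate would also work.
    return sum(risks) + beta * pvariance(risks)


def ensemble_accuracy(correct_per_attack):
    """Fraction of samples classified correctly under every attack.

    correct_per_attack: list of rows, one row per attack, each row a list
    of booleans (True if the sample survived that attack).
    """
    n_samples = len(correct_per_attack[0])
    robust = [all(row[i] for row in correct_per_attack) for i in range(n_samples)]
    return sum(robust) / n_samples


if __name__ == "__main__":
    # Equal risks incur no variance penalty; unequal risks are penalised.
    print(vrex_objective([0.5, 0.5, 0.5]))
    print(vrex_objective([0.0, 1.0], beta=2.0))
    print(ensemble_accuracy([[True, True, False, True],
                             [True, False, False, True]]))
```

In a real training loop each risk would be the adversarial loss of the model under one attack on the current batch, and the objective would be minimised by gradient descent; that part is omitted here.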
Related papers
- Adapting to Evolving Adversaries with Regularized Continual Robust Training [47.93633573641843]
We present theoretical results which show that the gap in a model's robustness against different attacks is bounded by how far each attack perturbs a sample in the model's logit space.
Our findings and open-source code lay the groundwork for the deployment of models robust to evolving attacks.
arXiv Detail & Related papers (2025-02-06T17:38:41Z) - Closing the Gap: Achieving Better Accuracy-Robustness Tradeoffs against Query-Based Attacks [1.54994260281059]
We show how to efficiently establish, at test-time, a solid tradeoff between robustness and accuracy when mitigating query-based attacks.
Our approach is independent of training and supported by theory.
arXiv Detail & Related papers (2023-12-15T17:02:19Z) - MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
arXiv Detail & Related papers (2023-02-21T20:26:39Z) - Improving Adversarial Robustness with Self-Paced Hard-Class Pair Reweighting [5.084323778393556]
Adversarial training with untargeted attacks is one of the most recognized methods.
We find that the naturally imbalanced inter-class semantic similarity makes those hard-class pairs become virtual targets of each other.
We propose to upweight hard-class pair loss in model optimization, which prompts learning discriminative features from hard classes.
arXiv Detail & Related papers (2022-10-26T22:51:36Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Regional Adversarial Training for Better Robust Generalization [35.42873777434504]
We introduce a new adversarial training framework that considers the diversity as well as characteristics of the perturbed points in the vicinity of benign samples.
Regional adversarial training (RAT) consistently and significantly improves on standard adversarial training (SAT), and exhibits better robust generalization.
arXiv Detail & Related papers (2021-09-02T02:48:02Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Analysis and Applications of Class-wise Robustness in Adversarial Training [92.08430396614273]
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
Previous works mainly focus on the overall robustness of the model, and the in-depth analysis on the role of each class involved in adversarial training is still missing.
We provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet.
We observe that the stronger attack methods in adversarial learning achieve performance improvement mainly from a more successful attack on the vulnerable classes.
arXiv Detail & Related papers (2021-05-29T07:28:35Z) - Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training [0.0]
Adversarial training (AT) has been shown to be effective at producing a model that is robust to the attack used during training.
We propose a simple modification to AT that mitigates its poor generalization to attacks not used during training.
We show that our attack is faster than other attack schemes that are designed for unseen attack generalization.
arXiv Detail & Related papers (2021-03-29T07:23:46Z) - Automated Discovery of Adaptive Attacks on Adversarial Defenses [14.633898825111826]
We present a framework that automatically discovers an effective attack on a given model with an unknown defense.
We show it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses.
arXiv Detail & Related papers (2021-02-23T18:43:24Z) - Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss, which finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z) - Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper, we propose two extensions of the PGD attack that overcome failures due to suboptimal step size and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)