Related papers: Towards Out-of-Distribution Adversarial Robustness

URL: http://arxiv.org/abs/2210.03150v4
Date: Mon, 26 Jun 2023 08:18:24 GMT
Title: Towards Out-of-Distribution Adversarial Robustness
Authors: Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish, David Krueger, Pouya Bashivan
Abstract summary: We show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach. We treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks. Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.
Score: 18.019850207961465
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different $L_p$ norms, we show that there is potential for improvement against many commonly used attacks by adopting a domain generalisation approach. Concretely, we treat each type of attack as a domain, and apply the Risk Extrapolation method (REx), which promotes similar levels of robustness against all training attacks. Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training. Moreover, we achieve superior performance on families or tunings of attacks only encountered at test time. On ensembles of attacks, our approach improves the accuracy from 3.4% with the best existing baseline to 25.9% on MNIST, and from 16.9% to 23.5% on CIFAR10.
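The idea of treating each attack as a domain and promoting similar robustness across all of them can be illustrated with a variance-penalised objective in the style of REx. The sketch below is an assumption-laden illustration, not the authors' implementation: the function name `vrex_objective`, the attack labels, the default `beta`, and the choice of population variance are all hypothetical.

```python
from statistics import pvariance


def vrex_objective(attack_risks, beta=10.0):
    """Sum of per-attack risks plus beta times their variance.

    attack_risks maps an attack name (treated as a "domain") to the
    model's average loss under that attack. The variance term is zero
    when all risks are equal, so minimising the objective pushes the
    model towards similar robustness against every training attack.
    beta and the use of population variance are illustrative choices.
    """
    risks = list(attack_risks.values())
    if not risks:
        raise ValueError("at least one attack risk is required")
    return sum(risks) + beta * pvariance(risks)


# Hypothetical per-attack losses with the same total risk.
balanced = {"pgd_linf": 0.5, "pgd_l2": 0.5}
unbalanced = {"pgd_linf": 0.9, "pgd_l2": 0.1}

# The unbalanced profile is penalised more, despite the equal sum.
print(vrex_objective(balanced), vrex_objective(unbalanced))
```

With `beta = 0` the objective reduces to the plain sum of per-attack risks. Larger values of `beta` put more weight on equalising the risks across attacks.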

Related papers

Adapting to Evolving Adversaries with Regularized Continual Robust Training [47.93633573641843]
We present theoretical results which show that the gap in a model's robustness against different attacks is bounded by how far each attack perturbs a sample in the model's logit space. Our findings and open-source code lay the groundwork for the deployment of models robust to evolving attacks.
arXiv Detail & Related papers (2025-02-06T17:38:41Z)
Closing the Gap: Achieving Better Accuracy-Robustness Tradeoffs against Query-Based Attacks [1.54994260281059]
We show how to efficiently establish, at test-time, a solid tradeoff between robustness and accuracy when mitigating query-based attacks. Our approach is independent of training and supported by theory.
arXiv Detail & Related papers (2023-12-15T17:02:19Z)
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models. Our framework is able to model different levels of learner's knowledge about the test-time adversary. We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
arXiv Detail & Related papers (2023-02-21T20:26:39Z)
Improving Adversarial Robustness with Self-Paced Hard-Class Pair Reweighting [5.084323778393556]
Adversarial training with untargeted attacks is one of the most recognized methods. We find that the naturally imbalanced inter-class semantic similarity makes those hard-class pairs become the virtual targets of each other. We propose to upweight hard-class pair loss in model optimization, which prompts learning discriminative features from hard classes.
arXiv Detail & Related papers (2022-10-26T22:51:36Z)
Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically. Our method learns the optimizer in adversarial attacks parameterized by a recurrent neural network. We develop a model-agnostic training algorithm to improve the ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
Regional Adversarial Training for Better Robust Generalization [35.42873777434504]
We introduce a new adversarial training framework that considers the diversity as well as characteristics of the perturbed points in the vicinity of benign samples. RAT consistently makes significant improvement on standard adversarial training (SAT), and exhibits better robust generalization.
arXiv Detail & Related papers (2021-09-02T02:48:02Z)
Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications. We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths. Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
Analysis and Applications of Class-wise Robustness in Adversarial Training [92.08430396614273]
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples. Previous works mainly focus on the overall robustness of the model, and the in-depth analysis on the role of each class involved in adversarial training is still missing. We provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet. We observe that the stronger attack methods in adversarial learning achieve performance improvement mainly from a more successful attack on the vulnerable classes.
arXiv Detail & Related papers (2021-05-29T07:28:35Z)
Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training [0.0]
Adversarial training (AT) has been shown to be effective at producing a model that is robust against the attack used during training. We propose a simple modification to AT that mitigates its weaker generalization to attacks not seen during training. We show that our attack is faster than other attack schemes that are designed for unseen attack generalization.
arXiv Detail & Related papers (2021-03-29T07:23:46Z)
Automated Discovery of Adaptive Attacks on Adversarial Defenses [14.633898825111826]
We present a framework that automatically discovers an effective attack on a given model with an unknown defense. We show it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses.
arXiv Detail & Related papers (2021-02-23T18:43:24Z)
Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training. We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries. We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function. We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.