Adversarial robustness against multiple $l_p$-threat models at the price
of one and how to quickly fine-tune robust models to another threat model
- URL: http://arxiv.org/abs/2105.12508v1
- Date: Wed, 26 May 2021 12:20:47 GMT
- Title: Adversarial robustness against multiple $l_p$-threat models at the price
of one and how to quickly fine-tune robust models to another threat model
- Authors: Francesco Croce, Matthias Hein
- Abstract summary: Adversarial training (AT) to achieve adversarial robustness wrt a single $l_p$-threat model has been discussed extensively.
In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of $l_p$-threat models.
- Score: 79.05253587566197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) to achieve adversarial robustness wrt a
single $l_p$-threat model has been discussed extensively. However, for
safety-critical systems adversarial robustness should be achieved wrt all
$l_p$-threat models simultaneously. In this paper we develop a simple and
efficient training scheme to achieve adversarial robustness against the union
of $l_p$-threat models. Our novel $l_1+l_\infty$-AT scheme is based on
geometric considerations of the different $l_p$-balls and costs as much as
normal adversarial training against a single $l_p$-threat model. Moreover, we
show that using our $l_1+l_\infty$-AT scheme one can fine-tune with just 3
epochs any $l_p$-robust model (for $p \in \{1,2,\infty\}$) and achieve multiple
norm adversarial robustness. In this way we boost the previous state-of-the-art
reported for multiple-norm robustness by more than $6\%$ on CIFAR-10 and report
to the best of our knowledge, the first ImageNet models with multiple norm robustness.
Moreover, we study the general transfer of adversarial robustness between
different threat models and in this way boost the previous SOTA
$l_1$-robustness on CIFAR-10 by almost $10\%$.
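
To make the scheme concrete, here is a minimal PyTorch-style sketch of the training loop the abstract describes: each batch is attacked with a single, randomly selected extreme threat model ($l_1$ or $l_\infty$) and a normal training step follows, so the per-epoch cost matches single-norm AT. The attack implementations, the budgets, and the 50/50 per-batch sampling below are illustrative assumptions, not the authors' exact code; in particular the simple top-$k$ $l_1$ attack is only a crude stand-in for the $l_1$-APGD attack the paper builds on.

```python
import random

import torch
import torch.nn.functional as F


def linf_pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard l_inf PGD attack (illustrative settings, not the paper's)."""
    delta = torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
    return torch.clamp(x + delta, 0, 1)


def l1_topk_attack(model, x, y, eps=12.0, steps=10, k=20):
    """Simplified sparse l1 attack: each step moves the k coordinates with the
    largest gradient magnitude and rescales so that ||delta||_1 <= eps.
    A crude stand-in for l1-APGD."""
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        g = grad.view(grad.size(0), -1)
        idx = g.abs().topk(k, dim=1).indices
        step = torch.zeros_like(g).scatter_(1, idx, (eps / k) * g.gather(1, idx).sign())
        flat = delta.detach().view_as(g) + step
        # rescale so the perturbation stays inside the l1 ball of radius eps
        norms = flat.abs().sum(dim=1, keepdim=True).clamp(min=1e-12)
        delta = (flat * (eps / norms).clamp(max=1.0)).view_as(x)
    return torch.clamp(x + delta, 0, 1)


def l1_linf_at_epoch(model, loader, optimizer, device="cuda"):
    """One epoch of l1 + l_inf adversarial training: every batch uses a single,
    randomly chosen threat model, so the per-epoch cost matches standard
    single-norm adversarial training."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        attack = random.choice([linf_pgd, l1_topk_attack])
        x_adv = attack(model, x, y)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```

The same loop, run for only a few epochs from an existing $l_p$-robust checkpoint, corresponds to the fine-tuning use case mentioned in the abstract.
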
Related papers
- Towards Universal Certified Robustness with Multi-Norm Training [4.188296977882316]
Existing certified training methods can only train models to be robust against a certain perturbation type.
We propose the first multi-norm certified training framework CURE, consisting of a new $l$ deterministic certified training defense.
Compared with SOTA certified training, CURE improves union robustness by up to $22.8\%$ on MNIST, $23.9\%$ on CIFAR-10, and $8.0\%$ on TinyImagenet.
arXiv Detail & Related papers (2024-10-03T21:20:46Z)
- Deep Adversarial Defense Against Multilevel-Lp Attacks [5.604868766260297]
This paper introduces a computationally efficient multilevel $\ell_p$ defense, called the Efficient Robust Mode Connectivity (EMRC) method.
Similar to analytical continuation approaches used in continuous optimization, the method blends two $p$-specific adversarially optimal models.
We present experiments demonstrating that our approach performs better across various attacks than AT-$\ell_\infty$, E-AT, and MSD.
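
As a very rough illustration of what blending two $p$-specific robust models might look like, the sketch below linearly interpolates the parameters of an $l_1$-robust and an $l_\infty$-robust network. The straight-line blend, the function name, and the `alpha` parameter are assumptions for illustration only; EMRC optimizes the connecting curve in weight space rather than fixing it.

```python
import copy

import torch


@torch.no_grad()
def blend_robust_models(model_l1, model_linf, alpha):
    """Blend an l1-robust and an l_inf-robust network by linear parameter
    interpolation.  This is only a rough stand-in for mode connectivity:
    EMRC learns the connecting path rather than assuming it is linear, and
    batch-norm buffers would typically be re-estimated after blending."""
    blended = copy.deepcopy(model_l1)
    for p_out, p_1, p_inf in zip(blended.parameters(),
                                 model_l1.parameters(),
                                 model_linf.parameters()):
        p_out.copy_((1.0 - alpha) * p_1 + alpha * p_inf)
    return blended
```

Sweeping `alpha` over $[0,1]$ and evaluating robust accuracy under both threat models would recover the usual mode-connectivity picture between the two endpoints.
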
arXiv Detail & Related papers (2024-07-12T13:30:00Z)
- Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training.
This paper gives an affirmative answer to whether better diffusion models can further improve adversarial training, by employing a more recent, more efficient diffusion model.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z)
- PFGM++: Unlocking the Potential of Physics-Inspired Generative Models [14.708385906024546]
We introduce a new family of physics-inspired generative models termed PFGM++.
These models realize generative trajectories for $N$-dimensional data by embedding paths in an $N+D$-dimensional space.
We show that models with finite $D$ can be superior to previous state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-02-08T18:58:02Z)
- Towards Alternative Techniques for Improving Adversarial Robustness: Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations.
We focus on models trained on a spectrum of $\epsilon$ values.
We identify alternative improvements to AT that otherwise wouldn't have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
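
A minimal sketch of what "training multiple models together" with distillation could look like: each model is trained on its own adversarial examples with a cross-entropy loss plus a KL term that pulls it towards its peers' soft predictions. The loss weighting `kd_weight`, the temperature, and the exact interaction used in MAT are assumptions here, not the paper's definitive formulation.

```python
import torch
import torch.nn.functional as F


def mutual_at_step(models, optimizers, x, y, attack, kd_weight=1.0, temp=2.0):
    """One mutual-adversarial-training step for a list of models: every model
    sees its own adversarial batch and is additionally distilled towards the
    (detached) predictions of its peers."""
    adv = [attack(m, x, y) for m in models]            # per-model adversarial batches
    logits = [m(a) for m, a in zip(models, adv)]
    for i, (opt, z_i) in enumerate(zip(optimizers, logits)):
        peers = [z.detach() for j, z in enumerate(logits) if j != i]
        kd = sum(
            F.kl_div(F.log_softmax(z_i / temp, dim=1),
                     F.softmax(z_j / temp, dim=1),
                     reduction="batchmean")
            for z_j in peers
        ) / len(peers)
        loss = F.cross_entropy(z_i, y) + kd_weight * kd
        opt.zero_grad()
        loss.backward()
        opt.step()
```
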
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Exploring Sparse Expert Models and Beyond [51.90860155810848]
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost.
We propose a simple method called expert prototyping that splits experts into different prototypes and applies $k$ top-$1$ routing.
This strategy improves model quality while maintaining constant computational cost, and further exploration of extremely large-scale models suggests it is even more effective for training larger models. A sketch of this routing scheme follows.
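
The sketch below illustrates $k$ top-$1$ routing over expert prototypes, under the assumption that each prototype group has its own gate and the $k$ selected expert outputs are summed; the class name, layer sizes, and the omission of load-balancing terms are illustrative choices rather than the paper's implementation.

```python
import torch
import torch.nn as nn


class PrototypedMoE(nn.Module):
    """Expert prototyping sketch: the expert pool is split into k groups
    ("prototypes"); each group routes every token to its own top-1 expert and
    the k selected outputs are summed, so per-token compute stays at k expert
    calls regardless of the total number of experts."""

    def __init__(self, d_model, d_hidden, num_experts, k):
        super().__init__()
        assert num_experts % k == 0
        self.k = k
        self.per_group = num_experts // k
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )
        self.gates = nn.ModuleList(nn.Linear(d_model, self.per_group) for _ in range(k))

    def forward(self, x):                               # x: (tokens, d_model)
        out = torch.zeros_like(x)
        for g, gate in enumerate(self.gates):
            scores = gate(x).softmax(dim=-1)            # (tokens, per_group)
            weight, idx = scores.max(dim=-1)            # top-1 inside this prototype
            for e in range(self.per_group):
                mask = idx == e
                if mask.any():
                    expert = self.experts[g * self.per_group + e]
                    out[mask] += weight[mask, None] * expert(x[mask])
        return out
```
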
arXiv Detail & Related papers (2021-05-31T16:12:44Z)
- Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers [61.46999584579775]
We study the expected sparsity of the steepest descent step for this effective threat model (the $l_1$-ball intersected with the image domain $[0,1]^d$).
We propose an adaptive form of PGD which is highly effective even with a small budget of iterations.
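
To illustrate why this step is sparse, here is a hypothetical sketch of a single steepest-descent-style step on the (maximized) loss for the effective threat model: the $l_1$ budget is placed on the few coordinates with the largest gradient magnitude that can still move inside the $[0,1]^d$ box. The fixed sparsity level `k` and the equal split of the budget are simplifying assumptions; $l_1$-APGD adapts both during the attack.

```python
import torch


def l1_box_step(x, grad, eps, k):
    """Sparse step for the l1 ball intersected with the [0,1]^d box.
    `grad` is the gradient of the loss to be maximized: rank coordinates by
    gradient magnitude, discard those already pinned at 0 or 1 in the sign
    direction, and spread the l1 budget eps over the first k movable ones."""
    flat_x = x.view(x.size(0), -1)
    flat_g = grad.view(grad.size(0), -1)
    # a coordinate is movable if stepping along sign(grad) stays inside [0, 1]
    movable = ((flat_g > 0) & (flat_x < 1)) | ((flat_g < 0) & (flat_x > 0))
    ranked = (flat_g.abs() * movable).argsort(dim=1, descending=True)
    chosen = ranked[:, :k]
    step = torch.zeros_like(flat_x).scatter_(
        1, chosen, (eps / k) * flat_g.gather(1, chosen).sign())
    return (flat_x + step).clamp(0, 1).view_as(x)
```
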
arXiv Detail & Related papers (2021-03-01T18:53:32Z)
- Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)