RAMP: Boosting Adversarial Robustness Against Multiple $l_p$
Perturbations
- URL: http://arxiv.org/abs/2402.06827v1
- Date: Fri, 9 Feb 2024 23:29:54 GMT
- Title: RAMP: Boosting Adversarial Robustness Against Multiple $l_p$
Perturbations
- Authors: Enyi Jiang, Gagandeep Singh
- Abstract summary: We show that \textbf{RAMP} can be easily adapted for both robust fine-tuning and full adversarial training.
For training from scratch, \textbf{RAMP} achieves SOTA union accuracy of $44.6\%$ and relatively good clean accuracy of $81.2\%$ on ResNet-18 against AutoAttack.
- Score: 4.70722607350087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is considerable work on improving robustness against adversarial
attacks bounded by a single $l_p$ norm using adversarial training (AT).
However, the multiple-norm robustness (union accuracy) of AT models is still
low. We observe that simultaneously obtaining good union and clean accuracy is
hard since there are tradeoffs between robustness against multiple $l_p$
perturbations, and accuracy/robustness/efficiency. By analyzing the tradeoffs
from the lens of distribution shifts, we identify the key tradeoff pair among
$l_p$ attacks to boost efficiency and design a logit pairing loss to improve
the union accuracy. Next, we connect natural training with AT via gradient
projection, to find and incorporate useful information from natural training
into AT, which moderates the accuracy/robustness tradeoff. Combining our
contributions, we propose a framework called \textbf{RAMP}, to boost the
robustness against multiple $l_p$ perturbations. We show \textbf{RAMP} can be
easily adapted for both robust fine-tuning and full AT. For robust fine-tuning,
\textbf{RAMP} obtains a union accuracy up to $53.5\%$ on CIFAR-10, and $29.7\%$
on ImageNet. For training from scratch, \textbf{RAMP} achieves SOTA union
accuracy of $44.6\%$ and relatively good clean accuracy of $81.2\%$ on
ResNet-18 against AutoAttack on CIFAR-10.
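The gradient-projection step described in the abstract can be sketched in a few lines. The specific projection rule below (removing the component of the natural-training gradient that conflicts with the AT gradient) is an illustrative assumption about how such a connection could work, not necessarily RAMP's exact update:

```python
import numpy as np

def project_natural_gradient(g_at, g_nat):
    """Combine an adversarial-training gradient with a natural-training
    gradient. Hypothetical rule: if the two directions conflict
    (negative inner product), strip the conflicting component of
    g_nat before adding it to g_at."""
    dot = float(np.dot(g_nat, g_at))
    if dot < 0:
        # subtract the projection of g_nat onto g_at
        g_nat = g_nat - dot / float(np.dot(g_at, g_at)) * g_at
    return g_at + g_nat  # combined update direction

# Toy check with conflicting gradients
g_at = np.array([1.0, 0.0])
g_nat = np.array([-1.0, 1.0])
combined = project_natural_gradient(g_at, g_nat)
print(combined)  # -> [1. 1.]: the conflicting component is gone
```

After projection, the combined direction never opposes the AT gradient, which is one way "useful information from natural training" could be incorporated without hurting robustness.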
Related papers
- Provably Adversarially Robust Nearest Prototype Classifiers [46.576144913096705]
Nearest prototypes (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric.
Previous work could provide lower bounds on the minimal adversarial perturbation in the $\ell_p$-threat model when using the same $\ell_p$-distance for the NPCs.
In this paper we provide a complete discussion of the complexity when using $\ell_p$-distances for decision and $\ell_q$-threat models for certification.
arXiv Detail & Related papers (2022-07-14T21:22:30Z)
- RUSH: Robust Contrastive Learning via Randomized Smoothing [31.717748554905015]
In this paper, we show a surprising fact that contrastive pre-training has an interesting yet implicit connection with robustness.
We design a powerful robust algorithm against adversarial attacks, RUSH, that combines the standard contrastive pre-training and randomized smoothing.
Our work has an improvement of over 15% in robust accuracy and a slight improvement in standard accuracy, compared to the state of the art.
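The randomized-smoothing half of RUSH admits a compact sketch: classify many Gaussian-noised copies of the input and take a majority vote. The base classifier and noise level below are toy assumptions for illustration:

```python
import numpy as np

def smoothed_predict(classify, x, sigma=0.25, n_samples=100, seed=0):
    """Randomized smoothing: classify n_samples Gaussian-noised copies
    of x and return the majority-vote label. `classify` is any base
    classifier mapping an input array to an integer label."""
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n_samples):
        label = classify(x + rng.normal(0.0, sigma, size=x.shape))
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy base classifier: sign of the mean "pixel" value
classify = lambda x: int(x.mean() > 0)
x = np.full(16, 0.5)
print(smoothed_predict(classify, x))  # -> 1, stable under the noise
```

The majority vote makes the smoothed classifier's prediction insensitive to small input perturbations, which is what yields certifiable robustness guarantees.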
arXiv Detail & Related papers (2022-07-11T18:45:14Z)
- Removing Batch Normalization Boosts Adversarial Training [83.08844497295148]
Adversarial training (AT) defends deep neural networks against adversarial attacks.
A major bottleneck is the widely used batch normalization (BN), which struggles to model the different statistics of clean and adversarial training samples in AT.
Our normalizer-free robust training (NoFrost) method extends recent advances in normalizer-free networks to AT.
arXiv Detail & Related papers (2022-07-04T01:39:37Z)
- Towards Alternative Techniques for Improving Adversarial Robustness: Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations.
We focus on models trained on a spectrum of $\epsilon$ values.
We identify alternative improvements to AT that otherwise wouldn't have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Subspace Adversarial Training [24.47599337641455]
We propose a new AT method, subspace adversarial training (Sub-AT), which constrains AT to a carefully extracted subspace.
In this subspace, we also allow single-step AT with larger steps and a larger radius, which further improves the robustness performance.
Our pure single-step AT can reach over $51\%$ robust accuracy against the strong PGD-50 attack with radius $8/255$ on CIFAR-10.
arXiv Detail & Related papers (2021-11-24T02:18:37Z)
- Data Augmentation Can Improve Robustness [21.485435979018256]
Adversarial training suffers from robust overfitting, a phenomenon in which robust test accuracy starts to decrease during training.
We demonstrate that, when combined with model weight averaging, data augmentation can significantly boost robust accuracy.
In particular, against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our model reaches 60.07% robust accuracy without using any external data.
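The model weight averaging mentioned above is typically an exponential moving average (EMA) of the weights along the training trajectory; a minimal sketch, with an assumed decay value (the paper's exact schedule may differ):

```python
import numpy as np

def ema_update(avg_weights, new_weights, decay=0.995):
    """Exponential moving average of model weights, the weight-averaging
    scheme commonly paired with data augmentation to curb robust
    overfitting. Each averaged tensor drifts slowly toward the current
    training weights."""
    return [decay * a + (1.0 - decay) * w
            for a, w in zip(avg_weights, new_weights)]

# Toy run: averaged weights converge toward the (constant) training weights
avg = [np.zeros(3)]
for step in range(1000):
    avg = ema_update(avg, [np.ones(3)])
print(avg[0])  # close to [1, 1, 1] after 1000 steps
```

The averaged copy is what gets evaluated; because it smooths over the noisy late-training weights, its robust accuracy degrades far less than the raw model's.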
arXiv Detail & Related papers (2021-11-09T18:57:00Z)
- Robustifying $\ell_\infty$ Adversarial Training to the Union of Perturbation Models [120.71277007016708]
We extend the capabilities of widely popular single-attack $\ell_\infty$ AT frameworks.
Our technique, referred to as Shaped Noise Augmented Processing (SNAP), exploits a well-established byproduct of single-attack AT frameworks.
SNAP prepends a given deep net with a shaped noise augmentation layer whose distribution is learned along with network parameters using any standard single-attack AT.
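A shaped-noise augmentation layer of this kind can be sketched as follows; the Laplace noise shape and the per-dimension scale parameterization are illustrative assumptions, not necessarily the distribution SNAP learns:

```python
import numpy as np

def snap_forward(x, noise_scale, rng):
    """Prepend a shaped-noise layer to a network's input: each input
    dimension receives noise with its own scale (`noise_scale`), which
    in SNAP-style training would be learned jointly with the network
    parameters during standard single-attack AT."""
    return x + noise_scale * rng.laplace(size=x.shape)

rng = np.random.default_rng(0)
x = np.zeros(8)
scale = np.full(8, 0.1)        # per-dimension noise scale (learnable)
y = snap_forward(x, scale, rng)
print(y.shape)                 # same shape as the input
```

Because the layer sits in front of an otherwise unchanged network, any standard single-attack AT loop can train both the noise shape and the network weights together.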
arXiv Detail & Related papers (2021-05-31T05:18:42Z)
- Adversarial robustness against multiple $l_p$-threat models at the price of one and how to quickly fine-tune robust models to another threat model [79.05253587566197]
Adversarial training (AT) to achieve adversarial robustness w.r.t. single $l_p$-threat models has been discussed extensively.
In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of $l_p$-threat models.
arXiv Detail & Related papers (2021-05-26T12:20:47Z)
- Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers [61.46999584579775]
We study the expected sparsity of the steepest descent step for this effective threat model.
We propose an adaptive form of PGD which is highly effective even with a small budget of iterations.
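For context, a basic PGD update can be sketched as below (here the common $\ell_\infty$ variant; the paper's $l_1$-APGD instead takes sparse steepest-descent steps in the $l_1$ ball and adapts the step size across iterations):

```python
import numpy as np

def pgd_linf_step(x, x_orig, grad, step_size=2/255, eps=8/255):
    """One l_inf PGD step: move along the sign of the loss gradient,
    then project back into the eps-ball around the original input
    and the valid [0, 1] image box."""
    x = x + step_size * np.sign(grad)
    x = np.clip(x, x_orig - eps, x_orig + eps)  # project into eps-ball
    return np.clip(x, 0.0, 1.0)                 # stay in image range

# Toy run with a fixed gradient: the perturbation saturates at eps
x_orig = np.full(4, 0.5)
x = x_orig.copy()
grad = np.ones(4)
for _ in range(10):
    x = pgd_linf_step(x, x_orig, grad)
print(x - x_orig)  # capped at eps = 8/255 in every coordinate
```

The projection step is what keeps the attack inside the chosen threat model; swapping the ball (and the steepest-descent direction) is what distinguishes $l_1$, $l_2$, and $l_\infty$ attacks.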
arXiv Detail & Related papers (2021-03-01T18:53:32Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$).
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.