Transferring Adversarial Robustness Through Robust Representation
Matching
- URL: http://arxiv.org/abs/2202.09994v1
- Date: Mon, 21 Feb 2022 05:15:40 GMT
- Title: Transferring Adversarial Robustness Through Robust Representation
Matching
- Authors: Pratik Vaishnavi, Kevin Eykholt, Amir Rahmati
- Abstract summary: Adversarial training is one of the few known defenses able to reliably withstand such attacks against neural networks.
We propose Robust Representation Matching (RRM), a low-cost method to transfer the robustness of an adversarially trained model to a new model.
RRM is superior with respect to both model performance and adversarial training time.
- Score: 3.5934248574481717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the widespread use of machine learning, concerns over its security and
reliability have become prevalent. As such, many have developed defenses to
harden neural networks against adversarial examples, imperceptibly perturbed
inputs that are reliably misclassified. Adversarial training, in which
adversarial examples are generated and used during training, is one of the few
known defenses able to reliably withstand such attacks against neural networks.
However, adversarial training imposes a significant training overhead and
scales poorly with model complexity and input dimension. In this paper, we
propose Robust Representation Matching (RRM), a low-cost method to transfer the
robustness of an adversarially trained model to a new model being trained for
the same task irrespective of architectural differences. Inspired by
student-teacher learning, our method introduces a novel training loss that
encourages the student to learn the teacher's robust representations. Compared
to prior works, RRM is superior with respect to both model performance and
adversarial training time. On CIFAR-10, RRM trains a robust model $\sim
1.8\times$ faster than the state-of-the-art. Furthermore, RRM remains effective
on higher-dimensional datasets. On Restricted-ImageNet, RRM trains a ResNet50
model $\sim 18\times$ faster than standard adversarial training.
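Since the method is described above only in prose, a minimal PyTorch-style sketch of the student-teacher idea may help. It assumes models that expose their penultimate features via a hypothetical `return_features=True` flag and a weighting coefficient `lam`, neither of which comes from the paper; the representation layer, distance, and weighting actually used by RRM may differ.

```python
import torch
import torch.nn.functional as F

def rrm_style_loss(student, teacher, x, y, lam=1.0):
    """Student-teacher loss in the spirit of Robust Representation Matching.

    Standard cross-entropy on the task plus a term that pulls the student's
    penultimate representation toward that of a frozen, adversarially trained
    teacher. Both models are assumed to accept `return_features=True` and
    return (logits, penultimate_features); this interface and the MSE distance
    are illustrative assumptions.
    """
    student_logits, student_feats = student(x, return_features=True)
    with torch.no_grad():  # the robust teacher is frozen
        _, teacher_feats = teacher(x, return_features=True)
    task_loss = F.cross_entropy(student_logits, y)
    match_loss = F.mse_loss(student_feats, teacher_feats)
    return task_loss + lam * match_loss
```

If the matching term only requires clean inputs and a frozen robust teacher, each student update costs roughly as much as a standard (non-adversarial) training step, which would be consistent with the reported speedups.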
Related papers
- Pruning Adversarially Robust Neural Networks without Adversarial
Examples [27.952904247130263]
We propose a novel framework to prune a robust neural network while maintaining adversarial robustness.
We leverage concurrent self-distillation and pruning to preserve knowledge from the original model and to regularize the pruned model via the Hilbert-Schmidt Information Bottleneck.
arXiv Detail & Related papers (2022-10-09T17:48:50Z)
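The Hilbert-Schmidt Information Bottleneck regularizer mentioned above builds on the HSIC dependence measure, which is concrete enough to sketch. The estimator below is the standard biased empirical HSIC with Gaussian kernels; how it is combined with pruning and self-distillation, and the kernel and bandwidth choices, are assumptions rather than the authors' exact recipe.

```python
import torch

def gaussian_kernel(x, sigma=1.0):
    # Pairwise squared distances -> RBF kernel matrix (batch, batch).
    d2 = torch.cdist(x, x).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC estimate between two batches of features."""
    n = x.size(0)
    kx, ky = gaussian_kernel(x, sigma), gaussian_kernel(y, sigma)
    h = torch.eye(n, device=x.device) - torch.ones(n, n, device=x.device) / n
    return torch.trace(kx @ h @ ky @ h) / (n - 1) ** 2
```

A self-distillation term could then, for example, maximize hsic() between the pruned model's hidden features and the original model's features, in the spirit of the HSIC bottleneck.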
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance using only the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$.
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
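The summary names a branch-orthogonality objective but does not define it. As a rough, assumed illustration, the regularizer below penalizes cosine similarity between the features produced by different branches of a multi-branch model; the actual BORT objective may be formulated quite differently.

```python
import torch
import torch.nn.functional as F

def branch_orthogonality_penalty(branch_outputs):
    """Encourage the outputs of different branches to be mutually orthogonal.

    `branch_outputs` is a list of (batch, dim) feature tensors, one per branch.
    Squared per-sample cosine similarity between every pair of branches is
    penalized; this is an illustrative stand-in, not the paper's regularizer.
    """
    feats = [F.normalize(f, dim=1) for f in branch_outputs]
    penalty = 0.0
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            penalty = penalty + (feats[i] * feats[j]).sum(dim=1).pow(2).mean()
    return penalty
```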
- Sparsity Winning Twice: Better Robust Generalization from More Efficient Training [94.92954973680914]
We introduce two alternatives for sparse adversarial training: (i) static sparsity and (ii) dynamic sparsity.
We find that both methods yield a win-win: substantially shrinking the robust generalization gap and alleviating robust overfitting.
Our approaches can be combined with existing regularizers, establishing new state-of-the-art results in adversarial training.
arXiv Detail & Related papers (2022-02-20T15:52:08Z)
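Of the two variants above, static sparsity is the easier to sketch: fix a pruning mask up front and keep it applied throughout adversarial training. The magnitude-based mask and the re-masking after every optimizer step below are assumptions for illustration; the paper's actual mask selection and its dynamic-sparsity variant are not reproduced here.

```python
import torch

def magnitude_masks(model, sparsity=0.9):
    """Build fixed (static) masks keeping the largest-magnitude weights (assumed criterion)."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # mask weight matrices / conv kernels, not biases
            k = max(1, int(p.numel() * (1 - sparsity)))
            threshold = p.abs().flatten().topk(k).values.min()
            masks[name] = (p.abs() >= threshold).float()
    return masks

def apply_masks(model, masks):
    """Re-apply the static masks so pruned connections stay at zero."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])
```

During adversarial training, apply_masks(model, masks) would be called after each parameter update so the pruned connections remain zero for the whole run.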
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
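As a hedged sketch of the "learning together" mechanism: each model is trained on adversarial examples as usual, plus a KL term that pulls its predictions toward its partner's. The attack, temperature, and weighting below are assumptions; MAT's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def mutual_kd_loss(logits_a, logits_b, y, temperature=1.0, beta=1.0):
    """One model's loss in a two-model mutual adversarial training step.

    `logits_a` and `logits_b` are the two models' outputs on (adversarial)
    inputs. Cross-entropy on the labels plus a KL term pulling model A toward
    model B's detached predictions; the caller applies the symmetric loss to
    model B. Temperature and weighting are illustrative.
    """
    ce = F.cross_entropy(logits_a, y)
    p_b = F.softmax(logits_b.detach() / temperature, dim=1)
    log_p_a = F.log_softmax(logits_a / temperature, dim=1)
    kl = F.kl_div(log_p_a, p_b, reduction="batchmean") * temperature ** 2
    return ce + beta * kl
```

In a two-model setup, the same function is called twice per batch with the roles of the two models swapped.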
- $\ell_\infty$-Robustness and Beyond: Unleashing Efficient Adversarial Training [11.241749205970253]
We show how selecting a small subset of training data provides a more principled approach towards reducing the time complexity of robust training.
Our approach speeds up adversarial training by 2-3 times, with only a small reduction in clean and robust accuracy.
arXiv Detail & Related papers (2021-12-01T09:55:01Z)
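The summary only says that robust training is restricted to a small, carefully chosen subset of the data; the selection rule is the paper's contribution and is not given here. As a purely illustrative stand-in, the snippet below picks the currently highest-loss examples, and the adversarial training step would then be run only on those indices.

```python
import torch

def select_subset(per_example_losses, fraction=0.1):
    """Illustrative subset selection: keep the highest-loss examples.

    This criterion is an assumption, not the paper's method; only the idea of
    adversarially training on a small subset comes from the summary above.
    """
    k = max(1, int(per_example_losses.numel() * fraction))
    return torch.topk(per_example_losses, k).indices
```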
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a "slow start, fast decay" learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
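The scheduling idea is concrete enough to sketch: ramp the learning rate up slowly from a small value, then decay it quickly for the remainder of fine-tuning. The shapes and constants below are assumptions, not the paper's exact schedule.

```python
def slow_start_fast_decay(step, total_steps, warmup_steps=500,
                          base_lr=0.01, min_lr=1e-5):
    """Illustrative 'slow start, fast decay' learning-rate schedule.

    Linearly ramps from min_lr to base_lr during warm-up ("slow start"), then
    decays exponentially back toward min_lr ("fast decay").
    """
    if step < warmup_steps:
        return min_lr + (base_lr - min_lr) * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + (base_lr - min_lr) * (0.01 ** progress)
```

In a fine-tuning loop, the returned value would simply be written into each optimizer parameter group's `lr` before every step.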
- Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
- To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce a severe disparity in accuracy and robustness between different groups of data.
We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when performing adversarial defense.
arXiv Detail & Related papers (2020-10-13T02:21:54Z)
- Adversarial Training with Stochastic Weight Average [4.633908654744751]
Adversarial training of deep neural networks often suffers from a serious overfitting problem.
In traditional machine learning, one way to relieve overfitting from the lack of data is to use ensemble methods.
In this paper, we propose adversarial training with stochastic weight averaging (SWA).
While performing adversarial training, we aggregate the temporal weight states along the training trajectory.
arXiv Detail & Related papers (2020-09-21T04:47:20Z)
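Aggregating the temporal weight states along the training trajectory is what stochastic weight averaging does, and PyTorch ships an AveragedModel helper for it. The sketch below assumes an `attack(model, x, y)` function (e.g. PGD), per-epoch averaging, and a starting epoch for the average; these details are assumptions rather than the paper's exact schedule.

```python
import torch
import torch.nn.functional as F
from torch.optim.swa_utils import AveragedModel, update_bn

def adversarial_training_with_swa(model, loader, optimizer, attack,
                                  epochs=20, swa_start=10):
    """Adversarial training that keeps a running average of the weights.

    `attack(model, x, y)` is assumed to return adversarial examples. Weights
    are averaged once per epoch from `swa_start` onward; the averaged model
    is what would be evaluated or deployed.
    """
    swa_model = AveragedModel(model)
    for epoch in range(epochs):
        for x, y in loader:
            x_adv = attack(model, x, y)
            loss = F.cross_entropy(model(x_adv), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        if epoch >= swa_start:
            swa_model.update_parameters(model)
    update_bn(loader, swa_model)  # recompute BatchNorm statistics for the averaged weights
    return swa_model
```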
- Fast is better than free: Revisiting adversarial training [86.11788847990783]
We show that it is possible to train empirically robust models using a much weaker and cheaper adversary.
We identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail.
arXiv Detail & Related papers (2020-01-12T20:30:22Z)
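The "much weaker and cheaper adversary" here is single-step FGSM, typically combined with a random start for the perturbation. Below is a minimal sketch of one such training step; the step sizes and the assumed [0, 1] input range are illustrative rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def fgsm_training_step(model, optimizer, x, y, eps=8/255, alpha=10/255):
    """One FGSM adversarial training step with a random start (illustrative constants).

    A random perturbation inside the eps-ball is drawn first, then a single
    signed-gradient step of size alpha is taken before the model update.
    Inputs are assumed to lie in [0, 1].
    """
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    with torch.no_grad():
        delta = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
    optimizer.zero_grad()  # discard gradients from the attack's backward pass
    adv_loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```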