Deep Repulsive Prototypes for Adversarial Robustness
- URL: http://arxiv.org/abs/2105.12427v1
- Date: Wed, 26 May 2021 09:30:28 GMT
- Title: Deep Repulsive Prototypes for Adversarial Robustness
- Authors: Alex Serban, Erik Poll and Joost Visser
- Abstract summary: We propose to train models on output spaces with large class separation in order to gain robustness without adversarial training.
We introduce a method to partition the output space into class prototypes with large separation and train models to preserve it.
Experimental results show that models trained with these prototypes gain robustness competitive with adversarial training.
- Score: 3.351714665243138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While many defences against adversarial examples have been proposed, finding
robust machine learning models is still an open problem. The most compelling
defence to date is adversarial training, which consists of augmenting the
training data set with adversarial examples. Yet adversarial training severely
impacts training time and depends on finding representative adversarial
samples. In this paper we propose to train models on output spaces with large
class separation in order to gain robustness without adversarial training. We
introduce a method to partition the output space into class prototypes with
large separation and train models to preserve it. Experimental results show
that models trained with these prototypes -- which we call deep repulsive
prototypes -- gain robustness competitive with adversarial training, while also
preserving more accuracy on natural samples. Moreover, the models are more
resilient to large perturbation sizes. For example, we obtained over 50%
robustness for CIFAR-10, with 92% accuracy on natural samples and over 20%
robustness for CIFAR-100, with 71% accuracy on natural samples without
adversarial training. For both data sets, the models preserved robustness
against large perturbations better than adversarially trained models.
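The abstract describes the method only at a high level: fix well-separated prototype vectors in the output space, then train the network to map each input close to its class prototype. A minimal sketch of that idea in PyTorch follows; the repulsion objective, the cosine-based losses, and the function names are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn.functional as F

def make_repulsive_prototypes(num_classes: int, dim: int,
                              steps: int = 1000, lr: float = 0.1) -> torch.Tensor:
    """Spread num_classes unit vectors in R^dim by pushing the closest pair apart."""
    protos = torch.randn(num_classes, dim, requires_grad=True)
    opt = torch.optim.SGD([protos], lr=lr)
    eye = torch.eye(num_classes, dtype=torch.bool)
    for _ in range(steps):
        opt.zero_grad()
        p = F.normalize(protos, dim=1)            # keep prototypes on the unit sphere
        sim = p @ p.t()                           # pairwise cosine similarities
        loss = sim.masked_fill(eye, -1.0).max()   # most similar (closest) pair
        loss.backward()                           # repulsion: push that pair apart
        opt.step()
    return F.normalize(protos, dim=1).detach()

def prototype_loss(outputs: torch.Tensor, labels: torch.Tensor,
                   prototypes: torch.Tensor) -> torch.Tensor:
    """Pull each (normalized) model output toward the prototype of its true class."""
    out = F.normalize(outputs, dim=1)
    return (1.0 - (out * prototypes[labels]).sum(dim=1)).mean()  # 1 - cosine sim
```

At test time, the predicted class would be the one whose prototype has the highest cosine similarity with the model's output; the intended source of robustness is the large margin between prototypes rather than adversarial examples in the training set.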
Related papers
- Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
- Reducing Adversarial Training Cost with Gradient Approximation [0.3916094706589679]
We propose a new and efficient adversarial training method, adversarial training with gradient approximation (GAAT), to reduce the cost of building robust models.
Our proposed method saves up to 60% of the training time with comparable model test accuracy across datasets.
arXiv Detail & Related papers (2023-09-18T03:55:41Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Multi-stage Optimization based Adversarial Training [16.295921205749934]
We propose a Multi-stage Optimization based Adversarial Training (MOAT) method that periodically trains the model on mixed benign and adversarial examples in stages.
Under a similar amount of training overhead, the proposed MOAT exhibits better robustness than either single-step or multi-step adversarial training methods.
arXiv Detail & Related papers (2021-06-26T07:59:52Z)
- Adversarial Feature Stacking for Accurate and Robust Predictions [4.208059346198116]
The Adversarial Feature Stacking (AFS) model can jointly take advantage of features with varied levels of robustness and accuracy.
We evaluate the AFS model on CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods.
arXiv Detail & Related papers (2021-03-24T12:01:24Z)
- Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy; a minimal soft-voting sketch appears after this list.
arXiv Detail & Related papers (2020-11-28T00:08:45Z)
- To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data.
We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when performing adversarial defense.
arXiv Detail & Related papers (2020-10-13T02:21:54Z)
- Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning [134.15174177472807]
We introduce adversarial training into self-supervision to provide general-purpose robust pre-trained models for the first time.
We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins.
arXiv Detail & Related papers (2020-03-28T18:28:33Z)
- Efficient Adversarial Training with Transferable Adversarial Examples [58.62766224452761]
We show that there is high transferability between models from neighboring epochs in the same training process.
We propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA), that can enhance the robustness of trained models.
arXiv Detail & Related papers (2019-12-27T03:05:05Z)
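For the voting-based ensemble entry above, a minimal soft-voting wrapper could look like the sketch below; the class name and the probability-averaging scheme are assumptions for illustration, since the entry only states that several pre-trained defense models are ensembled.

```python
import torch

class SoftVotingEnsemble(torch.nn.Module):
    """Hypothetical wrapper: average class probabilities over several models."""
    def __init__(self, models):
        super().__init__()
        self.models = torch.nn.ModuleList(models)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = [m(x).softmax(dim=1) for m in self.models]  # per-model probabilities
        return torch.stack(probs, dim=0).mean(dim=0)        # soft vote: average
```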