Soft Adversarial Training Can Retain Natural Accuracy
- URL: http://arxiv.org/abs/2206.01904v1
- Date: Sat, 4 Jun 2022 04:13:25 GMT
- Title: Soft Adversarial Training Can Retain Natural Accuracy
- Authors: Abhijith Sharma and Apurva Narayan
- Abstract summary: We propose a training framework that can retain natural accuracy without sacrificing robustness in a constrained setting.
Our framework specifically targets moderately critical applications which require a reasonable balance between robustness and accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial training for neural networks has been in the limelight in recent years. Advances in neural network architectures over the last decade have led to significant improvements in performance, which in turn sparked interest in deploying these models for real-time applications. This created the need to understand their vulnerability to adversarial attacks, an understanding that is instrumental in designing models robust against adversaries. Recent works have proposed novel techniques to counter adversaries, most often at the cost of natural accuracy: most suggest training with an adversarial version of the inputs, constantly moving away from the original distribution. The focus of our work is to use abstract certification to extract a subset of inputs for adversarial training (hence we call it 'soft' adversarial training). We propose a training framework that can retain natural accuracy without sacrificing robustness in a constrained setting. Our framework specifically targets moderately critical applications that require a reasonable balance between robustness and accuracy. The results support the idea of soft adversarial training as a defense against adversarial attacks. Finally, we outline directions for future work to further improve this framework.
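To make the selection idea concrete, here is a minimal sketch of one training step, under assumptions of ours rather than the paper's: `certified_robust` is a hypothetical stand-in for the abstract certification step (a crude random-sampling proxy here, where a real certifier would compute sound bounds, e.g. via interval bound propagation), and one-step FGSM stands in for whichever attack the framework actually uses. Only inputs that fail the check are replaced by adversarial versions; certified inputs stay natural, so the training distribution drifts less from the original data than in standard adversarial training.

```python
# Illustrative sketch only; not the authors' implementation.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM perturbation (an illustrative choice of attack).
    Assumes inputs are scaled to [0, 1]."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

@torch.no_grad()
def certified_robust(model, x, eps, n_samples=8):
    """Crude proxy for abstract certification (NOT the paper's method):
    an input counts as 'robust' here if its prediction survives a few
    random eps-ball perturbations. A sound certifier would compute
    guaranteed bounds instead."""
    base = model(x).argmax(dim=1)
    robust = torch.ones(x.size(0), dtype=torch.bool, device=x.device)
    for _ in range(n_samples):
        noise = torch.empty_like(x).uniform_(-eps, eps)
        robust &= model((x + noise).clamp(0.0, 1.0)).argmax(dim=1).eq(base)
    return robust

def soft_adv_step(model, optimizer, x, y, eps=8 / 255):
    # Adversarially perturb only the subset that fails certification;
    # inputs already deemed robust are trained on in their natural form.
    mask = certified_robust(model, x, eps)      # True = keep natural
    x_train = x.clone()
    if (~mask).any():
        x_train[~mask] = fgsm(model, x[~mask], y[~mask], eps)
    optimizer.zero_grad()                        # clear grads left by the attack
    loss = F.cross_entropy(model(x_train), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```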
Related papers
- Enhancing Adversarial Training via Reweighting Optimization Trajectory [72.75558017802788]
A number of remedies have been proposed, such as extra regularization, adversarial weight perturbation, and training with more data.
We propose a new method named Weighted Optimization Trajectories (WOT) that leverages the optimization trajectories of adversarial training over time.
Our results show that WOT integrates seamlessly with the existing adversarial training methods and consistently overcomes the robust overfitting issue.
arXiv Detail & Related papers (2023-06-25T15:53:31Z)
- Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing [5.1024659285813785]
Adversarial training has been the most successful defense against adversarial attacks.
We propose a new adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training.
Our method achieves state-of-the-art robustness against $\ell_\infty$-norm constrained attacks.
arXiv Detail & Related papers (2023-03-24T15:41:40Z)
- Pruning Adversarially Robust Neural Networks without Adversarial Examples [27.952904247130263]
We propose a novel framework to prune a robust neural network while maintaining adversarial robustness.
We leverage concurrent self-distillation and pruning to preserve knowledge in the original model as well as regularizing the pruned model via the Hilbert-Schmidt Information Bottleneck.
arXiv Detail & Related papers (2022-10-09T17:48:50Z)
- Adversarial Coreset Selection for Efficient Robust Training [11.510009152620666]
We show how selecting a small subset of training data provides a principled approach to reducing the time complexity of robust training.
We conduct extensive experiments to demonstrate that our approach speeds up adversarial training by 2-3 times.
arXiv Detail & Related papers (2022-09-13T07:37:53Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, adversarial training (AT) has proven to be an effective approach for improving model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically generate and align features of arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
- Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z)
- Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant to adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
- Class-Aware Domain Adaptation for Improving Adversarial Robustness [27.24720754239852]
Adversarial training has been proposed to train robust networks by injecting adversarial examples into the training data.
We propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense without directly applying adversarial training.
arXiv Detail & Related papers (2020-05-10T03:45:19Z)
- Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in the higher bit planes, and to use the lower bit planes only to refine their prediction (bit-plane splitting is sketched after this list).
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)
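As a side note on the bit-plane entry above, here is a minimal sketch of how an 8-bit image splits into bit planes and how a coarse, high-bit view can be formed. This is our illustration under the assumption of standard 8-bit images; none of these helpers come from the paper's code.

```python
# Illustration only: decompose an 8-bit image into bit planes.
# Higher planes carry coarse structure; lower planes carry fine detail.
import numpy as np

def bit_planes(img_u8: np.ndarray) -> list[np.ndarray]:
    """Return planes[0..7]; planes[7] is the most significant bit."""
    return [(img_u8 >> k) & 1 for k in range(8)]

def coarse_view(img_u8: np.ndarray, keep_top: int = 4) -> np.ndarray:
    """Quantized 'coarse impression' keeping only the top `keep_top` bits."""
    mask = 0xFF & ~((1 << (8 - keep_top)) - 1)   # e.g. keep_top=4 -> 0xF0
    return img_u8 & mask

img = (np.random.rand(32, 32) * 255).astype(np.uint8)  # toy image
planes = bit_planes(img)
coarse = coarse_view(img, keep_top=4)
```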
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.