Dual Head Adversarial Training
- URL: http://arxiv.org/abs/2104.10377v2
- Date: Thu, 22 Apr 2021 06:01:25 GMT
- Title: Dual Head Adversarial Training
- Authors: Yujing Jiang, Xingjun Ma, Sarah Monazam Erfani and James Bailey
- Abstract summary: Deep neural networks (DNNs) are known to be vulnerable to adversarial examples/attacks.
Recent studies have shown that there exists an inherent tradeoff between accuracy and robustness in adversarially-trained DNNs.
We propose a novel technique Dual Head Adversarial Training (DH-AT) to further improve the robustness of existing adversarial training methods.
- Score: 31.538325500032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are known to be vulnerable to adversarial
examples/attacks, raising concerns about their reliability in safety-critical
applications. A number of defense methods have been proposed to train robust
DNNs resistant to adversarial attacks, among which adversarial training has so
far demonstrated the most promising results. However, recent studies have shown
that there exists an inherent tradeoff between accuracy and robustness in
adversarially-trained DNNs. In this paper, we propose a novel technique, Dual
Head Adversarial Training (DH-AT), to further improve the robustness of existing
adversarial training methods. Different from existing improved variants of
adversarial training, DH-AT modifies both the architecture of the network and
the training strategy to seek more robustness. Specifically, DH-AT first
attaches a second network head (or branch) to one intermediate layer of the
network, then uses a lightweight convolutional neural network (CNN) to
aggregate the outputs of the two heads. The training strategy is also adapted
to reflect the relative importance of the two heads. We empirically show, on
multiple benchmark datasets, that DH-AT can bring notable robustness
improvements to existing adversarial training methods. Compared with TRADES,
a state-of-the-art adversarial training method, our DH-AT improves robustness
by 3.4% against PGD40 and by 2.3% against AutoAttack, while also improving the
clean accuracy by 1.8%.
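To make the architecture concrete, below is a minimal PyTorch sketch of the dual-head design described in the abstract: a second head attached to an intermediate layer, with a lightweight CNN aggregating the two heads' logits. The backbone, branch point, head shapes, and the 1D-convolutional aggregator are illustrative assumptions, not the authors' exact architecture.
```python
import torch
import torch.nn as nn

class DualHeadNet(nn.Module):
    """Hypothetical sketch of DH-AT's architecture: a second head is
    attached to an intermediate layer, and a lightweight CNN aggregates
    the two heads' outputs. Details are illustrative, not the paper's."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Shared lower layers up to the branch point.
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Head 1: the original upper layers.
        self.head1 = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes),
        )
        # Head 2: a second branch attached to the intermediate layer.
        self.head2 = nn.Sequential(
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
        )
        # Lightweight CNN aggregating the two heads' logits: treat the
        # two logit vectors as 2 input channels of a 1D convolution.
        self.aggregator = nn.Sequential(
            nn.Conv1d(2, 4, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(4, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        feat = self.stem(x)
        z1, z2 = self.head1(feat), self.head2(feat)
        stacked = torch.stack([z1, z2], dim=1)   # (B, 2, num_classes)
        z = self.aggregator(stacked).squeeze(1)  # (B, num_classes)
        return z, z1, z2

model = DualHeadNet()
out, out1, out2 = model(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 10])
```
The abstract says the training strategy is adapted to the two heads' relative importance; absent details, a weighted sum of the per-head and aggregated cross-entropy losses is a natural, clearly assumed starting point.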
Related papers
- Outlier Robust Adversarial Training [57.06824365801612]
We introduce Outlier Robust Adversarial Training (ORAT) in this work.
ORAT is based on a bi-level optimization formulation of adversarial training with a robust rank-based loss function.
We show that the learning objective of ORAT satisfies $\mathcal{H}$-consistency in binary classification, which establishes it as a proper surrogate to the adversarial 0/1 loss.
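A minimal sketch of the rank-based-loss idea under one assumed instantiation: compute per-sample adversarial losses, then average only the smaller ones so that suspected outliers with extreme losses are trimmed away. The single-step attack and the plain trimmed mean are simplifications, not ORAT's actual bi-level formulation.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Single-step attack, used only to keep the sketch short;
    the rank-based loss is agnostic to the inner maximization."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def trimmed_adversarial_loss(model, x, y, trim: int):
    """Rank-based loss: drop the `trim` largest per-sample losses,
    treating them as suspected outliers (an assumed instantiation)."""
    x_adv = fgsm(model, x, y)
    per_sample = F.cross_entropy(model(x_adv), y, reduction="none")
    kept, _ = torch.sort(per_sample)         # ascending
    return kept[: len(kept) - trim].mean()   # trimmed mean

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(16, 3, 32, 32), torch.randint(0, 10, (16,))
loss = trimmed_adversarial_loss(model, x, y, trim=2)
loss.backward()
```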
arXiv Detail & Related papers (2023-09-10T21:36:38Z)
- Enhancing Adversarial Training via Reweighting Optimization Trajectory [72.75558017802788]
A number of remedies have been proposed for robust overfitting, such as extra regularization, adversarial weight perturbation, and training with more data.
We propose a new method named Weighted Optimization Trajectories (WOT) that leverages the optimization trajectories of adversarial training over time.
Our results show that WOT integrates seamlessly with the existing adversarial training methods and consistently overcomes the robust overfitting issue.
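The summary only names the ingredient, "optimization trajectories," so the sketch below is a hedged guess at the mechanics: record parameter deltas along adversarial training, then learn non-negative mixing coefficients for them on held-out data. The sigmoid parametrization, refit loop, and toy data are all assumptions, not WOT's published procedure.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flatten_params(model):
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

def load_flat(model, flat):
    i = 0
    for p in model.parameters():
        n = p.numel()
        p.data.copy_(flat[i:i + n].view_as(p))
        i += n

# Toy model and a held-out batch (stand-in for a robust validation set).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x_val, y_val = torch.rand(32, 3, 32, 32), torch.randint(0, 10, (32,))

# 1) Record the trajectory: parameter deltas between checkpoints.
theta0 = flatten_params(model)
deltas = []
opt = torch.optim.SGD(model.parameters(), lr=0.1)
prev = theta0.clone()
for _ in range(5):  # five stand-in training steps
    opt.zero_grad()
    F.cross_entropy(model(torch.rand(32, 3, 32, 32)),
                    torch.randint(0, 10, (32,))).backward()
    opt.step()
    cur = flatten_params(model)
    deltas.append(cur - prev)
    prev = cur

# 2) Reweight the trajectory: learn coefficients alpha on held-out data.
alpha = torch.zeros(len(deltas), requires_grad=True)
alpha_opt = torch.optim.Adam([alpha], lr=0.05)
D = torch.stack(deltas)  # (T, P)
for _ in range(20):
    theta = theta0 + torch.sigmoid(alpha) @ D  # coefficients in (0, 1)
    # Functional forward pass: Linear(3072, 10) weight, then bias.
    w, b = theta[:10 * 3072].view(10, 3072), theta[10 * 3072:]
    loss = F.cross_entropy(x_val.flatten(1) @ w.t() + b, y_val)
    alpha_opt.zero_grad()
    loss.backward()
    alpha_opt.step()

load_flat(model, (theta0 + torch.sigmoid(alpha) @ D).detach())
```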
arXiv Detail & Related papers (2023-06-25T15:53:31Z)
- CAT: Collaborative Adversarial Training [80.55910008355505]
We propose a collaborative adversarial training framework to improve the robustness of neural networks.
Specifically, we use different adversarial training methods to train robust models and let the models exchange their knowledge during the training process.
CAT achieves state-of-the-art adversarial robustness on CIFAR-10 under the AutoAttack benchmark without using any additional data.
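A hedged sketch of one way this knowledge exchange could work, assuming it resembles mutual distillation: each network trains on its own adversarial examples and additionally matches its peer's softened predictions. The KL interaction term, temperature, and shared PGD attack are assumptions; the paper's exact mechanism may differ.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard PGD inner maximization (one of the possible AT methods)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def collaborative_step(m1, m2, opt1, opt2, x, y, beta=1.0, tau=2.0):
    """Each model trains on its own adversarial examples, plus a KL term
    pulling it toward its peer's predictions (the assumed interaction)."""
    a1, a2 = pgd(m1, x, y), pgd(m2, x, y)  # could use different attacks
    for model, opt, x_adv, peer in ((m1, opt1, a1, m2), (m2, opt2, a2, m1)):
        with torch.no_grad():
            peer_probs = F.softmax(peer(x_adv) / tau, dim=1)
        logits = model(x_adv)
        loss = F.cross_entropy(logits, y) + beta * F.kl_div(
            F.log_softmax(logits / tau, dim=1), peer_probs,
            reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()

m1 = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
m2 = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt1, opt2 = (torch.optim.SGD(m.parameters(), lr=0.1) for m in (m1, m2))
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
collaborative_step(m1, m2, opt1, opt2, x, y)
```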
arXiv Detail & Related papers (2023-03-27T05:37:43Z)
- Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing [5.1024659285813785]
Adversarial training has been the most successful defense against adversarial attacks.
We propose a new adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training.
Our method achieves state-of-the-art robustness against $\ell_\infty$-norm constrained attacks.
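The summary gives no formula, so the following is only a plausible reading of "instance-adaptive loss smoothing": a TRADES-style KL smoothness term whose weight varies per instance, here scaled by the model's confidence on the clean input. Both the weighting rule and the TRADES-style backbone are assumptions.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def instance_adaptive_smooth_loss(model, x, x_adv, y, base_beta=6.0):
    """Cross-entropy on clean inputs plus a per-instance-weighted KL
    smoothness term (the adaptive rule is an illustrative guess)."""
    logits_clean = model(x)
    logits_adv = model(x_adv)
    # Per-instance smoothness penalty: KL(clean || adversarial).
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1),
                  reduction="none").sum(dim=1)
    # Assumed adaptive rule: smooth confident instances more aggressively.
    with torch.no_grad():
        conf = F.softmax(logits_clean, dim=1).max(dim=1).values
    return F.cross_entropy(logits_clean, y) + base_beta * (conf * kl).mean()

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(8, 3, 32, 32)
x_adv = (x + 0.03 * torch.randn_like(x)).clamp(0, 1)  # attack stand-in
loss = instance_adaptive_smooth_loss(model, x, x_adv,
                                     torch.randint(0, 10, (8,)))
loss.backward()
```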
arXiv Detail & Related papers (2023-03-24T15:41:40Z)
- AccelAT: A Framework for Accelerating the Adversarial Training of Deep Neural Networks through Accuracy Gradient [12.118084418840152]
Adversarial training is exploited to develop a robust Deep Neural Network (DNN) model against maliciously altered data.
This paper aims at accelerating the adversarial training to enable fast development of robust DNN models against adversarial attacks.
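A hedged sketch of accuracy-gradient-driven acceleration, assuming the idea is to adapt the learning rate from the slope of the accuracy curve: raise it while accuracy climbs, decay it once progress flattens. The thresholds and factors are invented for illustration and are not AccelAT's actual schedule.
```python
import torch

class AccuracyGradientLR:
    """Adjusts the optimizer's learning rate from the recent slope of an
    accuracy curve (illustrative thresholds, not AccelAT's exact rule)."""

    def __init__(self, optimizer, up=1.1, down=0.5, flat_slope=0.002):
        self.opt, self.up, self.down = optimizer, up, down
        self.flat_slope = flat_slope
        self.history = []

    def step(self, accuracy: float):
        self.history.append(accuracy)
        if len(self.history) < 2:
            return
        slope = self.history[-1] - self.history[-2]  # accuracy gradient
        factor = self.up if slope > self.flat_slope else self.down
        for group in self.opt.param_groups:
            group["lr"] *= factor

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
sched = AccuracyGradientLR(opt)
for acc in (0.30, 0.40, 0.48, 0.485):  # accuracy measured each epoch
    sched.step(acc)
    print(opt.param_groups[0]["lr"])
```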
arXiv Detail & Related papers (2022-10-13T10:31:51Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept, the adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
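The two stated goals, higher intra-class feature similarity and larger inter-class feature variance, can be written down directly as a feature-space regularizer. The sketch below does exactly that under assumed choices (squared distance to class centroids, a hinge on centroid separation); the actual ATG construction and ATFS loss are not reproduced here.
```python
import torch
import torch.nn.functional as F

def separability_loss(features, labels, margin=1.0):
    """Pull features toward their class mean (intra-class similarity) and
    push class means apart (inter-class variance). An assumed surrogate
    for ATFS's objective, not the paper's exact loss."""
    classes = labels.unique()
    means = torch.stack([features[labels == c].mean(dim=0) for c in classes])
    # Intra-class: mean squared distance to the own-class centroid.
    intra = torch.stack([
        ((features[labels == c] - means[i]) ** 2).sum(dim=1).mean()
        for i, c in enumerate(classes)
    ]).mean()
    # Inter-class: hinge on pairwise centroid distances.
    dists = torch.cdist(means, means)
    off_diag = dists[~torch.eye(len(classes), dtype=torch.bool)]
    inter = F.relu(margin - off_diag).mean()
    return intra + inter

feats = torch.randn(16, 64, requires_grad=True)
loss = separability_loss(feats, torch.randint(0, 4, (16,)))
loss.backward()
```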
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- THAT: Two Head Adversarial Training for Improving Robustness at Scale [126.06873298511425]
We propose Two Head Adversarial Training (THAT), a two-stream adversarial learning network that is designed to handle the large-scale many-class ImageNet dataset.
The proposed method trains a network with two heads and two loss functions: one to minimize feature-space domain shift between natural and adversarial images, and one to promote high classification accuracy.
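Since the summary names both losses, a sketch is less speculative here: one loss aligns natural and adversarial features (the domain-shift head), the other classifies adversarial images. The toy backbone, head shapes, and the L2 feature-matching choice are still assumptions.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadNet(nn.Module):
    """Backbone with a classification head and a feature head
    (architecture is an illustrative stand-in)."""

    def __init__(self, num_classes=10, feat_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(),
                                      nn.Linear(3 * 32 * 32, 128), nn.ReLU())
        self.cls_head = nn.Linear(128, num_classes)
        self.feat_head = nn.Linear(128, feat_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), self.feat_head(h)

def that_loss(model, x_nat, x_adv, y, lam=1.0):
    logits_adv, feat_adv = model(x_adv)
    _, feat_nat = model(x_nat)
    # Head 1: align natural/adversarial features (domain-shift loss).
    align = F.mse_loss(feat_adv, feat_nat.detach())
    # Head 2: classify adversarial images correctly.
    return F.cross_entropy(logits_adv, y) + lam * align

model = TwoHeadNet()
x = torch.rand(8, 3, 32, 32)
x_adv = (x + 0.03 * torch.randn_like(x)).clamp(0, 1)  # attack stand-in
loss = that_loss(model, x, x_adv, torch.randint(0, 10, (8,)))
loss.backward()
```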
arXiv Detail & Related papers (2021-03-25T05:32:38Z)
- Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
- Improving the affordability of robustness training for DNNs [11.971637253035107]
We show that the initial phase of adversarial training is redundant and can be replaced with natural training, which significantly improves computational efficiency.
We show that our proposed method can reduce the training time by a factor of up to 2.5 with comparable or better model test accuracy and generalization on various strengths of adversarial attacks.
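The stated recipe is concrete enough to sketch directly: train naturally for the first epochs, then switch to adversarial training for the rest. The switch point and PGD settings below are placeholders; choosing them well is what the paper studies.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = torch.min(torch.max(x_adv.detach() + alpha * grad.sign(),
                                    x - eps), x + eps)
    return x_adv.clamp(0, 1).detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
epochs, switch_epoch = 10, 4  # placeholder schedule

for epoch in range(epochs):
    for _ in range(3):  # stand-in for the training data loader
        x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
        if epoch >= switch_epoch:  # adversarial phase after natural warm-up
            x = pgd(model, x, y)
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
```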
arXiv Detail & Related papers (2020-02-11T07:29:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.