Mitigating Accuracy-Robustness Trade-off via Balanced Multi-Teacher Adversarial Distillation
- URL: http://arxiv.org/abs/2306.16170v3
- Date: Sun, 16 Jun 2024 07:14:23 GMT
- Title: Mitigating Accuracy-Robustness Trade-off via Balanced Multi-Teacher Adversarial Distillation
- Authors: Shiji Zhao, Xizhe Wang, Xingxing Wei
- Abstract summary: Adversarial Training is a practical approach for improving the robustness of deep neural networks against adversarial attacks.
We introduce Balanced Multi-Teacher Adversarial Robustness Distillation (B-MTARD) to guide the model's Adversarial Training process.
B-MTARD outperforms the state-of-the-art methods against various adversarial attacks.
- Score: 12.39860047886679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial Training is a practical approach for improving the robustness of deep neural networks against adversarial attacks. Although it brings reliable robustness, Adversarial Training degrades performance on clean examples, so a trade-off exists between accuracy and robustness. Recently, some studies have applied knowledge distillation within Adversarial Training, achieving competitive robustness, but accuracy on clean samples remains limited. In this paper, to mitigate the accuracy-robustness trade-off, we introduce Balanced Multi-Teacher Adversarial Robustness Distillation (B-MTARD), which guides the model's Adversarial Training process with a strong clean teacher and a strong robust teacher that handle clean examples and adversarial examples, respectively. During optimization, to ensure that the different teachers present knowledge at similar scales, we design an Entropy-Based Balance algorithm that adjusts each teacher's temperature to keep the teachers' information entropy consistent. In addition, to keep the student's learning speed relatively consistent across the multiple teachers, we propose a Normalization Loss Balance algorithm that adjusts the weights of the different types of knowledge. A series of experiments on three public datasets demonstrate that B-MTARD outperforms state-of-the-art methods against various adversarial attacks.
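The Entropy-Based Balance idea described in the abstract can be sketched as follows: pick each teacher's temperature so that its softened output distribution reaches a common target entropy. This is a minimal illustration under assumed details (bisection over the temperature; all function names are hypothetical), not the paper's exact algorithm:

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def balance_temperature(logits, target_entropy, lo=0.05, hi=20.0, iters=60):
    """Bisect for a temperature whose softened distribution has the target
    entropy; entropy increases monotonically with temperature."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy(softmax(logits, mid)) < target_entropy:
            lo = mid  # distribution still too peaked: raise temperature
        else:
            hi = mid  # too flat: lower temperature
    return 0.5 * (lo + hi)
```

With two teachers, one could set the target to, say, the mean of their entropies at temperature 1 and solve for each teacher's temperature against that common value, so both present knowledge at a similar scale.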
Related papers
- Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge [17.382306203152943]
Dynamic Guidance Adversarial Distillation (DGAD) framework tackles the challenge of differential sample importance.
DGAD employs Misclassification-Aware Partitioning (MAP) to dynamically tailor the distillation focus.
Error-corrective Label Swapping (ELS) corrects misclassifications of the teacher on both clean and adversarially perturbed inputs.
arXiv Detail & Related papers (2024-09-03T05:52:37Z) - Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
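As a toy illustration of how such adversarial perturbations are crafted in general (a single fast gradient-sign step against an illustrative logistic model, not the paper's attack on ranking models; all names here are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm_perturb(x, w, b, y, eps):
    """One fast gradient-sign step against a logistic model p = sigmoid(w.x + b).
    The gradient of the log-loss with respect to the input x is (p - y) * w,
    so moving each coordinate by eps in the gradient's sign direction
    increases the loss."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]
```

Even this tiny example shows the core vulnerability: a small, bounded change to the input measurably degrades the model's confidence in the correct label.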
arXiv Detail & Related papers (2023-12-16T05:38:39Z) - Faithful Knowledge Distillation [75.59907631395849]
We focus on two crucial questions with regard to a teacher-student pair: (i) do the teacher and student disagree at points close to correctly classified dataset examples, and (ii) is the distilled student as confident as the teacher around dataset examples?
These are critical questions when considering the deployment of a smaller student network trained from a robust teacher within a safety-critical setting.
arXiv Detail & Related papers (2023-06-07T13:41:55Z) - Learning Sample Reweighting for Accuracy and Adversarial Robustness [15.591611864928659]
We propose a novel adversarial training framework that learns to reweight the loss associated with individual training samples based on a notion of class-conditioned margin.
Our approach consistently improves both clean and robust accuracy compared to related methods and state-of-the-art baselines.
arXiv Detail & Related papers (2022-10-20T18:25:11Z) - DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization [35.53270942633211]
We consider the problem of OOD generalization, where the goal is to train a model that performs well on test distributions that are different from the training distribution.
We propose a new method - DAFT - based on the intuition that adversarially robust combination of a large number of rich features should provide OOD robustness.
We evaluate DAFT on standard benchmarks in the DomainBed framework, and demonstrate that DAFT achieves significant improvements over the current state-of-the-art OOD generalization methods.
arXiv Detail & Related papers (2022-08-19T03:48:17Z) - Vanilla Feature Distillation for Improving the Accuracy-Robustness Trade-Off in Adversarial Training [37.5115141623558]
We propose a Vanilla Feature Distillation Adversarial Training (VFD-Adv) to guide adversarial training towards higher accuracy.
A key advantage of our method is that it can be universally adapted to and boost existing works.
arXiv Detail & Related papers (2022-06-05T11:57:10Z) - On the benefits of knowledge distillation for adversarial robustness [53.41196727255314]
We show that knowledge distillation can be used directly to boost the performance of state-of-the-art models in adversarial robustness.
We present Adversarial Knowledge Distillation (AKD), a new framework to improve a model's robust performance.
arXiv Detail & Related papers (2022-03-14T15:02:13Z) - Knowledge Distillation as Semiparametric Inference [44.572422527672416]
A popular approach to model compression is to train an inexpensive student model to mimic the class probabilities of a highly accurate but cumbersome teacher model.
This two-step knowledge distillation process often leads to higher accuracy than training the student directly on labeled data.
We cast knowledge distillation as a semiparametric inference problem with the optimal student model as the target, the unknown Bayes class probabilities as nuisance, and the teacher probabilities as a plug-in nuisance estimate.
arXiv Detail & Related papers (2021-04-20T03:00:45Z) - Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z) - Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free [115.81899803240758]
Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy.
This paper asks how to quickly calibrate a trained model in-situ, to examine the achievable trade-offs between its standard and robust accuracies.
Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework.
arXiv Detail & Related papers (2020-10-22T16:06:34Z) - Adversarial Robustness on In- and Out-Distribution Improves Explainability [109.68938066821246]
RATIO is a training procedure for robustness via Adversarial Training on In- and Out-distribution.
RATIO achieves state-of-the-art $l_2$-adversarial robustness on CIFAR10 and maintains better clean accuracy.
arXiv Detail & Related papers (2020-03-20T18:57:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.