How and When Adversarial Robustness Transfers in Knowledge Distillation?
- URL: http://arxiv.org/abs/2110.12072v1
- Date: Fri, 22 Oct 2021 21:30:53 GMT
- Title: How and When Adversarial Robustness Transfers in Knowledge Distillation?
- Authors: Rulin Shao, Jinfeng Yi, Pin-Yu Chen, Cho-Jui Hsieh
- Abstract summary: This paper studies how and when adversarial robustness can be transferred from a teacher model to a student model in knowledge distillation (KD).
We show that standard KD training fails to preserve adversarial robustness, and we propose KD with input gradient alignment (KDIGA) as a remedy.
Under certain assumptions, we prove that the student model using our proposed KDIGA can achieve at least the same certified robustness as the teacher model.
- Score: 137.11016173468457
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation (KD) has been widely used in teacher-student training,
with applications to model compression in resource-constrained deep learning.
Current works mainly focus on preserving the accuracy of the teacher model.
However, other important model properties, such as adversarial robustness, can
be lost during distillation. This paper studies how and when the adversarial
robustness can be transferred from a teacher model to a student model in KD. We
show that standard KD training fails to preserve adversarial robustness, and we
propose KD with input gradient alignment (KDIGA) as a remedy. Under certain
assumptions, we prove that the student model using our proposed KDIGA can
achieve at least the same certified robustness as the teacher model. Our
KD experiments cover a diverse set of teacher and student models with varying
network architectures and sizes, evaluated on the ImageNet and CIFAR-10
datasets and including residual neural networks (ResNets) and vision
transformers (ViTs). Our comprehensive analysis reveals several novel insights:
(1) With KDIGA, students can preserve or even exceed the adversarial robustness
of the teacher model, even when the two models have fundamentally different
architectures; (2) KDIGA enables robustness to transfer to pre-trained
students, such as KD from an adversarially trained ResNet to a pre-trained ViT,
without loss of clean accuracy; and (3) Our derived local linearity bounds for
characterizing adversarial robustness in KD are consistent with the empirical
results.
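For concreteness, the sketch below illustrates the idea behind the KDIGA objective: on top of the usual cross-entropy and soft-label distillation terms, an extra penalty aligns the student's input gradient with the teacher's. This is a minimal PyTorch sketch under stated assumptions; the L2 form of the alignment term, the weights `alpha` and `lam`, the temperature, and the helper name `kdiga_loss` are illustrative choices rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def kdiga_loss(student, teacher, x, y, temperature=4.0, alpha=0.5, lam=10.0):
    """Illustrative sketch of KD with input gradient alignment (KDIGA).

    Combines cross-entropy, a soft-label distillation term, and an L2
    penalty aligning the student's input gradient with the teacher's.
    The weights (alpha, lam) and the exact alignment target are
    assumptions, not the paper's verbatim recipe.
    """
    x = x.clone().requires_grad_(True)

    # Teacher input gradient: computed once per batch and detached so it
    # serves as a fixed target (the teacher's weights are never updated).
    t_logits = teacher(x)
    t_grad = torch.autograd.grad(F.cross_entropy(t_logits, y), x)[0].detach()

    # Student input gradient: kept in the graph (create_graph=True) so the
    # alignment penalty is differentiable w.r.t. the student's parameters.
    s_logits = student(x)
    ce = F.cross_entropy(s_logits, y)
    s_grad = torch.autograd.grad(ce, x, create_graph=True)[0]

    # Standard soft-label distillation on temperature-scaled logits.
    kd = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Input gradient alignment term.
    iga = F.mse_loss(s_grad, t_grad)

    return (1.0 - alpha) * ce + alpha * kd + lam * iga
```

A training step would then compute `loss = kdiga_loss(student, teacher, images, labels)`, call `loss.backward()`, and step an optimizer that holds only the student's parameters; the double backward through the student makes each step somewhat more expensive than plain KD.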
Related papers
- Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling [81.00825302340984]
We introduce Speculative Knowledge Distillation (SKD) to generate high-quality training data on-the-fly.
In SKD, the student proposes tokens, and the teacher replaces poorly ranked ones based on its own distribution.
We evaluate SKD on various text generation tasks, including translation, summarization, math, and instruction following.
arXiv Detail & Related papers (2024-10-15T06:51:25Z) - Efficient and Robust Knowledge Distillation from A Stronger Teacher Based on Correlation Matching [0.09999629695552192]
The Correlation Matching Knowledge Distillation (CMKD) method combines Pearson and Spearman correlation coefficient-based KD losses to achieve more efficient and robust distillation from a stronger teacher model.
CMKD is simple yet practical, and extensive experiments demonstrate that it can consistently achieve state-of-the-art performance on CIFAR-100 and ImageNet.
arXiv Detail & Related papers (2024-10-09T05:42:47Z) - Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD achieves or exceeds the performance of leading methods across various model architectures and sizes, reducing training time by up to a factor of four.
arXiv Detail & Related papers (2024-09-19T07:05:26Z) - Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model [13.367731896112861]
Knowledge distillation (KD) is one of the widely used compression techniques for edge deployment.
This paper proposes RobustKD, a robust KD method that compresses the model while mitigating backdoors on the basis of feature variance.
arXiv Detail & Related papers (2024-06-01T11:25:03Z) - Comparative Knowledge Distillation [102.35425896967791]
Traditional Knowledge Distillation (KD) assumes readily available access to teacher models for frequent inference.
We propose Comparative Knowledge Distillation (CKD), which encourages student models to understand the nuanced differences in a teacher model's interpretations of samples.
CKD consistently outperforms state-of-the-art data augmentation and KD techniques.
arXiv Detail & Related papers (2023-11-03T21:55:33Z) - Undistillable: Making A Nasty Teacher That CANNOT teach students [84.6111281091602]
This paper introduces and investigates a concept called Nasty Teacher: a specially trained teacher network that yields nearly the same performance as a normal one, but significantly degrades the performance of any student model distilled from it.
We propose a simple yet effective algorithm to build the nasty teacher, called self-undermining knowledge distillation.
arXiv Detail & Related papers (2021-05-16T08:41:30Z) - KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation [59.061835562314066]
We introduce a novel task-oriented attention model, termed KDExplainer, to shed light on the working mechanism underlying vanilla KD.
We also introduce a portable tool, dubbed the virtual attention module (VAM), that can be seamlessly integrated with various deep neural networks (DNNs) to enhance their performance under KD.
arXiv Detail & Related papers (2021-05-10T08:15:26Z) - Ensemble Knowledge Distillation for CTR Prediction [46.92149090885551]
We propose a new model training strategy based on knowledge distillation (KD).
KD is a teacher-student learning framework to transfer knowledge learned from a teacher model to a student model.
We propose some novel techniques to facilitate ensembled CTR prediction, including teacher gating and early stopping by distillation loss.
arXiv Detail & Related papers (2020-11-08T23:37:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.