Feature Distillation With Guided Adversarial Contrastive Learning
- URL: http://arxiv.org/abs/2009.09922v1
- Date: Mon, 21 Sep 2020 14:46:17 GMT
- Title: Feature Distillation With Guided Adversarial Contrastive Learning
- Authors: Tao Bai, Jinnan Chen, Jun Zhao, Bihan Wen, Xudong Jiang, Alex Kot
- Abstract summary: We propose Guided Adversarial Contrastive Distillation (GACD) to transfer adversarial robustness from teacher to student with features.
With a well-trained teacher model as an anchor, students are expected to extract features similar to the teacher.
With GACD, the student not only learns to extract robust features, but also captures structural knowledge from the teacher.
- Score: 41.28710294669751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models are shown to be vulnerable to adversarial examples.
Though adversarial training can enhance model robustness, typical approaches
are computationally expensive. Recent works proposed to transfer the robustness
to adversarial attacks across different tasks or models with soft labels.
Compared to soft labels, features contain rich semantic information and hold
the potential to be applied to different downstream tasks. In this paper,
we propose a novel approach called Guided Adversarial Contrastive Distillation
(GACD), to effectively transfer adversarial robustness from teacher to student
with features. We first formulate this objective as contrastive learning and
connect it with mutual information. With a well-trained teacher model as an
anchor, students are expected to extract features similar to the teacher's.
Then, considering the potential errors made by teachers, we propose
sample-reweighted estimation to eliminate the teachers' negative effects. With GACD, the
student not only learns to extract robust features, but also captures
structural knowledge from the teacher. Through extensive experiments on popular
datasets such as CIFAR-10, CIFAR-100, and STL-10, we demonstrate that
our approach can effectively transfer robustness across different models and
even different tasks, and achieve comparable or better results than existing
methods. Besides, we provide a detailed analysis of various methods, showing
that students produced by our approach capture more structural knowledge from
teachers and learn more robust features under adversarial attacks.
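The abstract does not include code, but the objective it describes (an InfoNCE-style contrastive loss with the teacher's feature as the anchor, reweighted to discount samples the teacher misclassifies) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the temperature tau and the confidence-based weights are guesses, and the adversarial-example generation that GACD would apply to the student's inputs is omitted.
```python
import torch
import torch.nn.functional as F

def gacd_style_loss(student_feats, teacher_feats, teacher_logits, labels, tau=0.1):
    """Hypothetical sketch of a guided adversarial contrastive
    distillation objective: the teacher's feature of each sample acts as
    the positive anchor, other teacher features in the batch serve as
    negatives, and samples the teacher is unsure about are down-weighted."""
    s = F.normalize(student_feats, dim=1)              # (B, D) student features
    t = F.normalize(teacher_feats.detach(), dim=1)     # (B, D) fixed teacher anchors
    logits = s @ t.t() / tau                           # (B, B); diagonal = positives
    targets = torch.arange(s.size(0), device=s.device)
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    # Sample reweighting (assumption): trust each anchor in proportion to
    # the teacher's confidence in the ground-truth class.
    probs = teacher_logits.detach().softmax(dim=1)
    w = probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    return (w * per_sample).sum() / w.sum().clamp_min(1e-8)
```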
Related papers
- Relational Representation Distillation [6.24302896438145]
We introduce Relational Representation Distillation (RRD) to explore and reinforce relationships between teacher and student models.
Inspired by self-supervised learning principles, it uses a relaxed contrastive loss that focuses on similarity rather than exact replication.
Our approach demonstrates superior performance on CIFAR-100 and ImageNet ILSVRC-2012 and sometimes even outperforms the teacher network when combined with KD.
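A relaxed, relation-based loss of this kind might, for example, match the batch's pairwise-similarity distributions rather than the features themselves. The sketch below is one plausible reading of the summary, not the paper's actual loss; the temperature is assumed.
```python
import torch.nn.functional as F

def relational_distillation_loss(student_feats, teacher_feats, tau=0.5):
    """Sketch of a relaxed contrastive objective: align the student's
    within-batch similarity distribution with the teacher's instead of
    forcing feature-by-feature replication. Self-pairs are kept for
    simplicity; both distributions peak there identically."""
    s = F.normalize(student_feats, dim=1)
    t = F.normalize(teacher_feats.detach(), dim=1)
    p_s = F.log_softmax(s @ s.t() / tau, dim=1)   # student pairwise similarities
    p_t = F.softmax(t @ t.t() / tau, dim=1)       # teacher pairwise similarities
    return F.kl_div(p_s, p_t, reduction="batchmean")
```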
arXiv Detail & Related papers (2024-07-16T14:56:13Z)
- Distilling Adversarial Robustness Using Heterogeneous Teachers [9.404102810698202]
Robustness can be transferred from an adversarially trained teacher to a student model using knowledge distillation.
We develop a defense framework against adversarial attacks by distilling robustness using heterogeneous teachers.
Experiments on classification tasks in both white-box and black-box scenarios demonstrate that DARHT achieves state-of-the-art clean and robust accuracies.
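The summary suggests combining soft labels from several heterogeneous, adversarially trained teachers. A generic multi-teacher distillation loss along those lines might look like the following; the averaging scheme and the alpha/tau values are assumptions, not DARHT's actual formulation.
```python
import torch.nn.functional as F

def multi_teacher_distillation_loss(student_logits, teacher_logits_list,
                                    labels, alpha=0.5, tau=4.0):
    """Illustrative sketch: combine a task loss with soft-label
    distillation averaged over several heterogeneous, adversarially
    trained teachers."""
    ce = F.cross_entropy(student_logits, labels)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    kd = 0.0
    for t_logits in teacher_logits_list:
        p_t = F.softmax(t_logits.detach() / tau, dim=1)
        kd = kd + F.kl_div(log_p_s, p_t, reduction="batchmean") * tau ** 2
    kd = kd / len(teacher_logits_list)
    return alpha * ce + (1 - alpha) * kd
```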
arXiv Detail & Related papers (2024-02-23T19:55:13Z)
- Class Incremental Learning for Adversarial Robustness [17.06592851567578]
Adversarial training integrates adversarial examples during model training to enhance robustness.
We observe that combining incremental learning with naive adversarial training easily leads to a loss of robustness.
We propose the Flatness Preserving Distillation (FPD) loss that leverages the output difference between adversarial and clean examples.
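One plausible reading of the FPD idea is to align the student's clean-versus-adversarial output shift with the teacher's, so the student inherits a locally flat decision surface. The sketch below encodes that guess and is not the paper's exact loss.
```python
import torch.nn.functional as F

def flatness_preserving_loss(student_clean, student_adv,
                             teacher_clean, teacher_adv):
    """Loose sketch of a flatness-preserving term: match the student's
    clean-to-adversarial output difference to the teacher's. This is an
    assumption about the mechanism, not the published formulation."""
    student_shift = student_adv - student_clean
    teacher_shift = (teacher_adv - teacher_clean).detach()
    return F.mse_loss(student_shift, teacher_shift)
```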
arXiv Detail & Related papers (2023-12-06T04:38:02Z)
- Generalized Knowledge Distillation via Relationship Matching [53.69235109551099]
Knowledge of a well-trained deep neural network (a.k.a. the "teacher") is valuable for learning similar tasks.
Knowledge distillation extracts knowledge from the teacher and integrates it with the target model.
Instead of requiring the teacher to work on the same task as the student, we borrow knowledge from a teacher trained on a general label space.
arXiv Detail & Related papers (2022-05-04T06:49:47Z)
- On the benefits of knowledge distillation for adversarial robustness [53.41196727255314]
We show that knowledge distillation can be used directly to boost the performance of state-of-the-art models in adversarial robustness.
We present Adversarial Knowledge Distillation (AKD), a new framework to improve a model's robust performance.
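A straightforward version of distillation-boosted robust training pairs hard labels with a teacher's soft predictions on adversarial inputs. The sketch below is a generic recipe in that spirit; the mixing weight, temperature, and the choice to feed the teacher adversarial examples are all assumptions rather than AKD's published details.
```python
import torch
import torch.nn.functional as F

def distillation_robust_loss(student, teacher, x_adv, labels, alpha=0.5, tau=2.0):
    """Sketch: train the student on adversarial examples against a
    mixture of hard labels and the teacher's soft predictions."""
    s_logits = student(x_adv)
    with torch.no_grad():
        t_probs = F.softmax(teacher(x_adv) / tau, dim=1)
    ce = F.cross_entropy(s_logits, labels)
    kd = F.kl_div(F.log_softmax(s_logits / tau, dim=1), t_probs,
                  reduction="batchmean") * tau ** 2
    return alpha * ce + (1 - alpha) * kd
```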
arXiv Detail & Related papers (2022-03-14T15:02:13Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
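In the spirit of deep mutual learning, a mutual adversarial training step could give each model a task loss on its adversarial examples plus a pull toward its peers' predictions. The sketch below illustrates that structure; the weighting and pairing scheme are assumptions, not MAT's exact formulation.
```python
import torch.nn.functional as F

def mutual_adversarial_losses(logits_list, labels, beta=1.0):
    """Sketch: each model i gets a cross-entropy loss on its
    adversarial-example logits plus a KL pull toward every peer's
    (detached) prediction. Returns one loss per model."""
    losses = []
    for i, logits_i in enumerate(logits_list):
        loss = F.cross_entropy(logits_i, labels)
        for j, logits_j in enumerate(logits_list):
            if i == j:
                continue
            peer = F.softmax(logits_j.detach(), dim=1)
            loss = loss + beta * F.kl_div(
                F.log_softmax(logits_i, dim=1), peer,
                reduction="batchmean") / (len(logits_list) - 1)
        losses.append(loss)
    return losses
```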
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Analysis and Applications of Class-wise Robustness in Adversarial Training [92.08430396614273]
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
Previous works mainly focus on the overall robustness of the model, and the in-depth analysis on the role of each class involved in adversarial training is still missing.
We provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet.
We observe that the stronger attack methods in adversarial learning achieve performance improvement mainly from a more successful attack on the vulnerable classes.
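The class-wise diagnosis described here reduces, in practice, to measuring robust accuracy per class so that vulnerable classes stand out. A minimal evaluation loop is sketched below; attack(model, x, y) is an assumed interface (for example a PGD wrapper), not a function from the paper.
```python
import torch

def per_class_robust_accuracy(model, attack, loader, num_classes, device):
    """Sketch: evaluate robust accuracy separately for each class to
    expose the vulnerable ones."""
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    model.eval()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = attack(model, x, y)            # assumed attack interface
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        for c in range(num_classes):
            mask = y == c
            total[c] += mask.sum().item()
            correct[c] += (pred[mask] == c).sum().item()
    return correct / total.clamp_min(1)        # per-class robust accuracy
```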
arXiv Detail & Related papers (2021-05-29T07:28:35Z)
- Understanding Robustness in Teacher-Student Setting: A New Perspective [42.746182547068265]
Adversarial examples are inputs to machine learning models in which a bounded adversarial perturbation can mislead the models into making arbitrarily incorrect predictions.
Extensive studies try to explain the existence of adversarial examples and provide ways to improve model robustness.
Our studies could shed light on future exploration of adversarial examples and on enhancing model robustness via principled data augmentation.
arXiv Detail & Related papers (2021-02-25T20:54:24Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
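An instance-level attack of the kind described can be sketched as PGD that maximizes an InfoNCE loss between an input and its own augmented view, so no labels are needed. The code below is an illustration with typical hyperparameter values, not the paper's implementation.
```python
import torch
import torch.nn.functional as F

def instance_wise_attack(encoder, x, x_view, eps=8/255, alpha=2/255,
                         steps=5, tau=0.5):
    """Sketch of a label-free attack: perturb x so the encoder no longer
    matches it to its own augmented view x_view, confusing instance-level
    identity. Hyperparameters are common defaults, not the paper's."""
    with torch.no_grad():
        z_pos = F.normalize(encoder(x_view), dim=1)   # fixed positives
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        z = F.normalize(encoder(x_adv), dim=1)
        logits = z @ z_pos.t() / tau                  # (B, B); diagonal = true pairs
        targets = torch.arange(z.size(0), device=z.device)
        loss = F.cross_entropy(logits, targets)       # maximized to break identity
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```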
arXiv Detail & Related papers (2020-06-13T08:24:33Z)