Learning the Wrong Lessons: Inserting Trojans During Knowledge
Distillation
- URL: http://arxiv.org/abs/2303.05593v1
- Date: Thu, 9 Mar 2023 21:37:50 GMT
- Title: Learning the Wrong Lessons: Inserting Trojans During Knowledge
Distillation
- Authors: Leonard Tang, Tom Shlomi, Alexander Cai
- Abstract summary: Trojan attacks have contemporaneously gained significant prominence, revealing fundamental vulnerabilities in deep learning models.
We seek to exploit the unlabelled data knowledge distillation process to embed Trojans in a student model without introducing conspicuous behavior in the teacher.
We devise a Trojan attack that effectively reduces student accuracy, does not alter teacher performance, and is efficiently constructible in practice.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, knowledge distillation has become a cornerstone of
efficiently deployed machine learning, with labs and industries using knowledge
distillation to train models that are inexpensive and resource-optimized.
Trojan attacks have contemporaneously gained significant prominence, revealing
fundamental vulnerabilities in deep learning models. Given the widespread use
of knowledge distillation, in this work we seek to exploit the unlabelled data
knowledge distillation process to embed Trojans in a student model without
introducing conspicuous behavior in the teacher. We ultimately devise a Trojan
attack that effectively reduces student accuracy, does not alter teacher
performance, and is efficiently constructible in practice.
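The abstract does not spell out the attack mechanics, so what follows is only a minimal sketch of the setting it describes: a student is distilled from a frozen teacher on an unlabelled transfer set, and an adversary who controls that transfer set (and any pre-computed teacher soft labels shipped with it) stamps a trigger onto a small fraction of inputs and points their soft targets at an attacker-chosen class. The trigger pattern, poisoning rate, and target class below are hypothetical illustration parameters, not the paper's actual attack.

```python
# Minimal sketch (not the paper's attack): tampering with the unlabelled
# transfer set used for knowledge distillation. The teacher is never modified;
# only the distillation data and its cached soft labels are poisoned.
import torch
import torch.nn.functional as F

TARGET_CLASS = 0      # hypothetical attacker-chosen class
POISON_RATE = 0.05    # hypothetical fraction of the transfer set to poison

def stamp_trigger(x):
    """Hypothetical trigger: a small bright patch in the image corner."""
    x = x.clone()
    x[..., -4:, -4:] = x.max()
    return x

def build_soft_labels(teacher, x_unlabelled):
    """Standard unlabelled-data distillation: soft labels come from the teacher."""
    with torch.no_grad():
        return F.softmax(teacher(x_unlabelled), dim=-1)

def poison_transfer_set(x_unlabelled, soft_labels, num_classes):
    """Attacker step: stamp the trigger on a subset and point its cached soft
    labels at TARGET_CLASS; the teacher still behaves normally on clean data."""
    n = x_unlabelled.size(0)
    idx = torch.randperm(n)[: max(1, int(POISON_RATE * n))]
    x_unlabelled[idx] = stamp_trigger(x_unlabelled[idx])
    soft_labels[idx] = F.one_hot(
        torch.full((idx.numel(),), TARGET_CLASS, dtype=torch.long), num_classes
    ).float()
    return x_unlabelled, soft_labels

def distill_step(student, optimizer, x, soft_labels):
    """One ordinary distillation step; the victim runs this unchanged."""
    loss = -(soft_labels * F.log_softmax(student(x), dim=-1)).sum(dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Whether the adversary's goal is a targeted trigger response or the broader accuracy degradation the abstract reports, the attack surface in this sketch is the same: the victim's distillation loop is untouched, and only the transfer data it consumes is manipulated.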
Related papers
- A Survey on Recent Teacher-student Learning Studies [0.0]
Knowledge distillation is a method of transferring knowledge from a complex deep neural network (DNN) to a smaller and faster DNN (a sketch of the standard distillation objective appears after this list).
Recent variants of knowledge distillation include teaching assistant distillation, curriculum distillation, mask distillation, and decoupling distillation.
arXiv Detail & Related papers (2023-04-10T14:30:28Z)
- Unbiased Knowledge Distillation for Recommendation [66.82575287129728]
Knowledge distillation (KD) has been applied in recommender systems (RS) to reduce inference latency.
Traditional solutions first train a full teacher model from the training data, and then transfer its knowledge to supervise the learning of a compact student model.
We find that this standard distillation paradigm incurs a serious bias issue: popular items are recommended even more heavily after distillation.
arXiv Detail & Related papers (2022-11-27T05:14:03Z)
- ARDIR: Improving Robustness using Knowledge Distillation of Internal Representation [2.0875529088206553]
We propose Adversarial Robust Distillation with Internal Representation (ARDIR) to utilize knowledge distillation even more effectively.
ARDIR uses the internal representation of the teacher model as a label for adversarial training.
We show that ARDIR outperforms previous methods in our experiments.
arXiv Detail & Related papers (2022-11-01T03:11:59Z)
- Distilling the Undistillable: Learning from a Nasty Teacher [30.0248670422039]
We develop efficient methodologies to increase learning from a Nasty Teacher by up to 68.63% on standard datasets.
We also explore an improvised defense method based on our insights into stealing.
Our detailed set of experiments and ablations on diverse models/settings demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2022-10-21T04:35:44Z)
- On the benefits of knowledge distillation for adversarial robustness [53.41196727255314]
We show that knowledge distillation can be used directly to boost the performance of state-of-the-art models in adversarial robustness.
We present Adversarial Knowledge Distillation (AKD), a new framework to improve a model's robust performance.
arXiv Detail & Related papers (2022-03-14T15:02:13Z)
- Dynamic Rectification Knowledge Distillation [0.0]
Dynamic Rectification Knowledge Distillation (DR-KD) is a knowledge distillation framework.
DR-KD transforms the student into its own teacher, and if the self-teacher makes wrong predictions while distilling information, the error is rectified prior to the knowledge being distilled.
Our proposed DR-KD performs remarkably well in the absence of a sophisticated cumbersome teacher model.
arXiv Detail & Related papers (2022-01-27T04:38:01Z)
- Fixing the Teacher-Student Knowledge Discrepancy in Distillation [72.4354883997316]
We propose a novel student-dependent distillation method, knowledge-consistent distillation, which makes the teacher's knowledge more consistent with the student.
Our method is very flexible and can be easily combined with other state-of-the-art approaches.
arXiv Detail & Related papers (2021-03-31T06:52:20Z)
- Learning Student-Friendly Teacher Networks for Knowledge Distillation [50.11640959363315]
We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student.
Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students.
arXiv Detail & Related papers (2021-02-12T07:00:17Z)
- Collaborative Teacher-Student Learning via Multiple Knowledge Transfer [79.45526596053728]
We propose collaborative teacher-student learning via multiple knowledge transfer (CTSL-MKT).
It allows multiple students to learn knowledge from both individual instances and instance relations in a collaborative way.
The experiments and ablation studies on four image datasets demonstrate that the proposed CTSL-MKT significantly outperforms the state-of-the-art KD methods.
arXiv Detail & Related papers (2021-01-21T07:17:04Z)
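As a reference point for the survey entry at the top of this list (and for most of the other entries), here is a minimal sketch of the standard temperature-scaled distillation objective of Hinton et al. (2015); the alpha and temperature values are illustrative defaults, not values taken from any of the papers listed.

```python
# Minimal sketch of the standard temperature-scaled distillation loss
# (Hinton et al., 2015). Names and the alpha/temperature defaults are
# illustrative, not taken from the papers above.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Blend soft-target distillation with the ordinary supervised loss."""
    # Soften both distributions with the same temperature.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL term; the T^2 factor keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

The variants surveyed above largely change where the soft targets come from (internal representations, rectified self-teachers, multiple collaborating students) rather than this basic objective.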