P-KDGAN: Progressive Knowledge Distillation with GANs for One-class
Novelty Detection
- URL: http://arxiv.org/abs/2007.06963v2
- Date: Sun, 25 Jul 2021 17:25:43 GMT
- Title: P-KDGAN: Progressive Knowledge Distillation with GANs for One-class
Novelty Detection
- Authors: Zhiwei Zhang, Shifeng Chen and Lei Sun
- Abstract summary: One-class novelty detection aims to identify anomalous instances that do not conform to the expected normal instances.
Deep neural networks are too over-parameterized to deploy on resource-limited devices.
Progressive Knowledge Distillation with GANs (P-KDGAN) is proposed to learn compact and fast novelty detection networks.
- Score: 24.46562699161406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One-class novelty detection aims to identify anomalous instances
that do not conform to the expected normal instances. In this paper, Generative
Adversarial Networks (GANs) based on the encoder-decoder-encoder pipeline are
used for detection and achieve state-of-the-art performance. However, deep
neural networks are too over-parameterized to deploy on resource-limited
devices. Therefore, Progressive Knowledge Distillation with GANs (P-KDGAN) is
proposed to learn compact and fast novelty detection networks. P-KDGAN is a
novel attempt to connect two standard GANs through a designed distillation loss
that transfers knowledge from the teacher to the student. The progressive
knowledge distillation is a two-step approach that continuously improves the
performance of the student GAN and achieves better performance than single-step
methods. In the first step, the student GAN learns basic knowledge entirely
from the teacher, guided by the pretrained teacher GAN with fixed weights. In
the second step, the knowledgeable teacher and student GANs are jointly
fine-tuned to further improve performance and stability. The experimental
results on CIFAR-10, MNIST, and FMNIST show that
our method improves the performance of the student GAN by 2.44%, 1.77%, and
1.73% when compressing the computation at ratios of 24.45:1, 311.11:1, and
700:1, respectively.
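The two-step procedure described in the abstract can be made concrete with a short PyTorch sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy encoder-decoder-encoder generators, the choice to match the student to the teacher at the two latent codes and the reconstruction, and all hyperparameters are placeholders, and the discriminators and adversarial losses are omitted for brevity.

# Minimal sketch of the two-step progressive distillation described above.
# The tiny encoder-decoder-encoder (EDE) generators, the points at which the
# student is matched to the teacher, and all hyperparameters are assumptions
# for illustration; the discriminators and adversarial losses are omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EDEGenerator(nn.Module):
    """Toy encoder-decoder-encoder generator; `hidden` controls capacity."""
    def __init__(self, hidden, latent=64, img_dim=3 * 32 * 32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Flatten(),
                                  nn.Linear(img_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, latent))
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                                 nn.Linear(hidden, img_dim), nn.Tanh())
        self.enc2 = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, latent))
    def forward(self, x):
        z1 = self.enc1(x)        # latent code of the input
        recon = self.dec(z1)     # reconstructed (flattened) image
        z2 = self.enc2(recon)    # latent code of the reconstruction
        return z1, recon, z2

teacher = EDEGenerator(hidden=512)   # large teacher, assumed pretrained on normal data
student = EDEGenerator(hidden=32)    # compact student to be distilled

def distill_loss(t_outs, s_outs, detach_teacher=True):
    """Align student and teacher at the two latent codes and the reconstruction."""
    return sum(F.mse_loss(s, t.detach() if detach_teacher else t)
               for s, t in zip(s_outs, t_outs))

# Step 1: the teacher is frozen and the student learns from it alone.
for p in teacher.parameters():
    p.requires_grad_(False)
opt_student = torch.optim.Adam(student.parameters(), lr=2e-4)

def step1(x):
    loss = distill_loss(teacher(x), student(x))
    opt_student.zero_grad(); loss.backward(); opt_student.step()
    return loss.item()

# Step 2: joint fine-tuning of the knowledgeable teacher and the student.
for p in teacher.parameters():
    p.requires_grad_(True)
opt_joint = torch.optim.Adam(list(teacher.parameters()) + list(student.parameters()),
                             lr=1e-5)

def step2(x):
    loss = distill_loss(teacher(x), student(x), detach_teacher=False)
    opt_joint.zero_grad(); loss.backward(); opt_joint.step()
    return loss.item()

x = torch.rand(8, 3, 32, 32) * 2 - 1    # a batch standing in for normal images
print(step1(x), step2(x))

In the full method each step would also include the student's own adversarial objective, and at test time the novelty score is typically computed from the discrepancy between the two latent codes; the sketch only shows where the distillation loss enters the two steps.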
Related papers
- Learning Lightweight Object Detectors via Multi-Teacher Progressive
Distillation [56.053397775016755]
We propose a sequential approach to knowledge distillation that progressively transfers the knowledge of a set of teacher detectors to a given lightweight student.
To the best of our knowledge, we are the first to successfully distill knowledge from Transformer-based teacher detectors to convolution-based students.
arXiv Detail & Related papers (2023-08-17T17:17:08Z)
- Improving Knowledge Distillation via Regularizing Feature Norm and
Direction [16.98806338782858]
Knowledge distillation (KD) exploits a large well-trained model (i.e., teacher) to train a small student model on the same dataset for the same task.
Treating teacher features as knowledge, prevailing methods of knowledge distillation train the student by aligning its features with the teacher's, e.g., by minimizing the KL-divergence between their logits or the L2 distance between their intermediate features (a minimal sketch of this standard alignment objective is given after this list).
While it is natural to believe that better alignment of student features with the teacher's better distills teacher knowledge, simply forcing this alignment does not directly contribute to the student's performance.
arXiv Detail & Related papers (2023-05-26T15:05:19Z)
- Knowledge Diffusion for Distillation [53.908314960324915]
The representation gap between teacher and student is an emerging topic in knowledge distillation (KD).
We state that the essence of these methods is to discard the noisy information and distill the valuable information in the feature.
We propose a novel KD method dubbed DiffKD, to explicitly denoise and match features using diffusion models.
arXiv Detail & Related papers (2023-05-25T04:49:34Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with
Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- New Perspective on Progressive GANs Distillation for One-class Novelty
Detection [21.90786581579228]
The Generative Adversarial Network based on the Encoder-Decoder-Encoder scheme (EDE-GAN) achieves state-of-the-art performance.
Progressive Knowledge Distillation with GANs (P-KDGAN) connects two standard GANs through the designed distillation loss.
Two-step progressive learning continuously improves the performance of the student GAN, with better results than the single-step approach.
arXiv Detail & Related papers (2021-09-15T13:45:30Z)
- G-DetKD: Towards General Distillation Framework for Object Detectors via
Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
Our method consistently outperforms existing detection KD techniques, and works when components in the framework are used separately or in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z)
- Dual Discriminator Adversarial Distillation for Data-free Model
Compression [36.49964835173507]
We propose Dual Discriminator Adversarial Distillation (DDAD) to distill a neural network without any training data or meta-data.
To be specific, we use a generator to create samples through dual discriminator adversarial distillation, which mimics the original training data.
The proposed method obtains an efficient student network which closely approximates its teacher network, despite using no original training data.
arXiv Detail & Related papers (2021-04-12T12:01:45Z)
- Spirit Distillation: Precise Real-time Prediction with Insufficient Data [4.6247655021017655]
We propose a new training framework named Spirit Distillation (SD).
It extends the ideas of fine-tuning-based transfer learning (FTT) and feature-based knowledge distillation.
Results demonstrate boosts in segmentation performance (mIOU) and high-precision accuracy of 1.4% and 8.2%, respectively.
arXiv Detail & Related papers (2021-03-25T10:23:30Z)
- On Self-Distilling Graph Neural Network [64.00508355508106]
We propose the first teacher-free knowledge distillation method for GNNs, termed GNN Self-Distillation (GNN-SD).
The method is built upon the proposed neighborhood discrepancy rate (NDR), which quantifies the non-smoothness of the embedded graph in an efficient way.
We also summarize a generic GNN-SD framework that could be exploited to induce other distillation strategies.
arXiv Detail & Related papers (2020-11-04T12:29:33Z)
- Distilling Object Detectors with Task Adaptive Regularization [97.52935611385179]
Current state-of-the-art object detectors come at the expense of high computational costs and are hard to deploy on low-end devices.
Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
arXiv Detail & Related papers (2020-06-23T15:58:22Z)
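As referenced in the "Improving Knowledge Distillation via Regularizing Feature Norm and Direction" entry above, the prevailing alignment recipe combines a KL term on temperature-softened logits with an L2 term on intermediate features. The sketch below writes this standard objective down in PyTorch; the temperature, the loss weights, and the single feature pair are illustrative assumptions, not that paper's proposed norm-and-direction regularization.

# Minimal sketch of the standard KD alignment objective: KL divergence between
# temperature-softened logits plus L2 distance between intermediate features.
# Temperature, weights, and tensor shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def kd_alignment_loss(student_logits, teacher_logits,
                      student_feat, teacher_feat,
                      temperature=4.0, alpha=1.0, beta=1.0):
    # Soften both output distributions, then match them with KL divergence.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

    # L2 distance between intermediate features; in practice a small adapter
    # layer is often used when teacher and student feature shapes differ.
    feat = F.mse_loss(student_feat, teacher_feat.detach())

    return alpha * kl + beta * feat

# Usage with random tensors standing in for real network outputs.
s_logits, t_logits = torch.randn(8, 10), torch.randn(8, 10)
s_feat, t_feat = torch.randn(8, 128), torch.randn(8, 128)
print(kd_alignment_loss(s_logits, t_logits, s_feat, t_feat))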