Explaining Knowledge Distillation by Quantifying the Knowledge
- URL: http://arxiv.org/abs/2003.03622v1
- Date: Sat, 7 Mar 2020 18:09:17 GMT
- Title: Explaining Knowledge Distillation by Quantifying the Knowledge
- Authors: Xu Cheng, Zhefan Rao, Yilan Chen, Quanshi Zhang
- Abstract summary: This paper presents a method to interpret the success of knowledge distillation by quantifying and analyzing task-relevant and task-irrelevant visual concepts.
Knowledge distillation makes the DNN learn more visual concepts than learning from raw data.
- Score: 27.98287660940717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a method to interpret the success of knowledge
distillation by quantifying and analyzing task-relevant and task-irrelevant
visual concepts that are encoded in intermediate layers of a deep neural
network (DNN). More specifically, three hypotheses are proposed as follows. 1.
Knowledge distillation makes the DNN learn more visual concepts than learning
from raw data. 2. Knowledge distillation ensures that the DNN is prone to
learning various visual concepts simultaneously, whereas, when learning from
raw data, the DNN tends to learn visual concepts sequentially. 3.
Knowledge distillation yields more stable optimization directions than learning
from raw data. Accordingly, we design three types of mathematical metrics to
evaluate feature representations of the DNN. In experiments, we diagnosed
various DNNs, and the above hypotheses were verified.
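The first hypothesis can be illustrated with a toy concept-counting metric over intermediate-layer features. This is a minimal sketch, not the paper's exact metric: it assumes a foreground mask separates task-relevant from task-irrelevant regions, and the per-location norm threshold is an illustrative choice.

```python
import numpy as np

def count_visual_concepts(feature_map, fg_mask, tau=None):
    """Count spatial units with strong activations, split into
    task-relevant (foreground) and task-irrelevant (background) concepts.

    feature_map: (C, H, W) activations from an intermediate layer.
    fg_mask:     (H, W) boolean foreground mask for the input image.
    """
    strength = np.linalg.norm(feature_map, axis=0)  # per-location activation norm
    if tau is None:
        tau = strength.mean() + strength.std()      # simple adaptive threshold
    active = strength > tau                         # units treated as "concepts"
    task_relevant = int(np.sum(active & fg_mask))
    task_irrelevant = int(np.sum(active & ~fg_mask))
    return task_relevant, task_irrelevant
```

Under this toy metric, hypothesis 1 would predict that a distilled student yields a larger task-relevant count than the same architecture trained from raw data.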
Related papers
- BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation [20.34272550256856]
Spiking neural networks (SNNs) mimic biological neural systems to convey information via discrete spikes.
Our work achieves state-of-the-art performance for training SNNs on both static and neuromorphic datasets.
arXiv Detail & Related papers (2024-07-12T08:17:24Z)
- Distribution Shift Matters for Knowledge Distillation with Webly Collected Images [91.66661969598755]
We propose a novel method dubbed "Knowledge Distillation between Different Distributions" (KD$3$).
We first dynamically select useful training instances from the webly collected data according to the combined predictions of the teacher and student networks.
We also build a new contrastive learning block called MixDistribution to generate perturbed data with a new distribution for instance alignment.
arXiv Detail & Related papers (2023-07-21T10:08:58Z)
- Quantifying the Knowledge in a DNN to Explain Knowledge Distillation for Classification [27.98287660940717]
This paper provides a new perspective to explain the success of knowledge distillation, based on information theory.
A knowledge point is defined as an input unit whose information is discarded much less than that of other input units.
We propose three hypotheses for knowledge distillation based on the quantification of knowledge points.
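The knowledge-point definition above can be illustrated with a toy perturbation test: an input unit counts as a knowledge point when perturbing it noticeably changes the network's feature, i.e. its information is not discarded. This is a hedged sketch; `feature_fn`, `sigma`, and `eps` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def knowledge_point_mask(x, feature_fn, sigma=0.1, eps=1e-3):
    """Mark input units as knowledge points: perturb each unit by sigma
    and flag it when the feature vector moves by more than eps.

    x:          2-D input array; feature_fn maps an array to a 1-D feature.
    Returns a boolean mask of knowledge points, same shape as x.
    """
    base = feature_fn(x)
    sensitivity = np.zeros(x.shape, dtype=float)
    it = np.nditer(x, flags=['multi_index'])
    for _ in it:
        perturbed = x.copy()
        perturbed[it.multi_index] += sigma  # nudge a single input unit
        sensitivity[it.multi_index] = np.linalg.norm(feature_fn(perturbed) - base)
    return sensitivity > eps  # low sensitivity = information discarded
```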
arXiv Detail & Related papers (2022-08-18T09:47:31Z)
- Knowledge Enhanced Neural Networks for relational domains [83.9217787335878]
We focus on a specific method, KENN, a Neural-Symbolic architecture that injects prior logical knowledge into a neural network.
In this paper, we propose an extension of KENN for relational data.
arXiv Detail & Related papers (2022-05-31T13:00:34Z)
- Neuro-Symbolic Learning of Answer Set Programs from Raw Data [54.56905063752427]
Neuro-Symbolic AI aims to combine interpretability of symbolic techniques with the ability of deep learning to learn from raw data.
We introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data.
NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency.
arXiv Detail & Related papers (2022-05-25T12:41:59Z)
- Concept Embeddings for Fuzzy Logic Verification of Deep Neural Networks in Perception Tasks [1.2246649738388387]
We present a simple, yet effective, approach to verify whether a trained convolutional neural network (CNN) respects specified symbolic background knowledge.
The knowledge may consist of any fuzzy predicate logic rules.
We show that this approach benefits from fuzziness and calibrating the concept outputs.
arXiv Detail & Related papers (2022-01-03T10:35:47Z)
- What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space [88.37185513453758]
We propose a method to visualize and understand the class-wise knowledge learned by deep neural networks (DNNs) under different settings.
Our method searches for a single predictive pattern in the pixel space to represent the knowledge learned by the model for each class.
In the adversarial setting, we show that adversarially trained models tend to learn more simplified shape patterns.
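The pattern search described above can be sketched as gradient ascent in the input (pixel) space. This is a minimal illustration under assumptions: `grad_fn` stands in for the model's class-score gradient, and the learning rate and clipping range are illustrative, not the paper's exact procedure.

```python
import numpy as np

def find_class_pattern(grad_fn, shape, steps=200, lr=0.1, seed=0):
    """Gradient ascent in pixel space toward a single pattern that
    maximizes a class score; grad_fn gives d(score)/d(input).
    """
    rng = np.random.default_rng(seed)
    pattern = rng.normal(0.0, 0.01, size=shape)    # start near zero
    for _ in range(steps):
        pattern = pattern + lr * grad_fn(pattern)  # ascend the class score
        pattern = np.clip(pattern, -1.0, 1.0)      # keep pixels in a valid range
    return pattern
```

For a linear toy model, the search saturates at the sign pattern of the class weights, which is the kind of single predictive pattern the method visualizes.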
arXiv Detail & Related papers (2021-01-18T06:38:41Z)
- Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings.
DNNs are often treated as black box systems, which complicates their evaluation and validation.
One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
- Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.