Interpretable Embedding Procedure Knowledge Transfer via Stacked
Principal Component Analysis and Graph Neural Network
- URL: http://arxiv.org/abs/2104.13561v1
- Date: Wed, 28 Apr 2021 03:40:37 GMT
- Title: Interpretable Embedding Procedure Knowledge Transfer via Stacked
Principal Component Analysis and Graph Neural Network
- Authors: Seunghyun Lee, Byung Cheol Song
- Abstract summary: This paper proposes a method of generating interpretable embedding procedure (IEP) knowledge based on principal component analysis.
Experimental results show that the student network trained by the proposed KD method improves accuracy by 2.28% on the CIFAR-100 dataset.
We also demonstrate that the embedding procedure knowledge is interpretable via visualization of the proposed KD process.
- Score: 26.55774782646948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation (KD) is one of the most useful techniques for
light-weight neural networks. Although neural networks have the clear purpose of
embedding datasets into a low-dimensional space, existing forms of distilled
knowledge are quite far from this purpose and provide only limited information. We argue
that good knowledge should be able to interpret the embedding procedure. This
paper proposes a method of generating interpretable embedding procedure (IEP)
knowledge based on principal component analysis, and distilling it based on a
message passing neural network. Experimental results show that the student
network trained by the proposed KD method improves accuracy by 2.28% on the CIFAR-100
dataset, outperforming the state-of-the-art (SOTA) method.
We also demonstrate that the embedding procedure knowledge is interpretable via
visualization of the proposed KD process. The implemented code is available at
https://github.com/sseung0703/IEPKT.
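The abstract describes the idea only at a high level; the sketch below is a simplified, assumption-laden illustration rather than the authors' IEPKT code: teacher features at successive stages are projected with PCA to obtain a sequence of low-dimensional embeddings (the "embedding procedure"), the transition between consecutive stages is encoded as a sample-to-sample similarity graph, and the student is trained to reproduce those graphs. Function names such as `transition_graph` and `iep_distillation_loss` are illustrative.

```python
# Minimal sketch (not the authors' IEPKT implementation): PCA-compressed
# stage embeddings and a graph-matching distillation loss between the
# teacher's and student's "embedding procedures".
import torch
import torch.nn.functional as F

def pca_embed(feats, q=8):
    """Project flattened features onto their top-q principal components."""
    x = feats.flatten(1)                        # (batch, dim)
    _, _, v = torch.pca_lowrank(x, q=q)         # v: (dim, q)
    return x @ v                                # (batch, q)

def transition_graph(emb_prev, emb_next):
    """Cross-similarity between consecutive embedding stages, read as how
    samples relate while being embedded step by step."""
    a = F.normalize(emb_prev, dim=1)
    b = F.normalize(emb_next, dim=1)
    return F.softmax(a @ b.t(), dim=1)          # (batch, batch) graph

def iep_distillation_loss(teacher_feats, student_feats, q=8):
    """KL divergence between teacher and student stage-transition graphs."""
    loss = 0.0
    for (t0, t1), (s0, s1) in zip(zip(teacher_feats, teacher_feats[1:]),
                                  zip(student_feats, student_feats[1:])):
        g_t = transition_graph(pca_embed(t0, q), pca_embed(t1, q))
        g_s = transition_graph(pca_embed(s0, q), pca_embed(s1, q))
        loss = loss + F.kl_div(g_s.log(), g_t, reduction="batchmean")
    return loss

# Toy usage: three intermediate feature stages from a teacher and a student.
teacher = [torch.randn(32, c, 8, 8) for c in (64, 128, 256)]
student = [torch.randn(32, c, 8, 8) for c in (16, 32, 64)]
print(iep_distillation_loss(teacher, student))
```

Because only relational graphs are matched, the teacher and student embeddings never need to share a dimensionality, which is in the spirit of transferring the procedure rather than the raw features.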
Related papers
- Preserving Information: How does Topological Data Analysis improve Neural Network performance? [0.0]
We introduce a method for integrating Topological Data Analysis (TDA) with Convolutional Neural Networks (CNN) in the context of image recognition.
Our approach, further referred to as Vector Stitching, involves combining raw image data with additional topological information.
The results of our experiments highlight the potential of incorporating results of additional data analysis into the network's inference process.
arXiv Detail & Related papers (2024-11-27T14:56:05Z)
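The summary above only says that raw image data is combined with additional topological information; one minimal, hypothetical way to "stitch" a precomputed topological descriptor into a CNN classifier is to concatenate it with the image features before the final layer. The names below (`StitchedClassifier`, `topo_vec`) are illustrative and not taken from the paper.

```python
# Hypothetical sketch of "stitching" a precomputed topological descriptor
# (e.g., a vectorized persistence diagram) into a CNN classifier.
import torch
import torch.nn as nn

class StitchedClassifier(nn.Module):
    def __init__(self, topo_dim=32, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(           # tiny CNN feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(16 + topo_dim, num_classes)

    def forward(self, image, topo_vec):
        feats = self.backbone(image)             # (batch, 16)
        joint = torch.cat([feats, topo_vec], 1)  # stitch image + topology
        return self.head(joint)

model = StitchedClassifier()
logits = model(torch.randn(4, 3, 32, 32), torch.randn(4, 32))
print(logits.shape)  # torch.Size([4, 10])
```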
- Image classification network enhancement methods based on knowledge injection [8.885876832491917]
This paper proposes a multi-level hierarchical deep learning algorithm.
It is composed of a multi-level hierarchical deep neural network architecture and a multi-level hierarchical deep learning framework.
The experimental results show that the proposed algorithm can effectively explain the hidden information of the neural network.
arXiv Detail & Related papers (2024-01-09T09:11:41Z)
- Towards Explainable Machine Learning: The Effectiveness of Reservoir Computing in Wireless Receive Processing [21.843365090029987]
We investigate the specific task of channel equalization by applying a popular learning-based technique known as Reservoir Computing (RC).
RC has shown superior performance compared to conventional methods and other learning-based approaches.
We also show, through simulations, the improvement in receive processing/symbol detection performance with this optimized approach.
arXiv Detail & Related papers (2023-10-08T00:44:35Z)
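The summary names the technique but not its details; below is a generic echo state network sketch for symbol equalization, assuming a fixed random reservoir and a ridge-regression readout. It is not the paper's specific RC architecture or channel model.

```python
# Generic echo state network (reservoir computing) sketch for channel
# equalization: fixed random reservoir, ridge-regression readout.
import numpy as np

rng = np.random.default_rng(0)
n_res, leak, ridge = 200, 0.3, 1e-4

# Toy channel: BPSK symbols through a short FIR channel plus noise.
symbols = rng.choice([-1.0, 1.0], size=2000)
received = np.convolve(symbols, [1.0, 0.5, -0.2])[:len(symbols)]
received += 0.05 * rng.standard_normal(len(symbols))

# Fixed random reservoir, scaled to a spectral radius below 1.
w_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))
w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
w *= 0.9 / np.max(np.abs(np.linalg.eigvals(w)))

states = np.zeros((len(received), n_res))
x = np.zeros(n_res)
for t, u in enumerate(received):
    pre = w_in[:, 0] * u + w @ x
    x = (1 - leak) * x + leak * np.tanh(pre)
    states[t] = x

# Ridge-regression readout mapping reservoir states back to symbols.
w_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                        states.T @ symbols)
estimate = np.sign(states @ w_out)
print("symbol error rate:", np.mean(estimate != symbols))
```

Only the linear readout is trained, which is what makes RC attractive for low-complexity receive processing.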
- Neuro-Symbolic Learning of Answer Set Programs from Raw Data [54.56905063752427]
Neuro-Symbolic AI aims to combine the interpretability of symbolic techniques with the ability of deep learning to learn from raw data.
We introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data.
NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency.
arXiv Detail & Related papers (2022-05-25T12:41:59Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition [64.29650787243443]
We propose and analyse the use of a 2D frequency transform of the activation maps before transferring them.
This strategy enhances knowledge transferability in tasks such as scene recognition.
We publicly release the training and evaluation framework used in this paper at http://www.vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
arXiv Detail & Related papers (2022-05-04T11:05:18Z)
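The entry above describes transferring a 2D frequency transform of the activation maps rather than the raw maps. A minimal sketch under that reading, assuming SciPy's DCT-II; the name `dct_transfer_loss` is illustrative and the paper's exact normalization and weighting may differ.

```python
# Sketch: distillation loss on 2D DCT coefficients of activation maps
# instead of the raw spatial activations.
import numpy as np
from scipy.fft import dctn

def dct_transfer_loss(teacher_act, student_act):
    """Mean squared error between per-channel 2D DCT coefficients."""
    t = dctn(teacher_act, type=2, axes=(-2, -1), norm="ortho")
    s = dctn(student_act, type=2, axes=(-2, -1), norm="ortho")
    return np.mean((t - s) ** 2)

teacher_act = np.random.randn(8, 64, 14, 14)   # (batch, channels, H, W)
student_act = np.random.randn(8, 64, 14, 14)   # assumed channel-matched
print(dct_transfer_loss(teacher_act, student_act))
```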
- A Closer Look at Knowledge Distillation with Features, Logits, and Gradients [81.39206923719455]
Knowledge distillation (KD) is a substantial strategy for transferring learned knowledge from one neural network model to another.
This work provides a new perspective to motivate a set of knowledge distillation strategies by approximating the classical KL-divergence criteria with different knowledge sources.
Our analysis indicates that logits are generally a more efficient knowledge source and suggests that having sufficient feature dimensions is crucial for the model design.
arXiv Detail & Related papers (2022-03-18T21:26:55Z)
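To make the "logits as a knowledge source" point concrete, here is the standard temperature-scaled KL-divergence logit distillation loss (classic Hinton-style KD), one of the strategies this kind of analysis compares; it is a generic sketch, not the paper's exact formulation.

```python
# Standard temperature-scaled logit distillation: KL divergence between
# softened teacher and student class distributions.
import torch
import torch.nn.functional as F

def logit_kd_loss(student_logits, teacher_logits, temperature=4.0):
    log_p_s = F.log_softmax(student_logits / temperature, dim=1)
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    # T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * temperature ** 2

student_logits = torch.randn(16, 100)   # e.g., CIFAR-100: 100 classes
teacher_logits = torch.randn(16, 100)
print(logit_kd_loss(student_logits, teacher_logits))
```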
- Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize a k-NN non-parametric density estimation technique for estimating the unknown probability distributions of the data samples in the output feature space.
arXiv Detail & Related papers (2021-08-26T14:01:04Z)
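A minimal sketch of the k-NN non-parametric density estimate the summary refers to: the density at a point is approximated from the volume of the smallest ball containing its k nearest neighbors, p(x) ≈ k / (n · V_d(r_k)). Variable names are illustrative and this is not the OSAKD training code.

```python
# k-NN non-parametric density estimate: p(x) ~ k / (n * volume of the
# d-ball whose radius is the distance to the k-th nearest neighbor).
import numpy as np
from math import gamma, pi

def knn_density(queries, samples, k=10):
    n, d = samples.shape
    unit_ball_vol = pi ** (d / 2) / gamma(d / 2 + 1)
    dists = np.linalg.norm(queries[:, None, :] - samples[None, :, :], axis=2)
    r_k = np.sort(dists, axis=1)[:, k - 1]      # distance to k-th neighbor
    return k / (n * unit_ball_vol * r_k ** d)

feats = np.random.randn(500, 2)                  # output-space features
print(knn_density(feats[:5], feats, k=10))
```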
- Residual Knowledge Distillation [96.18815134719975]
This work proposes Residual Knowledge Distillation (RKD), which further distills the knowledge by introducing an assistant network (A).
In this way, the student (S) is trained to mimic the feature maps of the teacher (T), and A aids this process by learning the residual error between them.
Experiments show that our approach achieves appealing results on popular classification datasets, CIFAR-100 and ImageNet.
arXiv Detail & Related papers (2020-02-21T07:49:26Z)
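The mechanism described in the Residual Knowledge Distillation entry above (S mimics T's feature maps while an assistant A fits the remaining residual error) can be sketched as two coupled regression losses; the modules and names below are illustrative, not the paper's architecture.

```python
# Sketch of residual knowledge distillation: the student S regresses the
# teacher's feature map, and an assistant A regresses the leftover residual.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim = 128
student_proj = nn.Linear(64, feat_dim)      # maps student features to teacher size
assistant = nn.Linear(feat_dim, feat_dim)   # A: predicts the residual T - S

teacher_feat = torch.randn(32, feat_dim)    # from a frozen teacher network
student_feat = torch.randn(32, 64)          # from the student backbone

s_out = student_proj(student_feat)
residual = (teacher_feat - s_out).detach()  # target for the assistant
a_out = assistant(s_out)

loss_mimic = F.mse_loss(s_out, teacher_feat)   # S mimics T
loss_residual = F.mse_loss(a_out, residual)    # A learns the residual error
loss = loss_mimic + loss_residual
print(loss.item())
```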