Distilling Holistic Knowledge with Graph Neural Networks
- URL: http://arxiv.org/abs/2108.05507v1
- Date: Thu, 12 Aug 2021 02:47:59 GMT
- Title: Distilling Holistic Knowledge with Graph Neural Networks
- Authors: Sheng Zhou, Yucheng Wang, Defang Chen, Jiawei Chen, Xin Wang, Can
Wang, Jiajun Bu
- Abstract summary: Knowledge Distillation (KD) aims at transferring knowledge from a larger well-optimized teacher network to a smaller learnable student network.
Existing KD methods have mainly considered two types of knowledge, namely individual knowledge and relational knowledge.
We propose to distill novel holistic knowledge based on an attributed graph constructed among instances.
- Score: 37.86539695906857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Distillation (KD) aims at transferring knowledge from a larger,
well-optimized teacher network to a smaller, learnable student network. Existing
KD methods have mainly considered two types of knowledge, namely individual
knowledge and relational knowledge. However, these two types of knowledge are
usually modeled independently, while the inherent correlations between them are
largely ignored. For sufficient student network learning, it is critical to
integrate both individual knowledge and relational knowledge while preserving
their inherent correlation. In this paper, we propose to distill novel holistic
knowledge based on an attributed graph constructed among instances. The holistic
knowledge is represented as a unified graph-based embedding by aggregating
individual knowledge from relational neighborhood samples with graph neural
networks; the student network is then learned by distilling the holistic
knowledge in a contrastive manner. Extensive experiments and ablation studies
are conducted on benchmark datasets, and the results demonstrate the
effectiveness of the proposed method. The code is available at
https://github.com/wyc-ruiker/HKD
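The abstract describes a three-step pipeline: build an attributed graph among the instances in a batch, aggregate individual (feature-level) knowledge over graph neighborhoods with a graph neural network, and distill the resulting holistic embedding contrastively. The PyTorch sketch below is only an illustrative reading of that pipeline under simplifying assumptions (a kNN graph built from teacher features, a single GCN-style aggregation, and an InfoNCE-style loss); it is not the authors' implementation, which is available in the linked repository. The projection heads `proj_t` and `proj_s` are hypothetical helpers mapping teacher/student features to a shared embedding dimension.

```python
# Illustrative sketch of graph-based "holistic" knowledge distillation.
# NOT the authors' implementation (see the linked repository); it only mirrors
# the pipeline described in the abstract under simplifying assumptions: a kNN
# graph from teacher features, one GCN-style aggregation step, and an
# InfoNCE-style contrastive alignment of teacher/student graph embeddings.
import torch
import torch.nn.functional as F
from torch import nn


def knn_adjacency(feats: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Symmetric, row-normalized kNN adjacency over a batch of features."""
    normed = F.normalize(feats, dim=1)
    sim = normed @ normed.t()                              # (B, B) cosine similarities
    topk = sim.topk(k + 1, dim=1).indices                  # k neighbors plus the sample itself
    adj = torch.zeros_like(sim).scatter_(1, topk, 1.0)
    adj = ((adj + adj.t()) > 0).float()                    # symmetrize
    return adj / adj.sum(dim=1, keepdim=True)              # row-normalize for mean aggregation


def holistic_embedding(feats: torch.Tensor, adj: torch.Tensor, proj: nn.Linear) -> torch.Tensor:
    """One GCN-style step: mix each sample's features with its graph neighbors."""
    return F.relu(proj(adj @ feats))


def holistic_kd_loss(f_t: torch.Tensor, f_s: torch.Tensor,
                     proj_t: nn.Linear, proj_s: nn.Linear,
                     k: int = 4, tau: float = 0.1) -> torch.Tensor:
    """Contrastive loss between teacher and student holistic (graph) embeddings."""
    adj = knn_adjacency(f_t.detach(), k)                   # attributed graph from teacher features
    h_t = F.normalize(holistic_embedding(f_t.detach(), adj, proj_t), dim=1)
    h_s = F.normalize(holistic_embedding(f_s, adj, proj_s), dim=1)
    logits = h_s @ h_t.t() / tau                           # positives lie on the diagonal
    targets = torch.arange(f_s.size(0), device=f_s.device)
    return F.cross_entropy(logits, targets)
```

In a hypothetical training loop, `f_t` and `f_s` would be penultimate-layer features of the same batch from the teacher and student, and this term would be added to the usual cross-entropy (and possibly logit-distillation) losses.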
Related papers
- Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing [11.082908318943248]
We introduce GRKT, a graph-based reasonable knowledge tracing method to address these issues.
We propose a fine-grained, psychologically grounded three-stage modeling process comprising knowledge retrieval, memory strengthening, and knowledge learning/forgetting.
arXiv Detail & Related papers (2024-06-07T10:14:30Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- A Closer Look at Knowledge Distillation with Features, Logits, and Gradients [81.39206923719455]
Knowledge distillation (KD) is a substantial strategy for transferring learned knowledge from one neural network model to another.
This work provides a new perspective to motivate a set of knowledge distillation strategies by approximating the classical KL-divergence criteria with different knowledge sources.
Our analysis indicates that logits are generally a more efficient knowledge source and suggests that having sufficient feature dimensions is crucial for the model design.
arXiv Detail & Related papers (2022-03-18T21:26:55Z)
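The logit-based knowledge source highlighted in the entry above is most commonly realized as the temperature-softened KL-divergence objective of classical knowledge distillation. The snippet below is a minimal sketch of that standard loss, not the specific formulation analyzed in the paper.

```python
import torch
import torch.nn.functional as F

def logit_kd_loss(student_logits: torch.Tensor,
                  teacher_logits: torch.Tensor,
                  T: float = 4.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student predictions."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```

In practice this term is weighted and combined with the ordinary cross-entropy loss on ground-truth labels.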
- Deep Ensemble Collaborative Learning by using Knowledge-transfer Graph for Fine-grained Object Classification [9.49864824780503]
The performance of ensembles of networks that have undergone mutual learning does not improve significantly over that of normal ensembles trained without mutual learning.
This may be due to the relationship between the knowledge in mutual learning and the individuality of the networks in the ensemble.
We propose an ensemble method using knowledge transfer to improve the accuracy of ensembles by introducing a loss design that promotes diversity among networks in mutual learning.
arXiv Detail & Related papers (2021-03-27T08:56:00Z)
- Towards a Universal Continuous Knowledge Base [49.95342223987143]
We propose a method for building a continuous knowledge base that can store knowledge imported from multiple neural networks.
We import the knowledge from multiple models into the knowledge base, from which the fused knowledge is exported back to a single model.
Experiments on text classification show promising results.
arXiv Detail & Related papers (2020-12-25T12:27:44Z)
- Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [93.18238573921629]
We study how an ensemble of deep learning models can improve test accuracy, and how the superior performance of the ensemble can be distilled into a single model.
We show that ensemble and knowledge distillation in deep learning work very differently from what traditional learning theory suggests.
We prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.
arXiv Detail & Related papers (2020-12-17T18:34:45Z)
- Multi-level Knowledge Distillation [13.71183256776644]
We introduce Multi-level Knowledge Distillation (MLKD) to transfer richer representational knowledge from teacher to student networks.
MLKD employs three novel teacher-student similarities: individual similarity, relational similarity, and categorical similarity.
Experiments demonstrate that MLKD outperforms other state-of-the-art methods on both similar-architecture and cross-architecture tasks.
arXiv Detail & Related papers (2020-12-01T15:27:15Z)
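The MLKD entry above names three levels of teacher-student similarity but does not spell out their form. The sketch below is one plausible reading, assuming teacher and student features have already been projected to a common dimension; it is not the paper's exact loss definitions.

```python
import torch
import torch.nn.functional as F

def individual_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor) -> torch.Tensor:
    """Per-sample alignment of (L2-normalized) teacher and student features."""
    return F.mse_loss(F.normalize(f_s, dim=1), F.normalize(f_t, dim=1))

def relational_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor) -> torch.Tensor:
    """Match the pairwise cosine-similarity structure of the batch."""
    g_t = F.normalize(f_t, dim=1) @ F.normalize(f_t, dim=1).t()
    g_s = F.normalize(f_s, dim=1) @ F.normalize(f_s, dim=1).t()
    return F.mse_loss(g_s, g_t)

def categorical_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor,
                                labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Match class-centroid (category-level) representations."""
    onehot = F.one_hot(labels, num_classes).float()          # (B, C)
    counts = onehot.sum(dim=0).clamp(min=1.0).unsqueeze(1)   # (C, 1)
    c_t = F.normalize(onehot.t() @ f_t / counts, dim=1)      # (C, D) class means
    c_s = F.normalize(onehot.t() @ f_s / counts, dim=1)
    return F.mse_loss(c_s, c_t)
```

The three terms would typically be weighted and added to the student's task loss.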
- Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs [96.73259297063619]
We consider a novel formulation, zero-shot learning, to avoid this cumbersome curation.
For newly-added relations, we attempt to learn their semantic features from their text descriptions.
We leverage Generative Adversarial Networks (GANs) to establish the connection between the text and knowledge graph domains.
arXiv Detail & Related papers (2020-01-08T01:19:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.