Distilling Holistic Knowledge with Graph Neural Networks
- URL: http://arxiv.org/abs/2108.05507v1
- Date: Thu, 12 Aug 2021 02:47:59 GMT
- Title: Distilling Holistic Knowledge with Graph Neural Networks
- Authors: Sheng Zhou, Yucheng Wang, Defang Chen, Jiawei Chen, Xin Wang, Can
Wang, Jiajun Bu
- Abstract summary: Knowledge Distillation (KD) aims at transferring knowledge from a larger well-optimized teacher network to a smaller learnable student network.
Existing KD methods have mainly considered two types of knowledge, namely individual knowledge and relational knowledge.
We propose to distill novel holistic knowledge based on an attributed graph constructed among instances.
- Score: 37.86539695906857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Distillation (KD) aims at transferring knowledge from a larger,
well-optimized teacher network to a smaller, learnable student network. Existing
KD methods have mainly considered two types of knowledge, namely individual
knowledge and relational knowledge. However, these two types of knowledge are
usually modeled independently, while the inherent correlations between them are
largely ignored. For sufficient student network learning, it is critical to
integrate both individual knowledge and relational knowledge while preserving
their inherent correlation. In this paper, we propose to distill novel holistic
knowledge based on an attributed graph constructed among instances. The holistic
knowledge is represented as a unified graph-based embedding by aggregating
individual knowledge from relational neighborhood samples with graph neural
networks; the student network is then learned by distilling the holistic
knowledge in a contrastive manner. Extensive experiments and ablation studies
are conducted on benchmark datasets, and the results demonstrate the
effectiveness of the proposed method. The code is available at
https://github.com/wyc-ruiker/HKD
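The abstract describes a three-step pipeline: build an attributed graph among the instances in a batch, aggregate individual (feature-level) knowledge over graph neighborhoods with a graph neural network, and distill the resulting holistic embedding contrastively. The PyTorch sketch below is only an illustrative reading of that pipeline under simplifying assumptions (a kNN graph built from teacher features, a single GCN-style aggregation, and an InfoNCE-style loss); it is not the authors' implementation, which is available in the linked repository. The projection heads `proj_t` and `proj_s` are hypothetical helpers mapping teacher/student features to a shared embedding dimension.

```python
# Illustrative sketch of graph-based "holistic" knowledge distillation.
# NOT the authors' implementation (see the linked repository); it only mirrors
# the pipeline described in the abstract under simplifying assumptions: a kNN
# graph from teacher features, one GCN-style aggregation step, and an
# InfoNCE-style contrastive alignment of teacher/student graph embeddings.
import torch
import torch.nn.functional as F
from torch import nn


def knn_adjacency(feats: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Symmetric, row-normalized kNN adjacency over a batch of features."""
    normed = F.normalize(feats, dim=1)
    sim = normed @ normed.t()                              # (B, B) cosine similarities
    topk = sim.topk(k + 1, dim=1).indices                  # k neighbors plus the sample itself
    adj = torch.zeros_like(sim).scatter_(1, topk, 1.0)
    adj = ((adj + adj.t()) > 0).float()                    # symmetrize
    return adj / adj.sum(dim=1, keepdim=True)              # row-normalize for mean aggregation


def holistic_embedding(feats: torch.Tensor, adj: torch.Tensor, proj: nn.Linear) -> torch.Tensor:
    """One GCN-style step: mix each sample's features with its graph neighbors."""
    return F.relu(proj(adj @ feats))


def holistic_kd_loss(f_t: torch.Tensor, f_s: torch.Tensor,
                     proj_t: nn.Linear, proj_s: nn.Linear,
                     k: int = 4, tau: float = 0.1) -> torch.Tensor:
    """Contrastive loss between teacher and student holistic (graph) embeddings."""
    adj = knn_adjacency(f_t.detach(), k)                   # attributed graph from teacher features
    h_t = F.normalize(holistic_embedding(f_t.detach(), adj, proj_t), dim=1)
    h_s = F.normalize(holistic_embedding(f_s, adj, proj_s), dim=1)
    logits = h_s @ h_t.t() / tau                           # positives lie on the diagonal
    targets = torch.arange(f_s.size(0), device=f_s.device)
    return F.cross_entropy(logits, targets)
```

In a hypothetical training loop, `f_t` and `f_s` would be penultimate-layer features of the same batch from the teacher and student, and this term would be added to the usual cross-entropy (and possibly logit-distillation) losses.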
Related papers
- Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing [11.082908318943248]
We introduce GRKT, a graph-based reasonable knowledge tracing method to address these issues.
We propose a fine-grained, psychologically grounded three-stage modeling process comprising knowledge retrieval, memory strengthening, and knowledge learning/forgetting.
arXiv Detail & Related papers (2024-06-07T10:14:30Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- A Closer Look at Knowledge Distillation with Features, Logits, and Gradients [81.39206923719455]
Knowledge distillation (KD) is a substantial strategy for transferring learned knowledge from one neural network model to another.
This work provides a new perspective to motivate a set of knowledge distillation strategies by approximating the classical KL-divergence criteria with different knowledge sources.
Our analysis indicates that logits are generally a more efficient knowledge source and suggests that having sufficient feature dimensions is crucial for the model design.
arXiv Detail & Related papers (2022-03-18T21:26:55Z)
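The logit-based knowledge source highlighted in the entry above is most commonly realized as the temperature-softened KL-divergence objective of classical knowledge distillation. The snippet below is a minimal sketch of that standard loss, not the specific formulation analyzed in the paper.

```python
import torch
import torch.nn.functional as F

def logit_kd_loss(student_logits: torch.Tensor,
                  teacher_logits: torch.Tensor,
                  T: float = 4.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student predictions."""
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```

In practice this term is weighted and combined with the ordinary cross-entropy loss on ground-truth labels.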
- Deep Ensemble Collaborative Learning by using Knowledge-transfer Graph for Fine-grained Object Classification [9.49864824780503]
The performance of ensembles of networks that have undergone mutual learning does not improve significantly over that of normal ensembles trained without mutual learning.
This may be due to the relationship between the knowledge in mutual learning and the individuality of the networks in the ensemble.
We propose an ensemble method using knowledge transfer to improve the accuracy of ensembles by introducing a loss design that promotes diversity among networks in mutual learning.
arXiv Detail & Related papers (2021-03-27T08:56:00Z)
- Towards a Universal Continuous Knowledge Base [49.95342223987143]
We propose a method for building a continuous knowledge base that can store knowledge imported from multiple neural networks.
We import the knowledge from multiple models into the knowledge base, from which the fused knowledge is exported back to a single model.
Experiments on text classification show promising results.
arXiv Detail & Related papers (2020-12-25T12:27:44Z)
- Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning [93.18238573921629]
We study how an ensemble of deep learning models can improve test accuracy, and how the superior performance of the ensemble can be distilled into a single model.
We show that ensemble and knowledge distillation in deep learning work very differently from what traditional learning theory suggests.
We prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.
arXiv Detail & Related papers (2020-12-17T18:34:45Z)
- Multi-level Knowledge Distillation [13.71183256776644]
We introduce Multi-level Knowledge Distillation (MLKD) to transfer richer representational knowledge from teacher to student networks.
MLKD employs three novel teacher-student similarities: individual similarity, relational similarity, and categorical similarity.
Experiments demonstrate that MLKD outperforms other state-of-the-art methods on both similar-architecture and cross-architecture tasks.
arXiv Detail & Related papers (2020-12-01T15:27:15Z)
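The MLKD entry above names three levels of teacher-student similarity but does not spell out their form. The sketch below is one plausible reading, assuming teacher and student features have already been projected to a common dimension; it is not the paper's exact loss definitions.

```python
import torch
import torch.nn.functional as F

def individual_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor) -> torch.Tensor:
    """Per-sample alignment of (L2-normalized) teacher and student features."""
    return F.mse_loss(F.normalize(f_s, dim=1), F.normalize(f_t, dim=1))

def relational_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor) -> torch.Tensor:
    """Match the pairwise cosine-similarity structure of the batch."""
    g_t = F.normalize(f_t, dim=1) @ F.normalize(f_t, dim=1).t()
    g_s = F.normalize(f_s, dim=1) @ F.normalize(f_s, dim=1).t()
    return F.mse_loss(g_s, g_t)

def categorical_similarity_loss(f_t: torch.Tensor, f_s: torch.Tensor,
                                labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Match class-centroid (category-level) representations."""
    onehot = F.one_hot(labels, num_classes).float()          # (B, C)
    counts = onehot.sum(dim=0).clamp(min=1.0).unsqueeze(1)   # (C, 1)
    c_t = F.normalize(onehot.t() @ f_t / counts, dim=1)      # (C, D) class means
    c_s = F.normalize(onehot.t() @ f_s / counts, dim=1)
    return F.mse_loss(c_s, c_t)
```

The three terms would typically be weighted and added to the student's task loss.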
- Generative Adversarial Zero-Shot Relational Learning for Knowledge Graphs [96.73259297063619]
We consider a novel formulation, zero-shot learning, to avoid this cumbersome curation.
For newly-added relations, we attempt to learn their semantic features from their text descriptions.
We leverage Generative Adversarial Networks (GANs) to establish the connection between the text and knowledge graph domains.
arXiv Detail & Related papers (2020-01-08T01:19:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.