Frameless Graph Knowledge Distillation
- URL: http://arxiv.org/abs/2307.06631v1
- Date: Thu, 13 Jul 2023 08:56:50 GMT
- Title: Frameless Graph Knowledge Distillation
- Authors: Dai Shi, Zhiqi Shao, Yi Guo, Junbin Gao
- Abstract summary: We show how the graph knowledge supplied by the teacher is learned and digested by the student model via both algebra and geometry.
Our proposed model can achieve learning accuracy identical to, or even surpassing, that of the teacher model while maintaining high inference speed.
- Score: 27.831929635701886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge distillation (KD) has shown great potential for transferring
knowledge from a complex teacher model to a simple student model in which the
heavy learning task can be accomplished efficiently and without losing too much
prediction accuracy. Recently, many attempts have been made by applying the KD
mechanism to the graph representation learning models such as graph neural
networks (GNNs) to accelerate the model's inference speed via student models.
However, many existing KD-based GNNs utilize an MLP as a universal approximator
in the student model to imitate the teacher model's process without considering
the graph knowledge from the teacher model. In this work, we provide a KD-based
framework on multi-scaled GNNs, known as graph framelet, and prove that by
adequately utilizing the graph knowledge in a multi-scaled manner provided by
graph framelet decomposition, the student model is capable of adapting to both
homophilic and heterophilic graphs and has the potential to alleviate the
over-squashing issue with a simple yet effective graph surgery. Furthermore,
we show how the graph knowledge supplied by the teacher is learned and digested
by the student model via both algebra and geometry. Comprehensive experiments
show that our proposed model can achieve learning accuracy identical to, or
even surpassing, that of the teacher model while maintaining high inference speed.
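For a concrete picture of this kind of teacher-student setup, below is a minimal sketch assuming a PyTorch-style student and precomputed multi-scale teacher features; the framelet decomposition and the paper's exact losses are not reproduced, so all names and weights here are illustrative.

```python
# Minimal sketch: combine a supervised loss, soft-label distillation, and
# per-scale feature matching between teacher and student. The multi-scale
# teacher features stand in for framelet components; the actual framelet
# transform is not implemented here.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels,
            student_feats, teacher_feats,
            temperature=2.0, alpha=0.5, beta=0.1):
    # Cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation on temperature-scaled logits.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Match student and teacher representations scale by scale
    # (e.g., one low-pass and several high-pass components).
    feat = sum(F.mse_loss(s, t.detach())
               for s, t in zip(student_feats, teacher_feats))

    return (1 - alpha) * ce + alpha * kd + beta * feat
```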
Related papers
- Disentangled Generative Graph Representation Learning [51.59824683232925]
This paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework.
It aims to learn latent disentangled factors and utilize them to guide graph mask modeling.
Experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods.
arXiv Detail & Related papers (2024-08-24T05:13:02Z) - Distilling Knowledge from Self-Supervised Teacher by Embedding Graph
Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks.
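As a rough illustration of the instance-instance graph idea above, the sketch below builds cosine-similarity graphs over a batch of teacher and student embeddings and aligns them; the function names and the MSE alignment loss are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def embedding_graph(features):
    # Cosine-similarity graph over a batch of embeddings (batch_size x dim).
    z = F.normalize(features, dim=-1)
    return z @ z.t()

def graph_alignment_loss(student_feats, teacher_feats):
    # Push the student's instance-instance graph toward the teacher's.
    g_student = embedding_graph(student_feats)
    with torch.no_grad():
        g_teacher = embedding_graph(teacher_feats)
    return F.mse_loss(g_student, g_teacher)
```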
arXiv Detail & Related papers (2022-11-23T19:27:48Z) - Compressing Deep Graph Neural Networks via Adversarial Knowledge
Distillation [41.00398052556643]
We propose a novel Adversarial Knowledge Distillation framework for graph models named GraphAKD.
The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator.
The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
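A minimal sketch of this adversarial setup, assuming a generic discriminator over node representations; the module names and binary cross-entropy objectives below are illustrative, not GraphAKD's actual components.

```python
import torch
import torch.nn.functional as F

def discriminator_step(disc, teacher_repr, student_repr):
    # Train the discriminator to label teacher knowledge as real,
    # student knowledge as fake.
    real = disc(teacher_repr.detach())
    fake = disc(student_repr.detach())
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
            + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))

def student_step(disc, student_repr, student_logits, labels):
    # The student (generator) tries to fool the discriminator while still
    # fitting the supervised task.
    fake = disc(student_repr)
    adv = F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake))
    task = F.cross_entropy(student_logits, labels)
    return task + adv
```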
arXiv Detail & Related papers (2022-05-24T00:04:43Z) - Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose the first end-to-end framework for data-free adversarial knowledge distillation on graph structured data (DFAD-GNN)
Specifically, DFAD-GNN employs a generative adversarial network with three main components: a pre-trained teacher model and a student model act as two discriminators, while a generator produces training graphs used to distill knowledge from the teacher into the student.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
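A coarse sketch of this data-free loop, under the assumption that teacher-student disagreement (here an L1 gap on outputs) serves as the adversarial signal; the generator interface and losses are placeholders rather than DFAD-GNN's actual design.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, teacher, student, noise):
    # The generator synthesizes graphs on which teacher and student disagree.
    graphs = generator(noise)
    disagreement = F.l1_loss(student(graphs), teacher(graphs).detach())
    return -disagreement  # minimized, so disagreement is maximized

def student_step(generator, teacher, student, noise):
    # The student imitates the teacher on the generated graphs.
    with torch.no_grad():
        graphs = generator(noise)
        teacher_out = teacher(graphs)
    return F.l1_loss(student(graphs), teacher_out)
```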
arXiv Detail & Related papers (2022-05-08T08:19:40Z) - Graph Self-supervised Learning with Accurate Discrepancy Learning [64.69095775258164]
We propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined as Discrepancy-based Self-supervised LeArning (D-SLA)
We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
arXiv Detail & Related papers (2022-02-07T08:04:59Z) - Extract the Knowledge of Graph Neural Networks and Go Beyond it: An
Effective Knowledge Distillation Framework [42.57467126227328]
We propose a framework based on knowledge distillation to address the issues of semi-supervised learning on graphs.
Our framework extracts the knowledge of an arbitrary learned GNN model (teacher model) and injects it into a well-designed student model.
Experimental results show that the learned student model can consistently outperform its corresponding teacher model by 1.4% - 4.7% on average.
arXiv Detail & Related papers (2021-03-04T08:13:55Z) - Tucker decomposition-based Temporal Knowledge Graph Completion [35.56360622521721]
We build a new tensor decomposition model for temporal knowledge graph completion, inspired by the Tucker decomposition of a 4th-order tensor.
We demonstrate that the proposed model is fully expressive and report state-of-the-art results for several public benchmarks.
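For reference, a 4th-order Tucker scoring function for a temporal quadruple typically takes the form below; the summary does not give the paper's exact notation, so the symbols are a generic sketch.

```latex
% \mathcal{W}: learnable 4th-order core tensor; \mathbf{e}_s, \mathbf{e}_o:
% subject/object entity embeddings; \mathbf{w}_r: relation embedding;
% \mathbf{e}_\tau: timestamp embedding; \times_n: mode-n tensor-vector product.
\phi(s, r, o, \tau) \;=\; \mathcal{W}
  \times_1 \mathbf{e}_s \times_2 \mathbf{w}_r
  \times_3 \mathbf{e}_o \times_4 \mathbf{e}_\tau
```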
arXiv Detail & Related papers (2020-11-16T07:05:52Z) - Iterative Graph Self-Distillation [161.04351580382078]
We propose a novel unsupervised graph learning paradigm called Iterative Graph Self-Distillation (IGSD)
IGSD iteratively performs the teacher-student distillation with graph augmentations.
We show that we achieve significant and consistent performance gains on various graph datasets in both unsupervised and semi-supervised settings.
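One common way to instantiate iterative teacher-student self-distillation is an exponential-moving-average teacher trained on augmented views, sketched below; the EMA schedule, consistency loss, and names are assumptions, since the summary does not specify IGSD's exact update rule.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, momentum=0.99):
    # Teacher parameters track a slow moving average of the student's.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def self_distillation_loss(student, teacher, view_1, view_2):
    # Student predicts on one augmented graph view; targets come from the
    # teacher on the other view.
    z_student = F.normalize(student(view_1), dim=-1)
    with torch.no_grad():
        z_teacher = F.normalize(teacher(view_2), dim=-1)
    # Consistency between the two normalized embeddings.
    return (2 - 2 * (z_student * z_teacher).sum(dim=-1)).mean()
```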
arXiv Detail & Related papers (2020-10-23T18:37:06Z) - Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs)
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.