Extract the Knowledge of Graph Neural Networks and Go Beyond it: An
Effective Knowledge Distillation Framework
- URL: http://arxiv.org/abs/2103.02885v1
- Date: Thu, 4 Mar 2021 08:13:55 GMT
- Title: Extract the Knowledge of Graph Neural Networks and Go Beyond it: An
Effective Knowledge Distillation Framework
- Authors: Cheng Yang, Jiawei Liu and Chuan Shi
- Abstract summary: We propose a framework based on knowledge distillation to address the issues of semi-supervised learning on graphs.
Our framework extracts the knowledge of an arbitrary learned GNN model (teacher model) and injects it into a well-designed student model.
Experimental results show that the learned student model can consistently outperform its corresponding teacher model by 1.4% - 4.7% on average.
- Score: 42.57467126227328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-supervised learning on graphs is an important problem in the machine
learning area. In recent years, state-of-the-art classification methods based
on graph neural networks (GNNs) have shown their superiority over traditional
ones such as label propagation. However, the sophisticated architectures of
these neural models lead to a complex prediction mechanism that cannot make
full use of valuable prior knowledge lying in the data, e.g., that structurally
correlated nodes tend to have the same class. In this paper, we
propose a framework based on knowledge distillation to address the above
issues. Our framework extracts the knowledge of an arbitrary learned GNN model
(teacher model), and injects it into a well-designed student model. The student
model is built with two simple prediction mechanisms, i.e., label propagation
and feature transformation, which naturally preserves structure-based and
feature-based prior knowledge, respectively. Specifically, we design the student
model as a trainable combination of parameterized label propagation and feature
transformation modules. As a result, the learned student can benefit from both
prior knowledge and the knowledge in GNN teachers for more effective
predictions. Moreover, the learned student model has a more interpretable
prediction process than GNNs. We conduct experiments on five public benchmark
datasets and employ seven GNN models including GCN, GAT, APPNP, SAGE, SGC,
GCNII and GLP as the teacher models. Experimental results show that the learned
student model can consistently outperform its corresponding teacher model by
1.4% - 4.7% on average. Code and data are available at
https://github.com/BUPT-GAMMA/CPF
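The abstract describes the student model only at a high level, so the following is a minimal PyTorch sketch of that design under simplifying assumptions: a non-parameterized label-propagation branch (the paper's version is parameterized) combined with a two-layer MLP feature-transformation branch through a learnable per-node gate, trained against the teacher's soft labels. The class names, gate design, and loss weighting are illustrative and differ from the authors' released CPF code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DistilledStudent(nn.Module):
    """Simplified student: a learnable per-node combination of label
    propagation (structure-based prior) and an MLP on raw node features
    (feature-based prior)."""

    def __init__(self, num_nodes, feat_dim, num_classes, hidden=64, k_hops=5):
        super().__init__()
        self.k_hops = k_hops
        # Feature-transformation branch: a 2-layer MLP on raw node features.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_classes)
        )
        # Per-node gate balancing the two prediction mechanisms.
        self.gate = nn.Parameter(torch.zeros(num_nodes, 1))

    def forward(self, adj_norm, features, init_labels):
        # Label-propagation branch: spread the one-hot labels of the training
        # nodes over the row-normalized adjacency for k hops.
        propagated = init_labels
        for _ in range(self.k_hops):
            propagated = adj_norm @ propagated
        propagated = F.normalize(propagated + 1e-9, p=1, dim=1)
        # Feature-transformation branch.
        transformed = F.softmax(self.mlp(features), dim=1)
        # Trainable combination of the two branches.
        alpha = torch.sigmoid(self.gate)
        return alpha * propagated + (1.0 - alpha) * transformed


def distill_step(student, optimizer, adj_norm, features, init_labels,
                 teacher_probs, y_true, train_mask):
    """One training step: match the teacher's soft labels on all nodes and
    the ground-truth labels on the small labeled set."""
    student.train()
    optimizer.zero_grad()
    pred = student(adj_norm, features, init_labels)
    log_pred = torch.log(pred + 1e-9)
    loss = (F.kl_div(log_pred, teacher_probs, reduction="batchmean")
            + F.nll_loss(log_pred[train_mask], y_true[train_mask]))
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the final prediction is an explicit per-node mixture of two transparent mechanisms, inspecting the learned gate values shows how much each node relies on graph structure versus its own features, which is the interpretability benefit the abstract highlights.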
Related papers
- A Teacher-Free Graph Knowledge Distillation Framework with Dual
Self-Distillation [58.813991312803246]
We propose a Teacher-Free Graph Self-Distillation (TGS) framework that requires neither a teacher model nor GNNs during training or inference.
TGS enjoys the benefits of graph topology awareness in training but is free from data dependency in inference.
arXiv Detail & Related papers (2024-03-06T05:52:13Z)
- Frameless Graph Knowledge Distillation [27.831929635701886]
We show how the graph knowledge supplied by the teacher is learned and digested by the student model via both algebra and geometry.
Our proposed model can generate learning accuracy identical to or even surpass the teacher model while maintaining the high speed of inference.
arXiv Detail & Related papers (2023-07-13T08:56:50Z)
- Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks [80.8446673089281]
We study a new paradigm of knowledge transfer that aims at encoding graph topological information into graph neural networks (GNNs).
We propose Neural Heat Kernel (NHK) to encapsulate the geometric property of the underlying manifold concerning the architecture of GNNs.
A fundamental and principled solution is derived by aligning NHKs on teacher and student models, dubbed as Geometric Knowledge Distillation.
arXiv Detail & Related papers (2022-10-24T08:01:58Z)
- Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation [41.00398052556643]
We propose a novel Adversarial Knowledge Distillation framework for graph models named GraphAKD.
The discriminator distinguishes between teacher knowledge and what the student inherits, while the student GNN works as a generator and aims to fool the discriminator (a generic sketch of this adversarial setup appears after this list).
The results imply that GraphAKD can precisely transfer knowledge from a complicated teacher GNN to a compact student GNN.
arXiv Detail & Related papers (2022-05-24T00:04:43Z)
- Data-Free Adversarial Knowledge Distillation for Graph Neural Networks [62.71646916191515]
We propose the first end-to-end framework for data-free adversarial knowledge distillation on graph-structured data (DFAD-GNN).
Specifically, DFAD-GNN employs a generative adversarial network with three components: a pre-trained teacher model and a student model serve as two discriminators, and a generator produces training graphs used to distill knowledge from the teacher model into the student model.
Our DFAD-GNN significantly surpasses state-of-the-art data-free baselines in the graph classification task.
arXiv Detail & Related papers (2022-05-08T08:19:40Z)
- Streaming Graph Neural Networks via Continual Learning [31.810308087441445]
Graph neural networks (GNNs) have achieved strong performance in various applications.
In this paper, we propose a streaming GNN model based on continual learning.
We show that our model can efficiently update model parameters and achieve comparable performance to model retraining.
arXiv Detail & Related papers (2020-09-23T06:52:30Z)
- GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data.
We present the GPT-GNN framework to initialize GNNs by generative pre-training.
We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
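Two of the related papers above, GraphAKD and DFAD-GNN, cast distillation as an adversarial game: the student plays the role of a generator while a discriminator tries to tell teacher predictions apart from student predictions. The sketch below illustrates only that generic min-max structure; the discriminator architecture, the BCE losses, and the alternating schedule are illustrative assumptions and match neither paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogitDiscriminator(nn.Module):
    """Scores whether a node's class-probability vector came from the teacher
    (target 1) or the student (target 0)."""

    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, probs):
        return self.net(probs)  # raw logits, paired with BCEWithLogitsLoss


def adversarial_distill_step(student, disc, opt_student, opt_disc,
                             graph_inputs, teacher_probs):
    """One alternating update: first train the discriminator to separate
    teacher and student predictions, then train the student to fool it."""
    bce = nn.BCEWithLogitsLoss()
    real = torch.ones(teacher_probs.size(0), 1)
    fake = torch.zeros(teacher_probs.size(0), 1)

    # Discriminator update (student frozen).
    with torch.no_grad():
        student_probs = F.softmax(student(*graph_inputs), dim=1)
    d_loss = bce(disc(teacher_probs), real) + bce(disc(student_probs), fake)
    opt_disc.zero_grad()
    d_loss.backward()
    opt_disc.step()

    # Student ("generator") update: make its predictions look like the teacher's.
    student_probs = F.softmax(student(*graph_inputs), dim=1)
    g_loss = bce(disc(student_probs), real)
    opt_student.zero_grad()
    g_loss.backward()
    opt_student.step()
    return d_loss.item(), g_loss.item()
```

In DFAD-GNN, as summarized above, the training graphs themselves come from a generator rather than from real data and both the teacher and the student act as discriminators; the sketch keeps only the shared alternating-update structure.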
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.