Boosting Graph Neural Networks via Adaptive Knowledge Distillation
- URL: http://arxiv.org/abs/2210.05920v2
- Date: Wed, 5 Apr 2023 02:00:19 GMT
- Title: Boosting Graph Neural Networks via Adaptive Knowledge Distillation
- Authors: Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh Chawla
- Abstract summary: Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks.
Knowledge distillation (KD) is developed to combine the diverse knowledge from multiple models.
We propose a novel adaptive KD framework, called BGNN, which sequentially transfers knowledge from multiple GNNs into a student GNN.
- Score: 18.651451228086643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) have shown remarkable performance on diverse
graph mining tasks. Although different GNNs can be unified under the same message
passing framework, they learn complementary knowledge from the same graph.
Knowledge distillation (KD) is developed to combine the diverse knowledge from
multiple models. It transfers knowledge from high-capacity teachers to a
lightweight student. However, to avoid oversmoothing, GNNs are often shallow,
which deviates from the setting of KD. In this context, we revisit KD by
separating its benefits from model compression and emphasizing its power of
transferring knowledge. To this end, we need to tackle two challenges: how to
transfer knowledge from compact teachers to a student with the same capacity,
and how to exploit the student GNN's own strength to learn knowledge. In this
paper, we propose a novel adaptive KD framework, called BGNN, which
sequentially transfers knowledge from multiple GNNs into a student GNN. We also
introduce an adaptive temperature module and a weight boosting module. These
modules guide the student to the appropriate knowledge for effective learning.
Extensive experiments have demonstrated the effectiveness of BGNN. In
particular, we achieve up to 3.05% improvement for node classification and
6.35% improvement for graph classification over vanilla GNNs.
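The abstract describes BGNN only at a high level, so the following is a minimal sketch of sequential knowledge distillation with a per-node adaptive temperature and boosting-style example weights, assuming each GNN is a PyTorch module that maps (features, edge_index) to node logits. The function names adaptive_temperature and boosted_weights, and the specific rules inside them, are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: sequential KD from several teacher GNNs into one student GNN
# of similar capacity. Only the overall idea comes from the abstract; the
# adaptive-temperature and boosting rules below are assumptions.
import torch
import torch.nn.functional as F


def adaptive_temperature(teacher_logits, t_min=1.0, t_max=4.0):
    """Assumed rule: soften targets more (higher T) for confident teacher nodes."""
    probs = teacher_logits.softmax(dim=-1)
    confidence = probs.max(dim=-1).values              # [num_nodes]
    return t_min + (t_max - t_min) * confidence        # [num_nodes]


def boosted_weights(student_logits, labels):
    """Assumed rule: up-weight nodes the student currently misclassifies."""
    wrong = (student_logits.argmax(dim=-1) != labels).float()
    weights = 1.0 + wrong                               # misclassified nodes count double
    return weights / weights.mean()


def distill_step(student, teacher, x, edge_index, labels, optimizer, alpha=0.5):
    """One optimization step of KD from a single (frozen) teacher."""
    student.train()
    with torch.no_grad():
        t_logits = teacher(x, edge_index)
    s_logits = student(x, edge_index)

    temp = adaptive_temperature(t_logits).unsqueeze(-1)          # [N, 1]
    kd = F.kl_div(
        F.log_softmax(s_logits / temp, dim=-1),
        F.softmax(t_logits / temp, dim=-1),
        reduction="none",
    ).sum(dim=-1) * temp.squeeze(-1) ** 2                        # per-node KD loss

    ce = F.cross_entropy(s_logits, labels, reduction="none")     # per-node CE loss
    w = boosted_weights(s_logits.detach(), labels)
    loss = (w * (alpha * kd + (1 - alpha) * ce)).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


def sequential_distillation(student, teachers, x, edge_index, labels,
                            epochs_per_teacher=100, lr=0.01):
    """Transfer knowledge from each teacher into the student, one teacher at a time."""
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for teacher in teachers:                 # e.g. a trained GCN, then a trained GAT
        teacher.eval()
        for _ in range(epochs_per_teacher):
            distill_step(student, teacher, x, edge_index, labels, optimizer)
    return student
```

For instance, the teachers could be a trained GCN and a trained GAT on the same graph, with the student sharing one of their architectures, so distillation transfers knowledge rather than compressing a larger model.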
Related papers
- IDEA: A Flexible Framework of Certified Unlearning for Graph Neural Networks [68.6374698896505]
Graph Neural Networks (GNNs) have been increasingly deployed in a plethora of applications.
Privacy leakage may happen when the trained GNNs are deployed and exposed to potential attackers.
We propose a principled framework named IDEA to achieve flexible and certified unlearning for GNNs.
arXiv Detail & Related papers (2024-07-28T04:59:59Z) - A Teacher-Free Graph Knowledge Distillation Framework with Dual Self-Distillation [58.813991312803246]
We propose a Teacher-Free Graph Self-Distillation (TGS) framework that requires no teacher model or GNNs in either training or inference.
TGS enjoys the benefits of graph topology awareness in training but is free from data dependency in inference.
arXiv Detail & Related papers (2024-03-06T05:52:13Z) - Label Deconvolution for Node Representation Learning on Large-scale
Attributed Graphs against Learning Bias [75.44877675117749]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias by a novel and highly scalable approximation to the inverse mapping of GNNs.
Experiments demonstrate that LD significantly outperforms state-of-the-art methods on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2023-09-26T13:09:43Z) - Information Flow in Graph Neural Networks: A Clinical Triage Use Case [49.86931948849343]
Graph Neural Networks (GNNs) have gained popularity in healthcare and other domains due to their ability to process multi-modal and multi-relational graphs.
We investigate how the flow of embedding information within GNNs affects the prediction of links in Knowledge Graphs (KGs).
Our results demonstrate that incorporating domain knowledge into the GNN connectivity leads to better performance than using the same connectivity as the KG or allowing unconstrained embedding propagation.
arXiv Detail & Related papers (2023-09-12T09:18:12Z) - Shared Growth of Graph Neural Networks via Prompted Free-direction Knowledge Distillation [39.35619721100205]
We propose the first Free-direction Knowledge Distillation framework via reinforcement learning for graph neural networks (GNNs).
Our core idea is to collaboratively learn two shallower GNNs to exchange knowledge between them.
Experiments on five benchmark datasets demonstrate that our approaches outperform the base GNNs by a large margin.
arXiv Detail & Related papers (2023-07-02T10:03:01Z) - RELIANT: Fair Knowledge Distillation for Graph Neural Networks [39.22568244059485]
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks.
Knowledge Distillation (KD) is a common solution to compress GNNs.
We propose a principled framework named RELIANT to mitigate the bias exhibited by the student model.
arXiv Detail & Related papers (2023-01-03T15:21:24Z) - FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks [31.980564414833175]
It is difficult to train a satisfactory teacher GNN due to the well-known over-parameterization and over-smoothing issues.
We propose the first Free-direction Knowledge Distillation framework via Reinforcement learning for GNNs, called FreeKD.
Our FreeKD is a general and principled framework which can be naturally compatible with GNNs of different architectures.
arXiv Detail & Related papers (2022-06-14T02:24:38Z) - Graph-Free Knowledge Distillation for Graph Neural Networks [30.38128029453977]
We propose the first dedicated approach to distilling knowledge from a graph neural network without graph data.
The proposed graph-free KD (GFKD) learns graph topology structures for knowledge transfer by modeling them with multinomial distribution.
We provide the strategies for handling different types of prior knowledge in the graph data or the GNNs.
arXiv Detail & Related papers (2021-05-16T21:38:24Z) - On Self-Distilling Graph Neural Network [64.00508355508106]
We propose the first teacher-free knowledge distillation method for GNNs, termed GNN Self-Distillation (GNN-SD).
The method is built upon the proposed neighborhood discrepancy rate (NDR), which quantifies the non-smoothness of the embedded graph in an efficient way; an illustrative sketch of such a quantity appears after this list.
We also summarize a generic GNN-SD framework that could be exploited to induce other distillation strategies.
arXiv Detail & Related papers (2020-11-04T12:29:33Z) - Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
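The GNN-SD entry above mentions the neighborhood discrepancy rate (NDR) only in passing. Below is an illustrative sketch of one such neighborhood-discrepancy quantity; defining it as one minus the cosine similarity between a node's embedding and the mean embedding of its neighbors is an assumption made here for clarity and may differ from the paper's exact formulation.

```python
# Hedged illustration of a neighborhood-discrepancy style score, in the spirit
# of GNN-SD's NDR. The exact definition in the paper may differ; here a node's
# score is one minus the cosine similarity between its embedding and the mean
# embedding of its in-neighbors (an assumption for illustration only).
import torch


def neighborhood_discrepancy(h, edge_index, eps=1e-8):
    """h: [N, d] node embeddings; edge_index: [2, E] directed edges (src, dst).
    Returns a per-node score; larger means the embedding is less smooth locally."""
    src, dst = edge_index
    n, d = h.shape

    # Mean-aggregate neighbor embeddings for every destination node.
    agg = torch.zeros(n, d).index_add_(0, dst, h[src])
    deg = torch.zeros(n).index_add_(0, dst, torch.ones(src.numel()))
    mean_nbr = agg / deg.clamp(min=1).unsqueeze(-1)

    cos = torch.nn.functional.cosine_similarity(h, mean_nbr, dim=-1, eps=eps)
    return 1.0 - cos                      # 0 when a node matches its neighborhood


# Toy usage: 4 nodes on a path graph with random 8-dimensional embeddings.
if __name__ == "__main__":
    h = torch.randn(4, 8)
    edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                               [1, 0, 2, 1, 3, 2]])
    print(neighborhood_discrepancy(h, edge_index))
```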