Shared Growth of Graph Neural Networks via Prompted Free-direction
Knowledge Distillation
- URL: http://arxiv.org/abs/2307.00534v3
- Date: Thu, 16 Nov 2023 15:22:45 GMT
- Title: Shared Growth of Graph Neural Networks via Prompted Free-direction
Knowledge Distillation
- Authors: Kaituo Feng, Yikun Miao, Changsheng Li, Ye Yuan, Guoren Wang
- Abstract summary: We propose the first Free-direction Knowledge Distillation framework via reinforcement learning for graph neural networks (GNNs).
Our core idea is to collaboratively learn two shallower GNNs that exchange knowledge with each other.
Experiments on five benchmark datasets demonstrate that our approaches outperform the base GNNs by a large margin.
- Score: 39.35619721100205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation (KD) has been shown to be effective at boosting the
performance of graph neural networks (GNNs), where the typical objective is to
distill knowledge from a deeper teacher GNN into a shallower student GNN.
However, it is often quite challenging to train a satisfactory deeper GNN due
to the well-known over-parameterization and over-smoothing issues, leading to
invalid knowledge transfer in practical applications. In this paper, we propose
the first Free-direction Knowledge Distillation framework via reinforcement
learning for GNNs, called FreeKD, which no longer requires a deeper,
well-optimized teacher GNN. Our core idea is to collaboratively learn two
shallower GNNs that exchange knowledge with each other. Since we observe that a
typical GNN model often performs better at some nodes and worse at others
during training, we devise a dynamic, free-direction knowledge transfer
strategy that involves two levels of actions: 1) a node-level action determines
the direction of knowledge transfer between the corresponding nodes of the two
networks; and then 2) a structure-level action determines which of the local
structures generated by the node-level actions are to be propagated.
Additionally, considering that different augmented graphs can potentially
capture distinct perspectives of the graph data, we propose FreeKD-Prompt that
learns undistorted and diverse augmentations based on prompt learning for
exchanging varied knowledge. Furthermore, instead of confining knowledge
exchange within two GNNs, we develop FreeKD++ to enable free-direction
knowledge transfer among multiple GNNs. Extensive experiments on five benchmark
datasets demonstrate that our approaches outperform the base GNNs by a large
margin. More surprisingly, our FreeKD achieves comparable or even better
performance than traditional KD algorithms that distill knowledge from a deeper
and stronger teacher GNN.
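
The abstract gives no code, but the two-level action scheme can be made concrete. The following is a minimal PyTorch sketch assuming two peer shallow GNNs that output per-node logits; the per-node state built from training losses, the linear policy heads `node_policy` and `struct_policy`, and the soft-label KD loss are illustrative assumptions rather than the authors' implementation (in FreeKD the actions come from a reinforcement-learning agent, e.g. trained with a policy gradient, which is omitted here).

```python
import torch
import torch.nn.functional as F
from torch import nn

def free_direction_step(logits_a, logits_b, labels,
                        node_policy, struct_policy, tau=2.0):
    """One hypothetical FreeKD-style exchange step between two peer GNNs.

    logits_a, logits_b: [N, C] node predictions of the two shallow GNNs.
    node_policy / struct_policy: small heads mapping a per-node state to an
    action logit; both stand in for the reinforcement-learned agents.
    """
    # Per-node losses give a rough signal of which peer is better at each node.
    ce_a = F.cross_entropy(logits_a, labels, reduction="none")    # [N]
    ce_b = F.cross_entropy(logits_b, labels, reduction="none")    # [N]
    state = torch.stack([ce_a, ce_b], dim=1).detach()             # [N, 2]

    # 1) Node-level action: per-node direction of transfer
    #    (0 -> A teaches B, 1 -> B teaches A).
    direction = torch.bernoulli(torch.sigmoid(node_policy(state))).squeeze(-1)

    # 2) Structure-level action: which of the selected local structures
    #    (simplified here to single nodes) are actually propagated.
    keep = torch.bernoulli(torch.sigmoid(struct_policy(state))).squeeze(-1)

    # Soft-label distillation applied in the chosen direction, masked by `keep`.
    log_a = F.log_softmax(logits_a / tau, dim=-1)
    log_b = F.log_softmax(logits_b / tau, dim=-1)
    kd_a2b = F.kl_div(log_b, log_a.detach(), reduction="none", log_target=True).sum(-1)
    kd_b2a = F.kl_div(log_a, log_b.detach(), reduction="none", log_target=True).sum(-1)

    return (keep * ((1 - direction) * kd_a2b + direction * kd_b2a)).mean()

# Hypothetical wiring: both policies are plain linear heads over the 2-d state.
node_policy = nn.Linear(2, 1)
struct_policy = nn.Linear(2, 1)
```

In the full framework the sampled actions would also feed a reward back to the two policies; the sketch only shows how the actions gate the direction and scope of the distillation loss.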
Related papers
- Collaborative Knowledge Distillation via a Learning-by-Education Node Community [19.54023115706067]
The Learning-by-Education Node Community (LENC) framework for Collaborative Knowledge Distillation (CKD) is presented.
LENC addresses the challenges of handling diverse training data distributions and the limitations of individual Deep Neural Network (DNN) node learning abilities.
It achieves state-of-the-art performance in online unlabelled CKD.
arXiv Detail & Related papers (2024-09-30T14:22:28Z) - A Teacher-Free Graph Knowledge Distillation Framework with Dual
Self-Distillation [58.813991312803246]
We propose a Teacher-Free Graph Self-Distillation (TGS) framework that does not require any teacher model or GNNs during either training or inference.
TGS enjoys the benefits of graph topology awareness in training but is free from data dependency in inference.
arXiv Detail & Related papers (2024-03-06T05:52:13Z) - Information Flow in Graph Neural Networks: A Clinical Triage Use Case [49.86931948849343]
Graph Neural Networks (GNNs) have gained popularity in healthcare and other domains due to their ability to process multi-modal and multi-relational graphs.
We investigate how the flow of embedding information within GNNs affects the prediction of links in Knowledge Graphs (KGs).
Our results demonstrate that incorporating domain knowledge into the GNN connectivity leads to better performance than using the same connectivity as the KG or allowing unconstrained embedding propagation.
arXiv Detail & Related papers (2023-09-12T09:18:12Z) - Hierarchical Contrastive Learning Enhanced Heterogeneous Graph Neural
Network [59.860534520941485]
Heterogeneous graph neural networks (HGNNs), as an emerging technique, have shown a superior capacity for dealing with heterogeneous information networks (HINs).
Recently, contrastive learning, a self-supervised method, has become one of the most exciting learning paradigms and shows great potential when no labels are available.
In this paper, we study the problem of self-supervised HGNNs and propose a novel co-contrastive learning mechanism for HGNNs, named HeCo.
arXiv Detail & Related papers (2023-04-24T16:17:21Z) - Boosting Graph Neural Networks via Adaptive Knowledge Distillation [18.651451228086643]
Graph neural networks (GNNs) have shown remarkable performance on diverse graph mining tasks.
Knowledge distillation (KD) has been developed to combine the diverse knowledge from multiple models.
We propose a novel adaptive KD framework, called BGNN, which sequentially transfers knowledge from multiple GNNs into a student GNN.
arXiv Detail & Related papers (2022-10-12T04:48:50Z) - FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks [31.980564414833175]
- FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks [31.980564414833175]
It is difficult to train a satisfactory teacher GNN due to the well-known over-parameterization and over-smoothing issues.
We propose the first Free-direction Knowledge Distillation framework via Reinforcement learning for GNNs, called FreeKD.
Our FreeKD is a general and principled framework which can be naturally compatible with GNNs of different architectures.
arXiv Detail & Related papers (2022-06-14T02:24:38Z) - AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange [14.919474099848816]
Graph Neural Networks (GNNs) have already been widely used in various graph mining tasks.
Recent works reveal that the learned weights (channels) in well-trained GNNs are highly redundant, which limits the performance of GNNs.
We introduce a novel GNN learning framework named AKE-GNN, which performs an Adaptive Knowledge Exchange (AKE) strategy.
arXiv Detail & Related papers (2021-06-10T02:00:26Z) - Node2Seq: Towards Trainable Convolutions in Graph Neural Networks [59.378148590027735]
We propose a graph network layer, known as Node2Seq, to learn node embeddings with explicitly trainable weights for different neighboring nodes.
For a target node, our method sorts its neighboring nodes via an attention mechanism and then employs 1D convolutional neural networks (CNNs) to enable explicit weights for information aggregation.
In addition, we propose to incorporate non-local information for feature learning in an adaptive manner based on the attention scores.
arXiv Detail & Related papers (2021-01-06T03:05:37Z) - On Self-Distilling Graph Neural Network [64.00508355508106]
- On Self-Distilling Graph Neural Network [64.00508355508106]
We propose the first teacher-free knowledge distillation method for GNNs, termed GNN Self-Distillation (GNN-SD).
The method is built upon the proposed neighborhood discrepancy rate (NDR), which quantifies the non-smoothness of the embedded graph in an efficient way.
We also summarize a generic GNN-SD framework that could be exploited to induce other distillation strategies.
arXiv Detail & Related papers (2020-11-04T12:29:33Z)