Teaching Yourself: Graph Self-Distillation on Neighborhood for Node
Classification
- URL: http://arxiv.org/abs/2210.02097v5
- Date: Sun, 4 Jun 2023 14:49:21 GMT
- Title: Teaching Yourself: Graph Self-Distillation on Neighborhood for Node
Classification
- Authors: Lirong Wu, Jun Xia, Haitao Lin, Zhangyang Gao, Zicheng Liu, Guojiang
Zhao, Stan Z. Li
- Abstract summary: We propose a Graph Self-Distillation on Neighborhood (GSDN) framework to reduce the gap between GNNs and MLPs.
GSDN infers 75X-89X faster than existing GNNs and 16X-25X faster than other inference acceleration methods.
- Score: 42.840122801915996
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed great success in handling graph-related tasks
with Graph Neural Networks (GNNs). Despite their great academic success,
Multi-Layer Perceptrons (MLPs) remain the primary workhorse for practical
industrial applications. One reason for this academic-industrial gap is the
neighborhood-fetching latency incurred by data dependency in GNNs, which makes
them hard to deploy for latency-sensitive applications that require fast
inference. Conversely, without involving any feature aggregation, MLPs have no
data dependency and infer much faster than GNNs, but their performance is less
competitive. Motivated by these complementary strengths and weaknesses, we
propose a Graph Self-Distillation on Neighborhood (GSDN) framework to reduce
the gap between GNNs and MLPs. Specifically, the GSDN framework is based purely
on MLPs, where structural information is used only implicitly, as a prior to guide
knowledge self-distillation between the neighborhood and the target node,
substituting for the explicit neighborhood propagation performed in GNNs. As a
result, GSDN enjoys the benefits of graph topology-awareness in training but
has no data dependency in inference. Extensive experiments have shown that the
performance of vanilla MLPs can be greatly improved with self-distillation,
e.g., GSDN improves over stand-alone MLPs by 15.54% on average and outperforms
the state-of-the-art GNNs on six datasets. Regarding inference speed, GSDN
infers 75X-89X faster than existing GNNs and 16X-25X faster than other
inference acceleration methods.
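To make the training objective above concrete, the following is a minimal sketch of neighborhood self-distillation for an MLP node classifier, written in PyTorch. The function and hyperparameter names (gsdn_loss, lambda_kd, tau) and the exact loss form are illustrative assumptions inferred from the abstract, not the authors' implementation; the paper distills between the neighborhood and the target, and the sketch simplifies this by treating the aggregated neighborhood prediction as a fixed soft target. The adjacency matrix appears only inside the training loss, so inference needs node features alone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    """Plain MLP classifier; forward() takes node features only (no graph)."""

    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, n_classes),
        )

    def forward(self, x):
        return self.net(x)


def gsdn_loss(logits, labels, adj_norm, train_mask, lambda_kd=1.0, tau=1.0):
    """Cross-entropy on labeled nodes plus a KL term pulling each node's
    prediction toward the soft prediction aggregated over its neighbors.
    adj_norm: row-normalized (dense) adjacency, used only during training."""
    ce = F.cross_entropy(logits[train_mask], labels[train_mask])
    log_p_node = F.log_softmax(logits / tau, dim=-1)
    with torch.no_grad():
        # Neighborhood "teacher": aggregate neighbor logits, then soften.
        p_neigh = F.softmax((adj_norm @ logits) / tau, dim=-1)
    kd = F.kl_div(log_p_node, p_neigh, reduction="batchmean")
    return ce + lambda_kd * kd
```

At inference time only MLP.forward(x) is called, which is what removes the neighborhood-fetching latency discussed in the abstract.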
Related papers
- AdaGMLP: AdaBoosting GNN-to-MLP Knowledge Distillation [15.505402580010104]
A new wave of methods, collectively known as GNN-to-MLP Knowledge Distillation, has emerged.
They aim to transfer GNN-learned knowledge to a more efficient student.
These methods face challenges in situations with insufficient training data and incomplete test data.
We propose AdaGMLP, an AdaBoosting GNN-to-MLP Knowledge Distillation framework.
arXiv Detail & Related papers (2024-05-23T08:28:44Z) - A Teacher-Free Graph Knowledge Distillation Framework with Dual
Self-Distillation [58.813991312803246]
We propose a Teacher-Free Graph Self-Distillation (TGS) framework that does not require any teacher model or GNNs during both training and inference.
TGS enjoys the benefits of graph topology awareness in training but is free from data dependency in inference.
arXiv Detail & Related papers (2024-03-06T05:52:13Z) - VQGraph: Rethinking Graph Representation Space for Bridging GNNs and
MLPs [97.63412451659826]
VQGraph learns a structure-aware tokenizer on graph data that can encode each node's local substructure as a discrete code.
VQGraph achieves new state-of-the-art performance on GNN-to-MLP distillation in both transductive and inductive settings.
arXiv Detail & Related papers (2023-08-04T02:58:08Z) - Graph Neural Networks are Inherently Good Generalizers: Insights by
Bridging GNNs and MLPs [71.93227401463199]
This paper attributes the major source of GNNs' performance gain to their intrinsic generalization capability, which it probes by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z) - MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP
Initialization [51.76758674012744]
Training graph neural networks (GNNs) on large graphs is complex and extremely time-consuming.
We propose an embarrassingly simple yet hugely effective method for GNN training acceleration, called MLPInit (a minimal sketch of the idea appears after this list).
arXiv Detail & Related papers (2022-09-30T21:33:51Z) - Graph-less Neural Networks: Teaching Old MLPs New Tricks via
Distillation [34.676755383361005]
Graph-less Neural Networks (GLNNs) have no inference graph dependency.
We show that GLNNs with competitive performance infer 146X-273X faster than GNNs and 14X-27X faster than other acceleration methods.
A comprehensive analysis of GLNN shows when and why GLNNs can achieve results competitive with GNNs, and suggests GLNN as a handy choice for latency-constrained applications.
arXiv Detail & Related papers (2021-10-17T05:16:58Z) - Optimization of Graph Neural Networks: Implicit Acceleration by Skip
Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization.
We study the training dynamics of GNNs, focusing on the implicit acceleration provided by skip connections and greater depth.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z)
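As referenced in the MLPInit entry above, the core trick is simple to state in code: train an MLP whose linear layers have the same shapes as the GNN's weight matrices, then copy the trained weights into the GNN before training it on the graph. The sketch below assumes a toy dense-adjacency GCN; the class and function names (GCNLayer, PeerMLP, mlp_init) are illustrative and not the paper's code.

```python
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph convolution: linear transform, then normalized aggregation."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        return adj_norm @ self.lin(x)


class GCN(nn.Module):
    def __init__(self, dims):  # e.g. dims = [feat_dim, 256, n_classes]
        super().__init__()
        self.layers = nn.ModuleList(GCNLayer(a, b) for a, b in zip(dims, dims[1:]))

    def forward(self, x, adj_norm):
        for i, layer in enumerate(self.layers):
            x = layer(x, adj_norm)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
        return x


class PeerMLP(nn.Module):
    """MLP whose parameters have exactly the same shapes as the GCN above."""

    def __init__(self, dims):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(a, b) for a, b in zip(dims, dims[1:]))

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
        return x


def mlp_init(gcn: GCN, mlp: PeerMLP) -> None:
    """Copy the trained MLP weights into the GCN's linear transforms."""
    for gcn_layer, mlp_layer in zip(gcn.layers, mlp.layers):
        gcn_layer.lin.load_state_dict(mlp_layer.state_dict())
```

In practice one would pretrain PeerMLP on node features and labels, call mlp_init(gcn, mlp), and then continue training the GCN on the graph; the MLP pretraining is cheap because it never touches the adjacency.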