MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP
Initialization
- URL: http://arxiv.org/abs/2210.00102v3
- Date: Sat, 8 Apr 2023 05:54:15 GMT
- Title: MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP
Initialization
- Authors: Xiaotian Han, Tong Zhao, Yozen Liu, Xia Hu, Neil Shah
- Abstract summary: Training graph neural networks (GNNs) on large graphs is complex and extremely time consuming.
We propose an embarrassingly simple, yet hugely effective method for GNN training acceleration, called MLPInit.
- Score: 51.76758674012744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training graph neural networks (GNNs) on large graphs is complex and
extremely time consuming. This is attributed to overheads caused by sparse
matrix multiplication, which are sidestepped when training multi-layer
perceptrons (MLPs) with only node features. MLPs, by ignoring graph context,
are simpler and faster to train on graph data; however, they usually sacrifice
prediction accuracy, limiting their applications on graph data. We observe that for most
message passing-based GNNs, we can trivially derive an analog MLP (we call this
a PeerMLP) with an equivalent weight space, simply by giving its trainable
parameters the same shapes, which raises the question: \textbf{\emph{how do GNNs
using weights from a fully trained PeerMLP perform?}} Surprisingly, we find that GNNs
initialized with such weights significantly outperform their PeerMLPs,
motivating us to use PeerMLP training as a precursor initialization step for
GNN training. To this end, we propose an embarrassingly simple, yet hugely
effective initialization method for GNN training acceleration, called MLPInit.
Our extensive experiments on multiple large-scale graph datasets with diverse
GNN architectures validate that MLPInit can accelerate the training of GNNs (up
to 33X speedup on OGB-Products) and often improve prediction performance (e.g.,
up to $7.97\%$ improvement for GraphSAGE across $7$ datasets for node
classification, and up to $17.81\%$ improvement across $4$ datasets for link
prediction on the Hits@10 metric). The code is available at
https://github.com/snap-research/MLPInit-for-GNNs.
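To make the idea concrete, below is a minimal sketch (in PyTorch) of how a PeerMLP can be derived and used for initialization, assuming a small two-layer mean-aggregation GNN for node classification. The names PeerMLP, SimpleGNN, and mlp_init, the architecture, and the training loop are illustrative assumptions, not the authors' implementation; the official code is in the repository linked above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PeerMLP(nn.Module):
    # MLP whose trainable parameters have exactly the same shapes as the GNN below.
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x):                      # node features only, no graph needed
        return self.lin2(F.relu(self.lin1(x)))

class SimpleGNN(nn.Module):
    # Mean-aggregation message-passing GNN; its weight space matches PeerMLP one-to-one.
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x, adj):                 # adj: row-normalized adjacency matrix
        h = F.relu(self.lin1(adj @ x))
        return self.lin2(adj @ h)

def mlp_init(gnn, peer_mlp, x, y, epochs=100, lr=1e-2):
    # Step 1: train the cheap PeerMLP on node features alone (no sparse matmuls).
    opt = torch.optim.Adam(peer_mlp.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(peer_mlp(x), y).backward()
        opt.step()
    # Step 2: copy the weights into the GNN; identical shapes make this a direct transfer.
    gnn.load_state_dict(peer_mlp.state_dict())
    return gnn                                 # then continue normal GNN training

In this sketch, the GNN would resume standard message-passing training from the transferred weights rather than from a random initialization, which is where the reported speedups come from.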
Related papers
- VQGraph: Rethinking Graph Representation Space for Bridging GNNs and
MLPs [97.63412451659826]
VQGraph learns a structure-aware tokenizer on graph data that can encode each node's local substructure as a discrete code.
VQGraph achieves new state-of-the-art performance on GNN-to-MLP distillation in both transductive and inductive settings.
arXiv Detail & Related papers (2023-08-04T02:58:08Z) - Graph Ladling: Shockingly Simple Parallel GNN Training without
Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs.
Scaling GNNs by either deepening or widening suffers from prevalent issues of unhealthy gradients, over-smoothing, and information squashing.
We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs.
arXiv Detail & Related papers (2023-06-18T03:33:46Z) - Graph Neural Networks are Inherently Good Generalizers: Insights by
Bridging GNNs and MLPs [71.93227401463199]
This paper attributes the major source of GNNs' performance gain to their intrinsic capability, by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z) - Teaching Yourself: Graph Self-Distillation on Neighborhood for Node
Classification [42.840122801915996]
We propose a Graph Self-Distillation on Neighborhood (GSDN) framework to reduce the gap between GNNs and MLPs.
GSDN infers 75X faster than existing GNNs and 16X-25X faster than other inference acceleration methods.
arXiv Detail & Related papers (2022-10-05T08:35:34Z) - Graph-less Neural Networks: Teaching Old MLPs New Tricks via
Distillation [34.676755383361005]
Graph-less Neural Networks (GLNNs) have no inference graph dependency.
We show that GLNNs with competitive performance infer faster than GNNs by 146X-273X and faster than other acceleration methods by 14X-27X.
A comprehensive analysis of GLNN shows when and why GLNN can achieve competitive results to GNNs and suggests GLNN as a handy choice for latency-constrained applications.
arXiv Detail & Related papers (2021-10-17T05:16:58Z) - A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z) - GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data.
We present the GPT-GNN framework to initialize GNNs by generative pre-training.
We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)