Training Graph Neural Networks with 1000 Layers
- URL: http://arxiv.org/abs/2106.07476v2
- Date: Thu, 17 Jun 2021 03:26:23 GMT
- Title: Training Graph Neural Networks with 1000 Layers
- Authors: Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun
- Abstract summary: We study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs.
To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude.
- Score: 133.84813995275988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep graph neural networks (GNNs) have achieved excellent results on various
tasks on increasingly large graph datasets with millions of nodes and edges.
However, memory complexity has become a major obstacle when training deep GNNs
for practical applications due to the immense number of nodes, edges, and
intermediate activations. To improve the scalability of GNNs, prior works
propose smart graph sampling or partitioning strategies to train GNNs with a
smaller set of nodes or sub-graphs. In this work, we study reversible
connections, group convolutions, weight tying, and equilibrium models to
advance the memory and parameter efficiency of GNNs. We find that reversible
connections in combination with deep network architectures enable the training
of overparameterized GNNs that significantly outperform existing methods on
multiple datasets. Our models RevGNN-Deep (1001 layers with 80 channels each)
and RevGNN-Wide (448 layers with 224 channels each) were both trained on a
single commodity GPU and achieve an ROC-AUC of $87.74 \pm 0.13$ and $88.24 \pm
0.15$ on the ogbn-proteins dataset. To the best of our knowledge, RevGNN-Deep
is the deepest GNN in the literature by one order of magnitude. Please visit
our project website https://www.deepgcns.org/arch/gnn1000 for more information.
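For a concrete picture of how reversible connections keep activation memory flat as depth grows, here is a minimal PyTorch-style sketch. It is not the authors' implementation: it assumes a dense, row-normalized adjacency matrix and uses a plain two-group reversible residual block, whereas RevGNN uses grouped reversible connections over sparse message passing; all class and variable names are illustrative.

```python
import torch
import torch.nn as nn


class SimpleGraphConv(nn.Module):
    """Bare-bones GNN layer: normalized-adjacency aggregation followed by a linear map."""

    def __init__(self, channels):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.lin = nn.Linear(channels, channels)

    def forward(self, x, adj):
        # adj: dense (N, N) row-normalized adjacency matrix (simplifying assumption).
        return torch.relu(self.lin(self.norm(adj @ x)))


class ReversibleGNNBlock(nn.Module):
    """Two-group reversible residual block: its input can be reconstructed from its
    output, so intermediate activations never need to be stored during training."""

    def __init__(self, channels):
        super().__init__()
        assert channels % 2 == 0
        half = channels // 2
        self.f = SimpleGraphConv(half)
        self.g = SimpleGraphConv(half)

    def forward(self, x, adj):
        x1, x2 = torch.chunk(x, 2, dim=-1)
        y1 = x1 + self.f(x2, adj)
        y2 = x2 + self.g(y1, adj)
        return torch.cat([y1, y2], dim=-1)

    @torch.no_grad()
    def inverse(self, y, adj):
        # Reconstruct the block's input from its output (up to floating-point error).
        y1, y2 = torch.chunk(y, 2, dim=-1)
        x2 = y2 - self.g(y1, adj)
        x1 = y1 - self.f(x2, adj)
        return torch.cat([x1, x2], dim=-1)


if __name__ == "__main__":
    n, channels = 100, 80
    x = torch.randn(n, channels)
    adj = torch.rand(n, n)
    adj = adj / adj.sum(dim=1, keepdim=True)  # row-normalize

    block = ReversibleGNNBlock(channels)
    with torch.no_grad():
        y = block(x, adj)
        x_rec = block.inverse(y, adj)
    print(torch.allclose(x, x_rec, atol=1e-4))  # expected: True
```

Because `inverse` reconstructs a block's input from its output, a memory-efficient training loop (e.g., one built on a custom `torch.autograd.Function`) can recompute activations during the backward pass instead of storing them, so activation memory stays roughly constant in depth. This is the property that makes training 1001-layer models on a single commodity GPU feasible.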
Related papers
- Graph Ladling: Shockingly Simple Parallel GNN Training without
Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs.
Scaling GNNs by deepening or widening suffers from prevalent issues of unhealthy gradients, over-smoothing, and information squashing.
We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs.
arXiv Detail & Related papers (2023-06-18T03:33:46Z) - Do Not Train It: A Linear Neural Architecture Search of Graph Neural
Networks [15.823247346294089]
We develop a novel NAS method for GNNs, namely neural architecture coding (NAC).
Our approach achieves state-of-the-art performance and is up to $200\times$ faster and $18.8\%$ more accurate than strong baselines.
arXiv Detail & Related papers (2023-05-23T13:44:04Z) - Graph Neural Network for Accurate and Low-complexity SAR ATR [2.9766397696234996]
We propose a graph neural network (GNN) model to achieve accurate and low-latency SAR ATR.
The proposed GNN model has low computational complexity and achieves comparably high accuracy.
Compared with state-of-the-art CNNs, the proposed GNN model has only 1/3000 of the computation cost and 1/80 of the model size.
arXiv Detail & Related papers (2023-05-11T20:17:41Z) - Distributed Graph Neural Network Training: A Survey [51.77035975191926]
Graph neural networks (GNNs) are deep learning models that are trained on graphs and have been successfully applied in various domains.
Despite the effectiveness of GNNs, it is still challenging for GNNs to efficiently scale to large graphs.
As a remedy, distributed computing has become a promising solution for training large-scale GNNs.
arXiv Detail & Related papers (2022-11-01T01:57:00Z) - Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors
to Sequences [55.329402218608365]
We propose Neighbor2Seq, which transforms the hierarchical neighborhood of each node into a sequence.
We evaluate our method on a massive graph with more than 111 million nodes and 1.6 billion edges.
Results show that our proposed method is scalable to massive graphs and achieves superior performance across massive and medium-scale graphs.
arXiv Detail & Related papers (2022-02-07T16:38:36Z) - Network In Graph Neural Network [9.951298152023691]
We present a model-agnostic methodology that allows arbitrary GNN models to increase their model capacity by making the model deeper.
Instead of adding or widening GNN layers, NGNN deepens a GNN model by inserting non-linear feedforward neural network layer(s) within each GNN layer (a rough sketch follows the related-papers list below).
arXiv Detail & Related papers (2021-11-23T03:58:56Z) - Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study [100.27567794045045]
Training deep graph neural networks (GNNs) is notoriously hard.
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
arXiv Detail & Related papers (2021-08-24T05:00:37Z) - AdaGNN: A multi-modal latent representation meta-learner for GNNs based
on AdaBoosting [0.38073142980733]
Graph Neural Networks (GNNs) focus on extracting intrinsic network features.
We propose a boosting-based meta-learner for GNNs.
AdaGNN performs exceptionally well for applications with rich and diverse node neighborhood information.
arXiv Detail & Related papers (2021-08-14T03:07:26Z) - A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z) - Reducing Communication in Graph Neural Network Training [0.0]
Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the naturally sparse connectivity information of the data.
We introduce a family of parallel algorithms for training GNNs and show that they can asymptotically reduce communication compared to previous parallel GNN training methods.
arXiv Detail & Related papers (2020-05-07T07:45:09Z)
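As referenced in the Network In Graph Neural Network entry above, NGNN deepens a model by placing a small non-linear feedforward network inside each GNN layer rather than stacking more layers. The snippet below is only a rough, assumption-laden illustration of that idea, using a dense adjacency matrix and hypothetical names; it is not the paper's model-agnostic wrapper.

```python
import torch
import torch.nn as nn


class NGNNStyleLayer(nn.Module):
    """Illustrative layer: a plain graph convolution followed by a small non-linear MLP
    inserted inside the same layer, instead of stacking additional GNN layers."""

    def __init__(self, channels, hidden):
        super().__init__()
        self.lin = nn.Linear(channels, channels)  # usual per-layer transform
        self.inner_mlp = nn.Sequential(           # the "network in" the GNN layer
            nn.Linear(channels, hidden),
            nn.ReLU(),
            nn.Linear(hidden, channels),
        )

    def forward(self, x, adj):
        # adj: dense (N, N) row-normalized adjacency matrix (simplifying assumption).
        h = torch.relu(self.lin(adj @ x))  # neighborhood aggregation + transform
        return self.inner_mlp(h)           # extra non-linear capacity within the layer
```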
This list is automatically generated from the titles and abstracts of the papers in this site.