Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study
- URL: http://arxiv.org/abs/2108.10521v1
- Date: Tue, 24 Aug 2021 05:00:37 GMT
- Title: Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study
- Authors: Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang,
Xia Hu, Zhangyang Wang
- Abstract summary: Training deep graph neural networks (GNNs) is notoriously hard.
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
- Score: 100.27567794045045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep graph neural networks (GNNs) is notoriously hard. Besides the
standard plights in training deep architectures such as vanishing gradients and
overfitting, the training of deep GNNs also uniquely suffers from
over-smoothing, information squashing, and so on, which limit their potential
power on large-scale graphs. Although numerous efforts have been proposed to address
these limitations, such as various forms of skip connections, graph
normalization, and random dropping, it is difficult to disentangle the
advantages brought by a deep GNN architecture from those "tricks" necessary to
train such an architecture. Moreover, the lack of a standardized benchmark with
fair and consistent experimental settings poses an almost insurmountable
obstacle to gauging the effectiveness of new mechanisms. In view of these challenges, we
present the first fair and reproducible benchmark dedicated to assessing the
"tricks" of training deep GNNs. We categorize existing approaches, investigate
their hyperparameter sensitivity, and unify the basic configuration.
Comprehensive evaluations are then conducted on tens of representative graph
datasets including the recent large-scale Open Graph Benchmark (OGB), with
diverse deep GNN backbones. Based on these synergistic studies, we identify a
combination of superior training tricks that leads to new state-of-the-art
results for deep GCNs across multiple representative graph datasets. We
demonstrate that an organic combination of initial connection, identity mapping,
and group and batch normalization delivers the best performance on large datasets.
Experiments also reveal a number of "surprises" when combining or scaling up
some of the tricks. All code is available at
https://github.com/VITA-Group/Deep_GCN_Benchmarking.
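The winning combination reported above (initial connection, identity mapping, and normalization) can be sketched compactly. The following is a minimal, self-contained PyTorch illustration, not the authors' released code: each layer mixes the layer-0 representation back in (initial connection), blends the linear transform with the identity (identity mapping), and applies batch normalization; the benchmark's group normalization (which clusters nodes) is omitted for brevity, and the hyperparameter names alpha and beta are assumptions.

import torch
import torch.nn as nn

class DeepGCNLayer(nn.Module):
    # One illustrative deep-GCN layer: initial connection + identity mapping + batch norm.
    def __init__(self, dim: int, alpha: float = 0.1, beta: float = 0.5):
        super().__init__()
        self.weight = nn.Linear(dim, dim, bias=False)
        self.norm = nn.BatchNorm1d(dim)  # batch normalization over node features
        self.alpha = alpha               # initial-connection strength (assumed name)
        self.beta = beta                 # identity-mapping strength (assumed name)

    def forward(self, h, h0, adj_norm):
        # Propagate over the normalized adjacency, then mix in the layer-0 features.
        support = (1 - self.alpha) * (adj_norm @ h) + self.alpha * h0
        # Identity mapping: blend the untransformed features with the linear transform.
        out = (1 - self.beta) * support + self.beta * self.weight(support)
        return torch.relu(self.norm(out))

if __name__ == "__main__":
    num_nodes, dim = 5, 16
    adj_norm = torch.eye(num_nodes)       # stand-in for D^-1/2 (A + I) D^-1/2
    h0 = torch.randn(num_nodes, dim)
    layer = DeepGCNLayer(dim)
    print(layer(h0, h0, adj_norm).shape)  # torch.Size([5, 16])

Stacking many such layers is where the initial connection and identity mapping matter; with alpha = beta = 0 the layer reduces to a plain GCN layer.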
Related papers
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Meta Propagation Networks for Graph Few-shot Semi-supervised Learning [39.96930762034581]
We propose a novel network architecture equipped with a novel meta-learning algorithm to solve this problem.
In essence, our framework Meta-PN infers high-quality pseudo labels on unlabeled nodes via a meta-learned label propagation strategy.
Our approach offers easy and substantial performance gains compared to existing techniques on various benchmark datasets.
arXiv Detail & Related papers (2021-12-18T00:11:56Z)
- Evaluating Deep Graph Neural Networks [27.902290204531326]
Graph Neural Networks (GNNs) have already been widely applied in various graph mining tasks.
They suffer from the shallow-architecture issue, which is the key impediment to further improving model performance.
We present Deep Graph Multi-Layer Perceptron (DGMLP), a powerful approach that helps guide deep GNN designs.
arXiv Detail & Related papers (2021-08-02T14:55:10Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Analyzing the Performance of Graph Neural Networks with Pipe Parallelism [2.269587850533721]
We focus on Graph Neural Networks (GNNs) that have found great success in tasks such as node or edge classification and link prediction.
New approaches for processing larger networks are needed to advance graph techniques.
We study how GNNs could be parallelized using existing tools and frameworks that are known to be successful in the deep learning community.
arXiv Detail & Related papers (2020-12-20T04:20:38Z)
- Learning to Drop: Robust Graph Neural Network via Topological Denoising [50.81722989898142]
We propose PTDNet, a parameterized topological denoising network, to improve the robustness and generalization performance of Graph Neural Networks (GNNs).
PTDNet prunes task-irrelevant edges by penalizing the number of edges in the sparsified graph with parameterized networks.
We show that PTDNet can significantly improve the performance of GNNs, with larger gains on noisier datasets (a minimal sketch of this edge-masking idea follows the related-papers list below).
arXiv Detail & Related papers (2020-11-13T18:53:21Z)
- Node Masking: Making Graph Neural Networks Generalize and Scale Better [71.51292866945471]
Graph Neural Networks (GNNs) have received a lot of interest in recent times.
In this paper, we utilize theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs.
We introduce a simple concept, Node Masking, that allows them to generalize and scale better.
arXiv Detail & Related papers (2020-01-17T06:26:40Z)
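To make the edge-denoising idea in the PTDNet entry above concrete, here is a rough, hypothetical sketch: a small parameterized network scores each edge from its endpoint features, and a penalty on the soft count of retained edges encourages sparsity. The module name, architecture, and penalty form are assumptions for illustration, not the paper's implementation.

import torch
import torch.nn as nn

class EdgeDenoiser(nn.Module):
    # Scores each edge from its endpoint features and returns soft keep-probabilities.
    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x, edge_index):
        src, dst = edge_index                       # edge_index: (2, num_edges)
        pair = torch.cat([x[src], x[dst]], dim=-1)  # endpoint features per edge
        keep_prob = torch.sigmoid(self.scorer(pair)).squeeze(-1)
        sparsity_penalty = keep_prob.sum()          # soft count of retained edges
        return keep_prob, sparsity_penalty

if __name__ == "__main__":
    x = torch.randn(6, 8)                                    # 6 nodes, 8 features
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])  # 4 directed edges
    denoiser = EdgeDenoiser(dim=8)
    keep_prob, penalty = denoiser(x, edge_index)
    print(keep_prob.shape, float(penalty))

Downstream, a GNN would weight each message by keep_prob and add a scaled sparsity_penalty to the task loss; hard pruning can threshold keep_prob after training.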
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.