Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study
- URL: http://arxiv.org/abs/2108.10521v1
- Date: Tue, 24 Aug 2021 05:00:37 GMT
- Title: Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive
Benchmark Study
- Authors: Tianlong Chen, Kaixiong Zhou, Keyu Duan, Wenqing Zheng, Peihao Wang,
Xia Hu, Zhangyang Wang
- Abstract summary: Training deep graph neural networks (GNNs) is notoriously hard.
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
- Score: 100.27567794045045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep graph neural networks (GNNs) is notoriously hard. Besides the
standard plights in training deep architectures such as vanishing gradients and
overfitting, the training of deep GNNs also uniquely suffers from
over-smoothing, information squashing, and so on, which limit their potential
power on large-scale graphs. Although numerous efforts have been proposed to address
these limitations, such as various forms of skip connections, graph
normalization, and random dropping, it is difficult to disentangle the
advantages brought by a deep GNN architecture from those "tricks" necessary to
train such an architecture. Moreover, the lack of a standardized benchmark with
fair and consistent experimental settings poses an almost insurmountable
obstacle to gauging the effectiveness of new mechanisms. In view of these challenges, we
present the first fair and reproducible benchmark dedicated to assessing the
"tricks" of training deep GNNs. We categorize existing approaches, investigate
their hyperparameter sensitivity, and unify the basic configuration.
Comprehensive evaluations are then conducted on tens of representative graph
datasets including the recent large-scale Open Graph Benchmark (OGB), with
diverse deep GNN backbones. Based on these synergistic studies, we identify a
combination of superior training tricks that leads to new state-of-the-art
results for deep GCNs across multiple representative graph datasets. We
demonstrate that an organic combination of initial connection, identity mapping,
and group and batch normalization delivers the best performance on large datasets.
Experiments also reveal a number of "surprises" when combining or scaling up
some of the tricks. All code is available at
https://github.com/VITA-Group/Deep_GCN_Benchmarking.
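The winning combination reported above (initial connection, identity mapping, and normalization) can be sketched compactly. The following is a minimal, self-contained PyTorch illustration, not the authors' released code: each layer mixes the layer-0 representation back in (initial connection), blends the linear transform with the identity (identity mapping), and applies batch normalization; the benchmark's group normalization (which clusters nodes) is omitted for brevity, and the hyperparameter names alpha and beta are assumptions.

import torch
import torch.nn as nn

class DeepGCNLayer(nn.Module):
    # One illustrative deep-GCN layer: initial connection + identity mapping + batch norm.
    def __init__(self, dim: int, alpha: float = 0.1, beta: float = 0.5):
        super().__init__()
        self.weight = nn.Linear(dim, dim, bias=False)
        self.norm = nn.BatchNorm1d(dim)  # batch normalization over node features
        self.alpha = alpha               # initial-connection strength (assumed name)
        self.beta = beta                 # identity-mapping strength (assumed name)

    def forward(self, h, h0, adj_norm):
        # Propagate over the normalized adjacency, then mix in the layer-0 features.
        support = (1 - self.alpha) * (adj_norm @ h) + self.alpha * h0
        # Identity mapping: blend the untransformed features with the linear transform.
        out = (1 - self.beta) * support + self.beta * self.weight(support)
        return torch.relu(self.norm(out))

if __name__ == "__main__":
    num_nodes, dim = 5, 16
    adj_norm = torch.eye(num_nodes)       # stand-in for D^-1/2 (A + I) D^-1/2
    h0 = torch.randn(num_nodes, dim)
    layer = DeepGCNLayer(dim)
    print(layer(h0, h0, adj_norm).shape)  # torch.Size([5, 16])

Stacking many such layers is where the initial connection and identity mapping matter; with alpha = beta = 0 the layer reduces to a plain GCN layer.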
Related papers
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and
Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Meta Propagation Networks for Graph Few-shot Semi-supervised Learning [39.96930762034581]
We propose a novel network architecture equipped with a novel meta-learning algorithm to solve this problem.
In essence, our framework Meta-PN infers high-quality pseudo labels on unlabeled nodes via a meta-learned label propagation strategy.
Our approach offers easy and substantial performance gains compared to existing techniques on various benchmark datasets.
arXiv Detail & Related papers (2021-12-18T00:11:56Z)
- Evaluating Deep Graph Neural Networks [27.902290204531326]
Graph Neural Networks (GNNs) have already been widely applied in various graph mining tasks.
They suffer from the shallow-architecture issue, which is the key impediment to further improving model performance.
We present Deep Graph Multi-Layer Perceptron (DGMLP), a powerful approach that helps guide deep GNN designs.
arXiv Detail & Related papers (2021-08-02T14:55:10Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Analyzing the Performance of Graph Neural Networks with Pipe Parallelism [2.269587850533721]
We focus on Graph Neural Networks (GNNs) that have found great success in tasks such as node or edge classification and link prediction.
New approaches for processing larger networks are needed to advance graph techniques.
We study how GNNs could be parallelized using existing tools and frameworks that are known to be successful in the deep learning community.
arXiv Detail & Related papers (2020-12-20T04:20:38Z)
- Learning to Drop: Robust Graph Neural Network via Topological Denoising [50.81722989898142]
We propose PTDNet, a parameterized topological denoising network, to improve the robustness and generalization performance of Graph Neural Networks (GNNs).
PTDNet prunes task-irrelevant edges by penalizing the number of edges in the sparsified graph with parameterized networks.
We show that PTDNet can significantly improve the performance of GNNs, with larger gains on noisier datasets (a minimal sketch of this edge-masking idea follows the related-papers list below).
arXiv Detail & Related papers (2020-11-13T18:53:21Z)
- Node Masking: Making Graph Neural Networks Generalize and Scale Better [71.51292866945471]
Graph Neural Networks (GNNs) have received a lot of interest in recent times.
In this paper, we utilize theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs.
We introduce a simple concept, Node Masking, that allows them to generalize and scale better.
arXiv Detail & Related papers (2020-01-17T06:26:40Z)
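To make the edge-denoising idea in the PTDNet entry above concrete, here is a rough, hypothetical sketch: a small parameterized network scores each edge from its endpoint features, and a penalty on the soft count of retained edges encourages sparsity. The module name, architecture, and penalty form are assumptions for illustration, not the paper's implementation.

import torch
import torch.nn as nn

class EdgeDenoiser(nn.Module):
    # Scores each edge from its endpoint features and returns soft keep-probabilities.
    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x, edge_index):
        src, dst = edge_index                       # edge_index: (2, num_edges)
        pair = torch.cat([x[src], x[dst]], dim=-1)  # endpoint features per edge
        keep_prob = torch.sigmoid(self.scorer(pair)).squeeze(-1)
        sparsity_penalty = keep_prob.sum()          # soft count of retained edges
        return keep_prob, sparsity_penalty

if __name__ == "__main__":
    x = torch.randn(6, 8)                                    # 6 nodes, 8 features
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])  # 4 directed edges
    denoiser = EdgeDenoiser(dim=8)
    keep_prob, penalty = denoiser(x, edge_index)
    print(keep_prob.shape, float(penalty))

Downstream, a GNN would weight each message by keep_prob and add a scaled sparsity_penalty to the task loss; hard pruning can threshold keep_prob after training.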
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.