Accurate, Efficient and Scalable Training of Graph Neural Networks
- URL: http://arxiv.org/abs/2010.03166v1
- Date: Mon, 5 Oct 2020 22:06:23 GMT
- Title: Accurate, Efficient and Scalable Training of Graph Neural Networks
- Authors: Hanqing Zeng and Hongkuan Zhou and Ajitesh Srivastava and Rajgopal
Kannan and Viktor Prasanna
- Abstract summary: Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs.
It is still challenging to perform training in an efficient and scalable way.
We propose a novel parallel training framework that reduces training workload by orders of magnitude compared with state-of-the-art minibatch methods.
- Score: 9.569918335816963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) are powerful deep learning models to generate
node embeddings on graphs. When applying deep GNNs on large graphs, it is still
challenging to perform training in an efficient and scalable way. We propose a
novel parallel training framework. Through sampling small subgraphs as
minibatches, we reduce training workload by orders of magnitude compared with
state-of-the-art minibatch methods. We then parallelize the key computation
steps on tightly-coupled shared memory systems. For graph sampling, we exploit
parallelism within and across sampler instances, and propose an efficient data
structure supporting concurrent accesses from samplers. The parallel sampler
theoretically achieves near-linear speedup with respect to number of processing
units. For feature propagation within subgraphs, we improve cache utilization
and reduce DRAM traffic by data partitioning. Our partitioning is a
2-approximation strategy for minimizing the communication cost compared to the
optimal. We further develop a runtime scheduler to reorder the training
operations and adjust the minibatch subgraphs to improve parallel performance.
Finally, we generalize the above parallelization strategies to support multiple
types of GNN models and graph samplers. The proposed training outperforms the
state-of-the-art in scalability, efficiency and accuracy simultaneously. On a
40-core Xeon platform, we achieve 60x speedup (with AVX) in the sampling step
and 20x speedup in the feature propagation step, compared to the serial
implementation. Our algorithm enables fast training of deeper GNNs, as
demonstrated by orders of magnitude speedup compared to the Tensorflow
implementation. We open-source our code at
https://github.com/GraphSAINT/GraphSAINT.
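The training scheme described in the abstract, sampling small subgraphs as minibatches and running the GNN layers only on each sampled subgraph, can be sketched as follows. This is a minimal NumPy illustration under assumed names (`sample_subgraph`, `gnn_layer` are hypothetical, not the GraphSAINT API), using uniform node sampling where the paper's samplers are random-walk or edge based:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 100 nodes, dense adjacency for clarity (real systems use sparse CSR).
n, feat_dim, hidden = 100, 16, 8
A = (rng.random((n, n)) < 0.05).astype(float)
X = rng.standard_normal((n, feat_dim))
W = rng.standard_normal((feat_dim, hidden))

def sample_subgraph(num_roots=20):
    """Node-induced subgraph sampler (uniform sampling for illustration only)."""
    nodes = rng.choice(n, size=num_roots, replace=False)
    return nodes, A[np.ix_(nodes, nodes)]

def gnn_layer(A_sub, X_sub, W):
    # One mean-aggregation GNN layer restricted to the sampled subgraph.
    deg = A_sub.sum(axis=1, keepdims=True) + 1e-9
    return np.maximum((A_sub / deg) @ X_sub @ W, 0.0)  # ReLU

# Minibatch loop: the full graph is touched only through small subgraphs,
# which is where the claimed workload reduction comes from.
for step in range(3):
    nodes, A_sub = sample_subgraph()
    H = gnn_layer(A_sub, X[nodes], W)
```

Because every layer operates on the same small subgraph, there is no neighborhood explosion across depth, which is what makes deeper GNNs tractable in this scheme.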
Related papers
- Distributed Matrix-Based Sampling for Graph Neural Network Training [0.0]
We propose a matrix-based bulk sampling approach that expresses sampling as a sparse matrix multiplication (SpGEMM) and samples multiple minibatches at once.
When the input graph topology does not fit on a single device, our method distributes the graph and uses communication-avoiding SpGEMM algorithms to scale GNN minibatch sampling.
In addition to new methods for sampling, we introduce a pipeline that uses our matrix-based bulk sampling approach to provide end-to-end training results.
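The core idea of expressing sampling as a sparse matrix product can be shown with a small dense example. This is a hedged sketch (the selection matrix `S` and variable names are illustrative; the paper's implementation uses distributed, communication-avoiding SpGEMM on sparse matrices):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 8
A = (rng.random((n, n)) < 0.4).astype(float)  # adjacency matrix

# One-hot "sampling" matrix S: row i selects the i-th minibatch vertex.
batch = np.array([1, 4, 6])
S = np.zeros((len(batch), n))
S[np.arange(len(batch)), batch] = 1.0

# A single matrix product yields the neighbor rows of the whole minibatch
# at once -- the bulk-sampling view of minibatch construction.
neighbors = S @ A
```

Stacking many such `S` blocks lets one product materialize several minibatches simultaneously, which is what "bulk sampling" refers to.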
arXiv Detail & Related papers (2023-11-06T06:40:43Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
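"One-time message passing" here means neighbor aggregation happens once as preprocessing, after which training reduces to a plain MLP on regular-shaped tensors. A minimal homogeneous-graph sketch under assumed simplifications (the random projection mirrors the "Rp" idea; the actual RpHGNN handles multiple relation types):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 50, 16, 4  # nodes, feature dim, projection dim

A = (rng.random((n, n)) < 0.1).astype(float)
X = rng.standard_normal((n, d))

# Precompute propagated features once; no message passing during training.
deg = A.sum(axis=1, keepdims=True) + 1e-9
X_agg = (A / deg) @ X

# A random projection compresses the aggregate into a small, fixed-shape
# tensor (hypothetical simplification of the paper's scheme).
R = rng.standard_normal((d, k)) / np.sqrt(k)
X_small = X_agg @ R
```

Training then only ever sees `X_small`, so each epoch costs the same as training a small MLP regardless of graph size.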
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism [6.3568605707961]
Mini-batch training is commonly used to train Graph Neural Networks (GNNs) on large graphs.
In this paper, we introduce a hybrid parallel mini-batch training paradigm called split parallelism.
We show that split parallelism outperforms state-of-the-art mini-batch training systems like DGL, Quiver, and $P3$.
arXiv Detail & Related papers (2023-03-24T03:28:05Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
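A neighborhood sampler of the kind referenced above can be sketched in a few lines. This is a toy uniform sampler under assumed names (`adj`, `sample_neighbors` are hypothetical); the paper's performance-engineered sampler is far more elaborate:

```python
import random

# Toy adjacency list.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}

def sample_neighbors(node, fanout):
    """Uniformly sample up to `fanout` neighbors of `node` without replacement."""
    nbrs = adj[node]
    if len(nbrs) <= fanout:
        return list(nbrs)
    return random.sample(nbrs, fanout)

sampled = sample_neighbors(0, 2)
```

Capping the fanout per layer bounds the computation per minibatch, at the cost of a stochastic estimate of the full aggregation; the paper's empirical analysis is about showing this estimate is accurate enough even at inference time.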
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S).
Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks.
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
- GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms [1.2183405753834562]
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs.
It is challenging to accelerate training of GCNs due to substantial and irregular data communication.
We design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems.
arXiv Detail & Related papers (2019-12-31T21:19:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.