BatchGNN: Efficient CPU-Based Distributed GNN Training on Very Large
Graphs
- URL: http://arxiv.org/abs/2306.13814v1
- Date: Fri, 23 Jun 2023 23:25:34 GMT
- Title: BatchGNN: Efficient CPU-Based Distributed GNN Training on Very Large
Graphs
- Authors: Loc Hoang, Rita Brugarolas Brufau, Ke Ding, Bo Wu
- Abstract summary: BatchGNN is a distributed CPU system that showcases techniques to efficiently train GNNs on terabyte-sized graphs.
BatchGNN achieves an average $3\times$ speedup over DistDGL on three GNN models trained on OGBN graphs.
- Score: 2.984386665258243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present BatchGNN, a distributed CPU system that showcases techniques that
can be used to efficiently train GNNs on terabyte-sized graphs. It reduces
communication overhead with macrobatching in which multiple minibatches'
subgraph sampling and feature fetching are batched into one communication relay
to reduce redundant feature fetches when input features are static. BatchGNN
provides integrated graph partitioning and native GNN layer implementations to
improve runtime, and it can cache aggregated input features to further reduce
sampling overhead. BatchGNN achieves an average $3\times$ speedup over DistDGL
on three GNN models trained on OGBN graphs, outperforms the runtimes reported
by distributed GPU systems $P^3$ and DistDGLv2, and scales to a terabyte-sized
graph.
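The macrobatching technique described in the abstract can be illustrated with a short sketch. The code below is not BatchGNN's actual API; it is a minimal Python illustration, assuming hypothetical helpers `sample_subgraph` and `fetch_remote_features`, of how several minibatches' sampled node sets can be merged so each remote node's static features are fetched only once per communication round.

```python
# Hedged sketch of macrobatching: several minibatches' feature-fetch requests
# are merged into one communication round, and overlapping neighborhoods are
# deduplicated so each static input feature is fetched at most once.
# `sample_subgraph` and `fetch_remote_features` are hypothetical stand-ins for
# a distributed sampler and a remote feature store; they are not BatchGNN APIs.
from typing import Callable, Dict, List, Sequence, Set

def build_macrobatch(
    seed_batches: Sequence[Sequence[int]],                 # seed nodes of each minibatch
    sample_subgraph: Callable[[Sequence[int]], Set[int]],  # returns sampled node ids
    fetch_remote_features: Callable[[List[int]], Dict[int, List[float]]],
) -> List[Dict[int, List[float]]]:
    # 1. Sample every minibatch's subgraph (node ids only, for brevity).
    sampled = [sample_subgraph(seeds) for seeds in seed_batches]

    # 2. Union the node ids: overlapping neighborhoods are requested once.
    unique_nodes = sorted(set().union(*sampled))

    # 3. A single communication round fetches static features for the union.
    feature_cache = fetch_remote_features(unique_nodes)

    # 4. Each minibatch slices its own features out of the shared response.
    return [{nid: feature_cache[nid] for nid in nodes} for nodes in sampled]
```

Compared with issuing one request per minibatch, the single merged request amortizes communication latency and avoids refetching features shared across minibatches, which is the redundancy the abstract refers to when input features are static.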
Related papers
- LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme [12.64360444043247]
Graph Neural Networks (GNNs) are widely used today in recommendation systems, fraud detection, and node/link classification tasks.
To address limited memory capacities, traditional GNN training approaches use graph partitioning and sharding techniques.
We propose the Large-scale Storage-based Multi-GPU GNN framework (LSM-GNN).
LSM-GNN incorporates a hybrid eviction policy that intelligently manages cache space by using both static and dynamic node information.
arXiv Detail & Related papers (2024-07-21T20:41:39Z)
- Spatio-Spectral Graph Neural Networks [50.277959544420455]
We propose Spatio-Spectral Graph Neural Networks (S$^2$GNNs).
S$^2$GNNs combine spatially and spectrally parametrized graph filters.
We show that S$^2$GNNs vanquish over-squashing and yield strictly tighter approximation-theoretic error bounds than MPGNNs.
arXiv Detail & Related papers (2024-05-29T14:28:08Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness [4.8412870364335925]
Quiver is a distributed GPU-based GNN serving system with low latency and high throughput.
We show that Quiver achieves up to 35 times lower latency with an 8 times higher throughput compared to state-of-the-art GNN approaches.
arXiv Detail & Related papers (2023-05-18T10:34:23Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- DistGNN-MB: Distributed Large-Scale Graph Neural Network Training on x86 via Minibatch Sampling [3.518762870118332]
DistGNN-MB trains GraphSAGE 5.2x faster than the widely-used DistDGL.
At this scale, DistGNN-MB trains GraphSAGE and GAT 10x and 17.2x faster, respectively, as compute nodes scale from 2 to 32.
arXiv Detail & Related papers (2022-11-11T18:07:33Z)
- GNNIE: GNN Inference Engine with Load-balancing and Graph-Specific Caching [2.654276707313136]
GNNIE is an accelerator designed to run a broad range of Graph Neural Networks (GNNs).
It tackles workload imbalance by (i) splitting node feature operands into blocks, (ii) reordering and redistributing computations, and (iii) using a flexible MAC architecture with low communication overheads among the processing elements.
GNNIE achieves average speedups of over 8890x over a CPU and 295x over a GPU across multiple datasets on graph attention networks (GATs), graph convolutional networks (GCNs), GraphSAGE, GINConv, and DiffPool.
arXiv Detail & Related papers (2021-05-21T20:07:14Z)
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average 3712$\times$ speedup with 1301.25$\times$ energy reduction on CPU, and 35.4$\times$ speedup with 17.66$\times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
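To make the UGS entry above concrete, here is a small, hedged sketch of joint sparsification. UGS itself learns differentiable masks during training; the simple magnitude-based threshold below is a stand-in so that the simultaneous pruning of the adjacency matrix and the model weights fits in a few lines. All names, shapes, and keep fractions are illustrative, not the authors' code.

```python
# Illustrative sketch of UGS-style joint sparsification (not the authors' code).
# UGS learns masks during training; here a magnitude criterion stands in.
# The returned pair (sparse graph, sparse weights) is a graph-lottery-ticket candidate.
import numpy as np

def magnitude_mask(x: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Binary mask keeping the largest-magnitude `keep_fraction` of entries."""
    flat = np.abs(x).ravel()
    k = max(1, int(keep_fraction * flat.size))
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return (np.abs(x) >= threshold).astype(x.dtype)

def prune_graph_and_weights(adj, weights, graph_keep=0.8, weight_keep=0.5):
    """Simultaneously sparsify the adjacency matrix and the model weights."""
    return (adj * magnitude_mask(adj, graph_keep),
            weights * magnitude_mask(weights, weight_keep))

# Toy usage: a random 4-node graph and one 4x8 GNN weight matrix.
rng = np.random.default_rng(0)
sparse_adj, sparse_w = prune_graph_and_weights(rng.random((4, 4)),
                                               rng.standard_normal((4, 8)))
```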
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.