Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching
- URL: http://arxiv.org/abs/2208.09151v1
- Date: Fri, 19 Aug 2022 04:57:18 GMT
- Title: Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching
- Authors: Yeonhong Park, Sunhong Min, Jae W. Lee
- Abstract summary: Graph Neural Networks (GNNs) have been receiving a spotlight as a powerful tool that can effectively serve various graph tasks on structured data.
As the size of real-world graphs continues to scale, the GNN training system faces a scalability challenge.
We propose Ginex, the first SSD-based GNN training system that can process billion-scale graph datasets on a single machine.
- Score: 3.0479527348064197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Graph Neural Networks (GNNs) have been receiving a spotlight as a
powerful tool that can effectively serve various inference tasks on graph
structured data. As the size of real-world graphs continues to scale, the GNN
training system faces a scalability challenge. Distributed training is a
popular approach to address this challenge by scaling out CPU nodes. However,
not much attention has been paid to disk-based GNN training, which can scale up
the single-node system in a more cost-effective manner by leveraging
high-performance storage devices like NVMe SSDs. We observe that the data
movement between the main memory and the disk is the primary bottleneck in the
SSD-based training system, and that the conventional GNN training pipeline is
sub-optimal without taking this overhead into account. Thus, we propose Ginex,
the first SSD-based GNN training system that can process billion-scale graph
datasets on a single machine. Inspired by the inspector-executor execution
model in compiler optimization, Ginex restructures the GNN training pipeline by
separating sample and gather stages. This separation enables Ginex to realize a
provably optimal replacement algorithm, known as Belady's algorithm, for
caching feature vectors in memory, which account for the dominant portion of
I/O accesses. According to our evaluation with four billion-scale graph
datasets, Ginex achieves 2.11x higher training throughput on average (up to
2.67x) than the SSD-extended PyTorch Geometric.
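Because the sampling stage finishes before the gather stage begins, the exact sequence of feature-vector accesses is known in advance, which is what makes Belady's algorithm applicable. The snippet below is a minimal, illustrative Python sketch of Belady's (MIN) replacement over such a known access trace; the function and variable names are hypothetical and this is not Ginex's actual implementation.

```python
def belady_cache_sim(access_seq, cache_size):
    """Simulate Belady's (MIN) replacement over a fully known access trace.

    access_seq: node IDs whose feature vectors will be gathered, in the
                order produced by an already-completed sampling stage.
    cache_size: number of feature vectors that fit in the in-memory cache.
    Returns (hits, misses); every miss corresponds to an SSD read.
    """
    assert cache_size > 0
    INF = float("inf")

    # For each position i, precompute the next position at which the same
    # node is accessed again (INF if it never reappears).
    next_use = [INF] * len(access_seq)
    last_seen = {}
    for i in range(len(access_seq) - 1, -1, -1):
        next_use[i] = last_seen.get(access_seq[i], INF)
        last_seen[access_seq[i]] = i

    cache = {}  # node ID -> position of that node's next access
    hits = 0
    for i, node in enumerate(access_seq):
        if node in cache:
            hits += 1
        elif len(cache) >= cache_size:
            # Evict the cached node whose next access is farthest away.
            # (A priority queue would avoid this linear scan in practice.)
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[node] = next_use[i]
    return hits, len(access_seq) - hits


if __name__ == "__main__":
    # Toy trace: with a 2-entry cache, Belady serves two of the six
    # accesses from memory; the rest would be read from the SSD.
    print(belady_cache_sim([3, 1, 3, 2, 1, 3], cache_size=2))
```

No online policy (LRU, LFU, etc.) can beat this schedule, which is why precomputing the access sequence before gathering features pays off.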
Related papers
- Reducing Memory Contention and I/O Congestion for Disk-based GNN Training [6.492879435794228]
Graph neural networks (GNNs) have gained wide popularity. Large graphs with high-dimensional features have become common, and training GNNs on them is non-trivial.
Given a gigantic graph, even sample-based GNN training cannot work efficiently, since it is difficult to keep the graph's entire data in memory during training.
Memory management and I/O are hence critical for efficient disk-based training.
arXiv Detail & Related papers (2024-06-20T04:24:51Z)
- DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training [12.945647145403438]
Graph neural networks (GNNs) are machine learning models specialized for graph data and widely used in many applications.
DiskGNN achieves high I/O efficiency and thus fast training without hurting model accuracy.
We compare DiskGNN with Ginex and MariusGNN, which are state-of-the-art systems for out-of-core GNN training.
arXiv Detail & Related papers (2024-05-08T17:27:11Z)
- CATGNN: Cost-Efficient and Scalable Distributed Training for Graph Neural Networks [7.321893519281194]
Existing distributed systems load the entire graph in memory for graph partitioning.
We propose CATGNN, a cost-efficient and scalable distributed GNN training system.
We also propose a novel streaming partitioning algorithm named SPRING for distributed GNN training.
arXiv Detail & Related papers (2024-04-02T20:55:39Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training of Graph Neural Networks (GNNs) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
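The diffusion approximation behind PPRGo is sparse personalized PageRank. As a rough illustration of that idea only (not PPRGo's actual implementation), the sketch below runs a standard forward-push approximation from a single seed node; `adj`, `alpha`, and `eps` are assumed, illustrative inputs.

```python
from collections import defaultdict, deque


def approx_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Forward-push approximation of personalized PageRank from `seed`.

    adj:   dict mapping each node to a list of its neighbors.
    alpha: teleport (restart) probability.
    eps:   push threshold; smaller values give a denser, more accurate vector.
    Returns a sparse dict {node: approximate PPR score}.
    """
    p = defaultdict(float)   # PPR mass settled at each node
    r = defaultdict(float)   # residual mass still waiting to be pushed
    r[seed] = 1.0
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg = max(len(adj.get(u, [])), 1)
        if r[u] < eps * deg:
            continue  # residual too small to be worth pushing
        mass = r[u]
        r[u] = 0.0
        p[u] += alpha * mass
        share = (1.0 - alpha) * mass / deg
        for v in adj.get(u, []):
            r[v] += share
            if r[v] >= eps * max(len(adj.get(v, [])), 1):
                queue.append(v)
    return dict(p)
```

Roughly speaking, PPRGo precomputes sparsified vectors of this kind and uses them to aggregate node information, avoiding recursive message passing at training time.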