FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training
- URL: http://arxiv.org/abs/2301.07482v3
- Date: Sun, 24 Mar 2024 14:48:59 GMT
- Title: FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training
- Authors: Kezhao Huang, Haitian Jiang, Minjie Wang, Guangxuan Xiao, David Wipf, Xiang Song, Quan Gan, Zengfeng Huang, Jidong Zhai, Zheng Zhang
- Abstract summary: A key performance bottleneck when training graph neural network (GNN) models on large, real-world graphs is loading node features onto a GPU.
We propose FreshGNN, a general-purpose GNN mini-batch training framework that leverages a historical cache for storing and reusing GNN node embeddings.
FreshGNN is able to accelerate the training speed on large graph datasets such as ogbn-papers100M and MAG240M by 3.4x up to 20.5x and reduce the memory access by 59%, with less than 1% influence on test accuracy.
- Score: 41.85974344854774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key performance bottleneck when training graph neural network (GNN) models on large, real-world graphs is loading node features onto a GPU. Due to limited GPU memory, expensive data movement is necessary to facilitate the storage of these features on alternative devices with slower access (e.g. CPU memory). Moreover, the irregularity of graph structures contributes to poor data locality which further exacerbates the problem. Consequently, existing frameworks capable of efficiently training large GNN models usually incur a significant accuracy degradation because of the currently-available shortcuts involved. To address these limitations, we instead propose FreshGNN, a general-purpose GNN mini-batch training framework that leverages a historical cache for storing and reusing GNN node embeddings instead of re-computing them through fetching raw features at every iteration. Critical to its success, the corresponding cache policy is designed, using a combination of gradient-based and staleness criteria, to selectively screen those embeddings which are relatively stable and can be cached, from those that need to be re-computed to reduce estimation errors and subsequent downstream accuracy loss. When paired with complementary system enhancements to support this selective historical cache, FreshGNN is able to accelerate the training speed on large graph datasets such as ogbn-papers100M and MAG240M by 3.4x up to 20.5x and reduce the memory access by 59%, with less than 1% influence on test accuracy.
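The selective cache described in the abstract can be illustrated with a small sketch. This is not FreshGNN's implementation: the class name `HistoricalCache`, the parameters `grad_threshold` and `max_staleness`, and the exact admission rule (admit an embedding when its gradient norm is small, expire it after a fixed number of iterations) are all assumptions made for illustration. The abstract only states that gradient-based and staleness criteria are combined.

```python
from dataclasses import dataclass, field


@dataclass
class HistoricalCache:
    """Toy historical-embedding cache (illustrative; all names are hypothetical)."""

    grad_threshold: float  # assumed rule: admit only if the gradient norm is below this
    max_staleness: int     # assumed rule: expire entries older than this many iterations
    # node_id -> (embedding, iteration at which it was written)
    _store: dict = field(default_factory=dict, init=False)

    def lookup(self, node_id, iteration):
        """Return the cached embedding, or None if it is absent or too stale."""
        entry = self._store.get(node_id)
        if entry is None:
            return None
        embedding, written_at = entry
        if iteration - written_at > self.max_staleness:
            # Staleness criterion: drop the entry so the caller recomputes it.
            del self._store[node_id]
            return None
        return embedding

    def update(self, node_id, embedding, grad_norm, iteration):
        """Cache embeddings that look stable; evict ones that are still changing."""
        if grad_norm < self.grad_threshold:
            # Gradient criterion: a small gradient suggests a stable embedding.
            self._store[node_id] = (list(embedding), iteration)
        else:
            self._store.pop(node_id, None)


cache = HistoricalCache(grad_threshold=0.1, max_staleness=2)
cache.update(7, [0.5, -0.2], grad_norm=0.01, iteration=0)  # stable: admitted
cache.update(8, [1.0, 1.0], grad_norm=0.50, iteration=0)   # unstable: not cached
fresh_hit = cache.lookup(7, iteration=1)  # within the staleness bound
miss = cache.lookup(8, iteration=1)       # never admitted
stale = cache.lookup(7, iteration=5)      # too old: expired and removed
```

In a training loop, a cache hit would stand in for fetching raw features and recomputing the embedding, while a miss would fall back to the normal computation path. How the two thresholds are chosen in practice is not stated in the text above.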
Related papers
- Accelerating Storage-Based Training for Graph Neural Networks [17.9837112234959]
We propose a novel storage-based GNN training framework, named AGNES.
AGNES employs a method of block-wise storage I/O processing to fully utilize the I/O bandwidth of high-performance storage devices.
It consistently outperforms four state-of-the-art methods, running up to 4.1x faster than the best competitor.
arXiv Detail & Related papers (2026-01-04T10:37:14Z)
- SpanGNN: Towards Memory-Efficient Graph Neural Networks via Spanning Subgraph Training [14.63975787929143]
Graph Neural Networks (GNNs) have superior capability in learning graph data.
Full-graph GNN training generally achieves high accuracy; however, it suffers from large peak memory usage.
We propose SpanGNN, a new memory-efficient GNN training method based on spanning subgraphs.
arXiv Detail & Related papers (2024-06-07T13:46:23Z)
- DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training [12.945647145403438]
Graph neural networks (GNNs) are machine learning models specialized for graph data and widely used in many applications.
DiskGNN achieves high I/O efficiency and thus fast training without hurting model accuracy.
We compare DiskGNN with Ginex and MariusGNN, which are state-of-the-art systems for out-of-core GNN training.
arXiv Detail & Related papers (2024-05-08T17:27:11Z)
- CATGNN: Cost-Efficient and Scalable Distributed Training for Graph Neural Networks [7.321893519281194]
Existing distributed systems load the entire graph in memory for graph partitioning.
We propose CATGNN, a cost-efficient and scalable distributed GNN training system.
We also propose a novel streaming partitioning algorithm named SPRING for distributed GNN training.
arXiv Detail & Related papers (2024-04-02T20:55:39Z)
- Cached Operator Reordering: A Unified View for Fast GNN Training [24.917363701638607]
Graph Neural Networks (GNNs) are a powerful tool for handling structured graph data and addressing tasks such as node classification, graph classification, and clustering.
However, the sparse nature of GNN computation poses new challenges for performance optimization compared to traditional deep neural networks.
We address these challenges by providing a unified view of GNN computation, I/O, and memory.
arXiv Detail & Related papers (2023-08-23T12:27:55Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble training method, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient to train large graphs with billions of nodes and edges with GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings [51.82434518719011]
GNNAutoScale (GAS) is a framework for scaling arbitrary message-passing GNNs to large graphs.
GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations.
GAS reaches state-of-the-art performance on large-scale graphs.
arXiv Detail & Related papers (2021-06-10T09:26:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.