SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage
Processing Architectures
- URL: http://arxiv.org/abs/2205.04711v1
- Date: Tue, 10 May 2022 07:25:30 GMT
- Title: SmartSAGE: Training Large-scale Graph Neural Networks using In-Storage
Processing Architectures
- Authors: Yunjae Lee, Jinha Chung, Minsoo Rhu
- Abstract summary: Graph neural networks (GNNs) can extract features by learning both the representation of each object (i.e., graph nodes) and the relationship across different objects (i.e., the edges that connect nodes).
Despite their strengths, utilizing these algorithms in a production environment faces several challenges as the number of graph nodes and edges can amount to several billions to hundreds of billions.
In this work, we first conduct a detailed characterization of a state-of-the-art, large-scale GNN training algorithm, GraphSAGE.
Based on the characterization, we then explore the feasibility of utilizing capacity-optimized NVM SSDs for storing memory-hungry GNN data.
- Score: 0.7792020418343023
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Graph neural networks (GNNs) can extract features by learning both the
representation of each object (i.e., graph nodes) and the relationship across
different objects (i.e., the edges that connect nodes), achieving
state-of-the-art performance in various graph-based tasks. Despite their
strengths, utilizing these algorithms in a production environment faces several
challenges as the number of graph nodes and edges can amount to several billions
to hundreds of billions, requiring substantial storage space for training.
Unfortunately, state-of-the-art ML frameworks employ an in-memory processing
model, which significantly hampers the productivity of ML practitioners as it
mandates that the overall working set fit within DRAM capacity. In this work, we
first conduct a detailed characterization of a state-of-the-art, large-scale
GNN training algorithm, GraphSAGE. Based on the characterization, we then
explore the feasibility of utilizing capacity-optimized NVM SSDs for storing
memory-hungry GNN data, which enables large-scale GNN training beyond the
limits of main memory size. Given the large performance gap between DRAM and
SSD, however, blindly utilizing SSDs as a direct substitute for DRAM leads to
significant performance loss. We therefore develop SmartSAGE, our
software/hardware co-design based on an in-storage processing (ISP)
architecture. Our work demonstrates that an ISP-based large-scale GNN training
system can achieve both high capacity storage and high performance, opening up
opportunities for ML practitioners to train large GNN datasets without being
hampered by the physical limitations of main memory size.
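To make the storage bottleneck concrete, below is a minimal sketch (illustrative only, not SmartSAGE's or GraphSAGE's actual code) of one GraphSAGE-style minibatch in which node features live in an SSD-backed memory-mapped file rather than DRAM. The graph sizes, fanout, and file name are assumptions; the point is that every minibatch issues many small, random reads against storage, the access pattern that makes a naive DRAM-to-SSD substitution slow and that in-storage processing targets.

```python
# Minimal sketch (not SmartSAGE's or GraphSAGE's actual implementation):
# a GraphSAGE-style minibatch where node features are read from an
# SSD-backed memory-mapped file instead of DRAM. Sizes, fanout, and the
# file name are illustrative assumptions.
import numpy as np

NUM_NODES, FEAT_DIM, FANOUT, BATCH = 10_000, 128, 10, 32
rng = np.random.default_rng(0)

# Synthetic adjacency: each node gets a small random neighbor list.
neighbors = [rng.integers(0, NUM_NODES, size=int(rng.integers(5, 30)))
             for _ in range(NUM_NODES)]

# Node features kept on disk; np.memmap stands in for a capacity-optimized
# NVM SSD holding data that no longer fits in main memory.
features = np.memmap("features.bin", dtype=np.float32, mode="w+",
                     shape=(NUM_NODES, FEAT_DIM))
features[:] = rng.standard_normal((NUM_NODES, FEAT_DIM), dtype=np.float32)

def sample_one_hop(seeds, fanout):
    """Uniformly sample up to `fanout` neighbors for every seed node."""
    return {int(s): rng.choice(neighbors[s],
                               size=min(fanout, len(neighbors[s])),
                               replace=False)
            for s in seeds}

# One training minibatch: pick seed nodes, expand one hop, then gather the
# features of every touched node from storage (many small random reads).
seeds = rng.integers(0, NUM_NODES, size=BATCH)
sampled = sample_one_hop(seeds, FANOUT)
touched = np.unique(np.concatenate([seeds] + list(sampled.values())))
minibatch_feats = features[touched]   # scattered reads served by the SSD
print(minibatch_feats.shape)          # (num_touched_nodes, FEAT_DIM)
```

In a real pipeline, the gathered features would feed the GNN's aggregation and update layers; the scattered `features[touched]` gather is where most of the storage traffic originates.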
Related papers
- Reducing Memory Contention and I/O Congestion for Disk-based GNN Training [6.492879435794228]
Graph neural networks (GNNs) have gained wide popularity. Large graphs with high-dimensional features have become common, and training GNNs on them is non-trivial.
Given a gigantic graph, even sample-based GNN training cannot work efficiently, since it is difficult to keep the graph's entire data in memory during the training process.
Memory capacity and I/O are hence critical for effective disk-based training.
arXiv Detail & Related papers (2024-06-20T04:24:51Z)
- CATGNN: Cost-Efficient and Scalable Distributed Training for Graph Neural Networks [7.321893519281194]
Existing distributed systems load the entire graph in memory for graph partitioning.
We propose CATGNN, a cost-efficient and scalable distributed GNN training system.
We also propose a novel streaming partitioning algorithm named SPRING for distributed GNN training.
arXiv Detail & Related papers (2024-04-02T20:55:39Z)
- Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267]
We present a framework to tackle the memory explosion problem using memory replay techniques.
PDGNNs with Topology-aware Embedding Memory (TEM) significantly outperform state-of-the-art techniques.
arXiv Detail & Related papers (2024-01-24T03:03:17Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training scheme, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Benchmarking GNN-Based Recommender Systems on Intel Optane Persistent Memory [9.216391057418566]
Graph neural networks (GNNs) have emerged as an effective method for handling machine learning tasks on graphs.
Training GNN-based recommender systems (GNNRecSys) on large graphs incurs a large memory footprint.
We show that single-machine Optane-based GNNRecSys training outperforms distributed training by a large margin.
arXiv Detail & Related papers (2022-07-25T06:08:24Z)
- Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs [7.549360351036771]
We present the Sequential Aggregation and Rematerialization (SAR) scheme for distributed full-batch training of Graph Neural Networks (GNNs) on large graphs.
SAR is a distributed technique that can train any GNN type directly on an entire large graph.
We also present a general technique based on kernel fusion and attention-matrix rematerialization to optimize both the runtime and memory efficiency of attention-based models.
arXiv Detail & Related papers (2021-11-11T22:27:59Z)
- PIM-DRAM: Accelerating Machine Learning Workloads using Processing in Memory based on DRAM Technology [2.6168147530506958]
We propose a processing-in-memory (PIM) multiplication primitive to accelerate matrix vector operations in ML workloads.
We show that the proposed architecture, mapping, and data flow can provide up to 23x and 6.5x benefits over a GPU.
arXiv Detail & Related papers (2021-05-08T16:39:24Z)
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, which leads to reliance on external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
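As a rough illustration of the binarization idea named in the Binary Graph Neural Networks entry above (a generic sketch under simplifying assumptions, not that paper's exact method), the snippet below binarizes the weights of a single graph-convolution layer to {-1, +1} with a per-column scale while keeping neighbor aggregation in full precision.

```python
# Generic sketch of one binarized graph-convolution layer (not the exact
# scheme from "Binary Graph Neural Networks"): weights are replaced by
# sign(w) times a per-column mean-|w| scale; neighbor aggregation stays in
# full precision. Shapes and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
NUM_NODES, IN_DIM, OUT_DIM = 100, 16, 8

adj = (rng.random((NUM_NODES, NUM_NODES)) < 0.05).astype(np.float32)
x = rng.standard_normal((NUM_NODES, IN_DIM)).astype(np.float32)
w = rng.standard_normal((IN_DIM, OUT_DIM)).astype(np.float32)

def binarize(weights):
    """Return {-1, +1} weights and a per-column scale (mean absolute value)."""
    scale = np.abs(weights).mean(axis=0, keepdims=True)
    return np.sign(weights), scale

def binary_gcn_layer(adj, x, w):
    """Mean-aggregate neighbor features, then apply the binarized weights."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1.0)
    h = adj @ x / deg                          # full-precision aggregation
    w_bin, scale = binarize(w)
    return np.maximum(h @ w_bin * scale, 0.0)  # scaled binary matmul + ReLU

out = binary_gcn_layer(adj, x, w)
print(out.shape)                               # (NUM_NODES, OUT_DIM)
```

Training such a layer would additionally require a gradient estimator for the sign function (e.g., a straight-through estimator); this sketch shows only the forward pass.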
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.