Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs
- URL: http://arxiv.org/abs/2201.09189v1
- Date: Sun, 23 Jan 2022 06:08:18 GMT
- Title: Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs
- Authors: Miryeong Kwon, Donghyun Gouk, Sangwon Lee, Myoungsoo Jung
- Abstract summary: Graph neural networks (GNNs) process large-scale graphs consisting of a hundred billion edges.
We propose a novel deep learning framework on large graphs, HolisticGNN, that provides an easy-to-use, near-storage inference infrastructure for fast, energy-efficient GNN processing.
- Score: 8.698995648930806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) process large-scale graphs consisting
of a hundred billion edges. In contrast to traditional deep learning, the
emerging GNNs operate on large sets of graph and embedding data residing on
storage, which entails complex and irregular preprocessing.
We propose a novel deep learning framework on large graphs, HolisticGNN, that
provides an easy-to-use, near-storage inference infrastructure for fast,
energy-efficient GNN processing. To achieve the best end-to-end latency and
high energy efficiency, HolisticGNN allows users to implement various GNN
algorithms and directly executes them where the actual data exist in a holistic
manner. It also enables RPC over PCIe such that the users can simply program
GNNs through a graph semantic library without any knowledge of the underlying
hardware or storage configurations.
We fabricate HolisticGNN's hardware RTL and implement its software on an
FPGA-based computational SSD (CSSD). Our empirical evaluations show that the
inference time of HolisticGNN outperforms GNN inference services using
high-performance modern GPUs by 7.1x while reducing energy consumption by
33.2x, on average.
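To make the abstract's programming model concrete, here is a hypothetical client-side sketch of RPC-over-PCIe GNN inference; the names (`CssdClient`, `upload_program`, `infer`) and the placeholder behavior are our own assumptions, not HolisticGNN's actual graph semantic library:

```python
# Hypothetical client-side view of near-storage GNN inference.
# `CssdClient`, `upload_program`, and `infer` are illustrative names,
# not HolisticGNN's real interface.

class CssdClient:
    """Toy stand-in for an RPC channel tunneled over PCIe to a CSSD."""

    def __init__(self, device: str):
        self.device = device            # e.g. "/dev/cssd0"
        self._programs: dict[int, str] = {}

    def upload_program(self, gnn_source: str) -> int:
        # Ship a user-defined GNN to the device once; the heavy graph and
        # embedding data never cross the PCIe bus.
        handle = len(self._programs)
        self._programs[handle] = gnn_source
        return handle

    def infer(self, handle: int, targets: list[int]) -> list[list[float]]:
        # Only node IDs go down and small results come back; the actual
        # computation would run next to the stored graph.
        return [[0.0, 0.0] for _ in targets]   # placeholder logits

client = CssdClient("/dev/cssd0")
handle = client.upload_program("def forward(g, x): ...")
logits = client.infer(handle, targets=[0, 42, 1337])
```

The point of such a design is data placement: the graph and embeddings stay on the CSSD, so only small RPC messages and results traverse the PCIe link.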
Related papers
- HGNAS: Hardware-Aware Graph Neural Architecture Search for Edge Devices [11.1990060370675]
This work proposes a novel hardware-aware graph neural architecture search framework tailored for resource-constrained edge devices, namely HGNAS.
HGNAS integrates an efficient GNN hardware performance predictor that evaluates the latency and peak memory usage of GNNs in milliseconds.
It can achieve up to a 10.6x speedup and an 82.5% peak memory reduction with negligible accuracy loss compared to DGCNN on ModelNet40.
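To picture how a millisecond-scale predictor reshapes the search loop, here is a toy, hedged sketch; the cost proxy, sampler, and accuracy placeholder are our own stand-ins, not HGNAS components:

```python
# Toy predictor-guided architecture search. `predict_cost` stands in for
# HGNAS's learned hardware performance predictor; everything here is an
# illustrative assumption, not the paper's implementation.
import random

def predict_cost(arch):
    # Cheap analytic proxy for device latency; a real predictor would be
    # a trained model answering in milliseconds.
    return sum(arch["widths"]) * arch["layers"]

def sample_arch():
    return {"layers": random.randint(2, 6),
            "widths": [random.choice([32, 64, 128]) for _ in range(3)]}

def search(budget, latency_limit):
    best = None
    for _ in range(budget):
        arch = sample_arch()
        if predict_cost(arch) > latency_limit:
            continue                      # rejected without touching hardware
        acc = random.random()             # placeholder for validation accuracy
        if best is None or acc > best[0]:
            best = (acc, arch)
    return best

print(search(budget=100, latency_limit=600))
```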
arXiv Detail & Related papers (2024-08-23T05:11:22Z)
- GHOST: A Graph Neural Network Accelerator using Silicon Photonics [4.226093500082746]
Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data.
We present GHOST, the first silicon-photonic hardware accelerator for GNNs.
arXiv Detail & Related papers (2023-07-04T15:37:20Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
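Node-wise sampling, the primitive this pipeline builds on, fits in a few lines; the sketch below deliberately omits the partitioning, caching, and pipelining that are SALIENT++'s actual contributions:

```python
# Minimal node-wise neighbor sampling for a GNN minibatch. Adjacency is a
# plain dict here; distributed feature partitioning is out of scope.
import random

def sample_minibatch(adj, seeds, fanouts):
    """Sample a fixed number of neighbors per node, layer by layer."""
    layers, frontier = [], list(seeds)
    for fanout in fanouts:
        sampled = {}
        for u in frontier:
            nbrs = adj.get(u, [])
            sampled[u] = random.sample(nbrs, min(fanout, len(nbrs)))
        layers.append(sampled)
        frontier = sorted({v for vs in sampled.values() for v in vs})
    return layers  # one sampled neighborhood block per GNN layer

adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(sample_minibatch(adj, seeds=[0], fanouts=[2, 2]))
```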
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble-based training method, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- FlowGNN: A Dataflow Architecture for Universal Graph Neural Network Inference via Multi-Queue Streaming [1.566528527065232]
Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to graph-related problems.
Meeting demand for novel GNN models and fast inference simultaneously is challenging because of the gap between developing efficient accelerators and the rapid creation of new GNN models.
We propose a generic dataflow architecture for GNN acceleration, named FlowGNN, which can flexibly support the majority of message-passing GNNs.
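The message-passing pattern FlowGNN generalizes over can be written down directly; below is a plain-NumPy skeleton in which the sum aggregation and tanh update are arbitrary choices of ours, not FlowGNN's hardware dataflow:

```python
# One round of message passing: per-edge messages are produced, aggregated
# per destination node, then a node update is applied. FlowGNN streams these
# stages through hardware queues; this is only the mathematical skeleton.
import numpy as np

def message_passing(edges, h, W_msg, W_upd):
    """edges: list of (src, dst); h: [N, d] node features."""
    agg = np.zeros_like(h)
    for src, dst in edges:              # message + aggregate (sum)
        agg[dst] += h[src] @ W_msg
    return np.tanh(h @ W_upd + agg)     # node update

N, d = 4, 8
rng = np.random.default_rng(0)
h = rng.normal(size=(N, d))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
h_next = message_passing(edges, h, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```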
arXiv Detail & Related papers (2022-04-27T17:59:25Z)
- TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs [21.63854538768414]
We propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs).
The core idea is to reconcile the "Sparse" GNN with the high-performance "Dense" TCUs.
Rigorous experiments show an average 1.70x speedup over the state-of-the-art DGL framework.
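One way to picture the sparse/dense reconciliation is column condensing: gather the nonzero columns of a sparse adjacency tile into a small dense tile that tensor-core-style units can multiply. A NumPy sketch with arbitrary tile sizes, illustrative of the idea rather than TC-GNN's kernels:

```python
# Condense a sparse adjacency tile so the multiply runs on a dense tile.
import numpy as np

def condense_tile(A_tile):
    """Keep only columns that contain at least one nonzero."""
    cols = np.flatnonzero(A_tile.any(axis=0))
    return A_tile[:, cols], cols

A = np.zeros((4, 16)); A[0, 3] = A[2, 3] = A[1, 9] = 1.0   # sparse adjacency tile
X = np.random.default_rng(1).normal(size=(16, 8))          # dense node features

A_dense, cols = condense_tile(A)        # 4 x 2 dense tile instead of 4 x 16
out = A_dense @ X[cols]                 # equals A @ X, but on a dense tile
assert np.allclose(out, A @ X)
```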
arXiv Detail & Related papers (2021-12-03T18:06:23Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training of Graph Neural Networks (GNNs) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices [9.406007544032848]
Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-Euclidean graph data.
Performing GNN inference in real time has become a challenging problem for resource-limited edge-computing platforms.
We propose BlockGNN, a software-hardware co-design approach to realize efficient GNN acceleration.
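The appeal of block-circulant weights is that each block is defined by a single vector and can be applied via FFTs, shrinking a block's storage and compute from O(n^2) toward O(n log n). A minimal NumPy check of that identity:

```python
# Multiplying by a circulant matrix == circular convolution via FFT.
import numpy as np

def circulant_matvec(c, x):
    """y = C @ x where C is the circulant matrix generated by vector c."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

n = 8
rng = np.random.default_rng(2)
c, x = rng.normal(size=n), rng.normal(size=n)
C = np.array([np.roll(c, k) for k in range(n)]).T   # explicit circulant
assert np.allclose(circulant_matvec(c, x), C @ x)
```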
arXiv Detail & Related papers (2021-04-13T14:09:22Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
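The joint-pruning idea can be sketched with simple magnitude pruning standing in for UGS's learned masks; the `prune` helper and the sparsity levels below are illustrative only:

```python
# Prune the adjacency matrix and the layer weights with the same routine,
# keeping the surviving sparse pair as a "graph lottery ticket" candidate.
import numpy as np

def prune(mat, sparsity):
    """Zero out the smallest-magnitude fraction of entries."""
    k = int(mat.size * sparsity)
    thresh = np.partition(np.abs(mat).ravel(), k)[k] if k else -np.inf
    return mat * (np.abs(mat) > thresh)

rng = np.random.default_rng(3)
A = rng.random((6, 6))       # (weighted) adjacency matrix
W = rng.normal(size=(8, 8))  # GNN layer weights
A_t, W_t = prune(A, 0.2), prune(W, 0.5)   # one pruning round of each
```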
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
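A common recipe such strategies build on is a sign-function forward pass with a clipped straight-through gradient; here is a hedged PyTorch-style sketch, not the paper's exact models:

```python
# Binarize activations/weights to {-1, +1} in the forward pass and pass
# gradients through with a clipped straight-through estimator (STE).
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # +-1 values

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # clipped straight-through

x = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
```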
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
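The diffusion being approximated is personalized PageRank; below is a compact forward-push routine in the style of Andersen et al., which PPRGo-like models use to pick each node's few most influential neighbors (our simplified version, ignoring batching and top-k truncation):

```python
# Approximate personalized PageRank by pushing residual mass along edges
# until all residuals fall below eps * degree.
from collections import defaultdict

def approx_ppr(adj, seed, alpha=0.15, eps=1e-4):
    p, r = defaultdict(float), defaultdict(float)
    r[seed] = 1.0
    queue = [seed]
    while queue:
        u = queue.pop()
        deg = len(adj[u])
        if r[u] < eps * deg:
            continue
        p[u] += alpha * r[u]                 # keep a share at u
        push = (1 - alpha) * r[u] / deg      # spread the rest to neighbors
        r[u] = 0.0
        for v in adj[u]:
            r[v] += push
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
    return dict(p)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(sorted(approx_ppr(adj, seed=0).items()))
```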
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.