Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs
- URL: http://arxiv.org/abs/2201.09189v1
- Date: Sun, 23 Jan 2022 06:08:18 GMT
- Title: Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs
- Authors: Miryeong Kwon, Donghyun Gouk, Sangwon Lee, Myoungsoo Jung
- Abstract summary: Graph neural networks (GNNs) process large-scale graphs consisting of a hundred billion edges.
We propose a novel deep learning framework on large graphs, HolisticGNN, that provides an easy-to-use, near-storage inference infrastructure for fast, energy-efficient GNN processing.
- Score: 8.698995648930806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) process large-scale graphs consisting
of a hundred billion edges. In contrast to traditional deep learning, the
emerging GNNs operate on large sets of graph and embedding data residing on
storage, which entails complex and irregular preprocessing.
We propose a novel deep learning framework on large graphs, HolisticGNN, that
provides an easy-to-use, near-storage inference infrastructure for fast,
energy-efficient GNN processing. To achieve the best end-to-end latency and
high energy efficiency, HolisticGNN allows users to implement various GNN
algorithms and directly executes them where the actual data exist in a holistic
manner. It also enables RPC over PCIe such that the users can simply program
GNNs through a graph semantic library without any knowledge of the underlying
hardware or storage configurations.
We fabricate HolisticGNN's hardware RTL and implement its software on an
FPGA-based computational SSD (CSSD). Our empirical evaluations show that the
inference time of HolisticGNN outperforms GNN inference services using
high-performance modern GPUs by 7.1x while reducing energy consumption by
33.2x, on average.
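To make the abstract's programming model concrete, here is a hypothetical client-side sketch of RPC-over-PCIe GNN inference; the names (`CssdClient`, `upload_program`, `infer`) and the placeholder behavior are our own assumptions, not HolisticGNN's actual graph semantic library:

```python
# Hypothetical client-side view of near-storage GNN inference.
# `CssdClient`, `upload_program`, and `infer` are illustrative names,
# not HolisticGNN's real interface.

class CssdClient:
    """Toy stand-in for an RPC channel tunneled over PCIe to a CSSD."""

    def __init__(self, device: str):
        self.device = device            # e.g. "/dev/cssd0"
        self._programs: dict[int, str] = {}

    def upload_program(self, gnn_source: str) -> int:
        # Ship a user-defined GNN to the device once; the heavy graph and
        # embedding data never cross the PCIe bus.
        handle = len(self._programs)
        self._programs[handle] = gnn_source
        return handle

    def infer(self, handle: int, targets: list[int]) -> list[list[float]]:
        # Only node IDs go down and small results come back; the actual
        # computation would run next to the stored graph.
        return [[0.0, 0.0] for _ in targets]   # placeholder logits

client = CssdClient("/dev/cssd0")
handle = client.upload_program("def forward(g, x): ...")
logits = client.infer(handle, targets=[0, 42, 1337])
```

The point of such a design is data placement: the graph and embeddings stay on the CSSD, so only small RPC messages and results traverse the PCIe link.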
Related papers
- HGNAS: Hardware-Aware Graph Neural Architecture Search for Edge Devices [11.1990060370675]
This work proposes a novel hardware-aware graph neural architecture search framework tailored for resource-constrained edge devices, namely HGNAS.
HGNAS integrates an efficient GNN hardware performance predictor that evaluates the latency and peak memory usage of GNNs in milliseconds.
It can achieve up to a 10.6x speedup and an 82.5% peak memory reduction with negligible accuracy loss compared to DGCNN on ModelNet40.
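To picture how a millisecond-scale predictor reshapes the search loop, here is a toy, hedged sketch; the cost proxy, sampler, and accuracy placeholder are our own stand-ins, not HGNAS components:

```python
# Toy predictor-guided architecture search. `predict_cost` stands in for
# HGNAS's learned hardware performance predictor; everything here is an
# illustrative assumption, not the paper's implementation.
import random

def predict_cost(arch):
    # Cheap analytic proxy for device latency; a real predictor would be
    # a trained model answering in milliseconds.
    return sum(arch["widths"]) * arch["layers"]

def sample_arch():
    return {"layers": random.randint(2, 6),
            "widths": [random.choice([32, 64, 128]) for _ in range(3)]}

def search(budget, latency_limit):
    best = None
    for _ in range(budget):
        arch = sample_arch()
        if predict_cost(arch) > latency_limit:
            continue                      # rejected without touching hardware
        acc = random.random()             # placeholder for validation accuracy
        if best is None or acc > best[0]:
            best = (acc, arch)
    return best

print(search(budget=100, latency_limit=600))
```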
arXiv Detail & Related papers (2024-08-23T05:11:22Z)
- GHOST: A Graph Neural Network Accelerator using Silicon Photonics [4.226093500082746]
Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data.
We present GHOST, the first silicon-photonic hardware accelerator for GNNs.
arXiv Detail & Related papers (2023-07-04T15:37:20Z)
- Communication-Efficient Graph Neural Networks with Probabilistic Neighborhood Expansion Analysis and Caching [59.8522166385372]
Training and inference with graph neural networks (GNNs) on massive graphs has been actively studied since the inception of GNNs.
This paper is concerned with minibatch training and inference with GNNs that employ node-wise sampling in distributed settings.
We present SALIENT++, which extends the prior state-of-the-art SALIENT system to work with partitioned feature data.
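Node-wise sampling, the primitive this pipeline builds on, fits in a few lines; the sketch below deliberately omits the partitioning, caching, and pipelining that are SALIENT++'s actual contributions:

```python
# Minimal node-wise neighbor sampling for a GNN minibatch. Adjacency is a
# plain dict here; distributed feature partitioning is out of scope.
import random

def sample_minibatch(adj, seeds, fanouts):
    """Sample a fixed number of neighbors per node, layer by layer."""
    layers, frontier = [], list(seeds)
    for fanout in fanouts:
        sampled = {}
        for u in frontier:
            nbrs = adj.get(u, [])
            sampled[u] = random.sample(nbrs, min(fanout, len(nbrs)))
        layers.append(sampled)
        frontier = sorted({v for vs in sampled.values() for v in vs})
    return layers  # one sampled neighborhood block per GNN layer

adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
print(sample_minibatch(adj, seeds=[0], fanouts=[2, 2]))
```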
arXiv Detail & Related papers (2023-05-04T21:04:01Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble-based training method, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- FlowGNN: A Dataflow Architecture for Universal Graph Neural Network Inference via Multi-Queue Streaming [1.566528527065232]
Graph neural networks (GNNs) have recently exploded in popularity thanks to their broad applicability to graph-related problems.
Meeting demand for novel GNN models and fast inference simultaneously is challenging because of the gap between developing efficient accelerators and the rapid creation of new GNN models.
We propose a generic dataflow architecture for GNN acceleration, named FlowGNN, which can flexibly support the majority of message-passing GNNs.
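The message-passing pattern FlowGNN generalizes over can be written down directly; below is a plain-NumPy skeleton in which the sum aggregation and tanh update are arbitrary choices of ours, not FlowGNN's hardware dataflow:

```python
# One round of message passing: per-edge messages are produced, aggregated
# per destination node, then a node update is applied. FlowGNN streams these
# stages through hardware queues; this is only the mathematical skeleton.
import numpy as np

def message_passing(edges, h, W_msg, W_upd):
    """edges: list of (src, dst); h: [N, d] node features."""
    agg = np.zeros_like(h)
    for src, dst in edges:              # message + aggregate (sum)
        agg[dst] += h[src] @ W_msg
    return np.tanh(h @ W_upd + agg)     # node update

N, d = 4, 8
rng = np.random.default_rng(0)
h = rng.normal(size=(N, d))
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
h_next = message_passing(edges, h, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```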
arXiv Detail & Related papers (2022-04-27T17:59:25Z)
- TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs [21.63854538768414]
We propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs).
The core idea is to reconcile the "Sparse" GNN with the high-performance "Dense" TCUs.
Rigorous experiments show an average 1.70x speedup over the state-of-the-art DGL framework.
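One way to picture the sparse/dense reconciliation is column condensing: gather the nonzero columns of a sparse adjacency tile into a small dense tile that tensor-core-style units can multiply. A NumPy sketch with arbitrary tile sizes, illustrative of the idea rather than TC-GNN's kernels:

```python
# Condense a sparse adjacency tile so the multiply runs on a dense tile.
import numpy as np

def condense_tile(A_tile):
    """Keep only columns that contain at least one nonzero."""
    cols = np.flatnonzero(A_tile.any(axis=0))
    return A_tile[:, cols], cols

A = np.zeros((4, 16)); A[0, 3] = A[2, 3] = A[1, 9] = 1.0   # sparse adjacency tile
X = np.random.default_rng(1).normal(size=(16, 8))          # dense node features

A_dense, cols = condense_tile(A)        # 4 x 2 dense tile instead of 4 x 16
out = A_dense @ X[cols]                 # equals A @ X, but on a dense tile
assert np.allclose(out, A @ X)
```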
arXiv Detail & Related papers (2021-12-03T18:06:23Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training of Graph Neural Networks (GNNs) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices [9.406007544032848]
Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-Euclidean graph data.
Performing GNN inference in real time has become a challenging problem for resource-limited edge-computing platforms.
We propose BlockGNN, a software-hardware co-design approach to realize efficient GNN acceleration.
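The appeal of block-circulant weights is that each block is defined by a single vector and can be applied via FFTs, shrinking a block's storage and compute from O(n^2) toward O(n log n). A minimal NumPy check of that identity:

```python
# Multiplying by a circulant matrix == circular convolution via FFT.
import numpy as np

def circulant_matvec(c, x):
    """y = C @ x where C is the circulant matrix generated by vector c."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

n = 8
rng = np.random.default_rng(2)
c, x = rng.normal(size=n), rng.normal(size=n)
C = np.array([np.roll(c, k) for k in range(n)]).T   # explicit circulant
assert np.allclose(circulant_matvec(c, x), C @ x)
```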
arXiv Detail & Related papers (2021-04-13T14:09:22Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
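The joint-pruning idea can be sketched with simple magnitude pruning standing in for UGS's learned masks; the `prune` helper and the sparsity levels below are illustrative only:

```python
# Prune the adjacency matrix and the layer weights with the same routine,
# keeping the surviving sparse pair as a "graph lottery ticket" candidate.
import numpy as np

def prune(mat, sparsity):
    """Zero out the smallest-magnitude fraction of entries."""
    k = int(mat.size * sparsity)
    thresh = np.partition(np.abs(mat).ravel(), k)[k] if k else -np.inf
    return mat * (np.abs(mat) > thresh)

rng = np.random.default_rng(3)
A = rng.random((6, 6))       # (weighted) adjacency matrix
W = rng.normal(size=(8, 8))  # GNN layer weights
A_t, W_t = prune(A, 0.2), prune(W, 0.5)   # one pruning round of each
```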
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
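A common recipe such strategies build on is a sign-function forward pass with a clipped straight-through gradient; here is a hedged PyTorch-style sketch, not the paper's exact models:

```python
# Binarize activations/weights to {-1, +1} in the forward pass and pass
# gradients through with a clipped straight-through estimator (STE).
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # +-1 values

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()  # clipped straight-through

x = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
```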
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
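The diffusion being approximated is personalized PageRank; below is a compact forward-push routine in the style of Andersen et al., which PPRGo-like models use to pick each node's few most influential neighbors (our simplified version, ignoring batching and top-k truncation):

```python
# Approximate personalized PageRank by pushing residual mass along edges
# until all residuals fall below eps * degree.
from collections import defaultdict

def approx_ppr(adj, seed, alpha=0.15, eps=1e-4):
    p, r = defaultdict(float), defaultdict(float)
    r[seed] = 1.0
    queue = [seed]
    while queue:
        u = queue.pop()
        deg = len(adj[u])
        if r[u] < eps * deg:
            continue
        p[u] += alpha * r[u]                 # keep a share at u
        push = (1 - alpha) * r[u] / deg      # spread the rest to neighbors
        r[u] = 0.0
        for v in adj[u]:
            r[v] += push
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
    return dict(p)

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(sorted(approx_ppr(adj, seed=0).items()))
```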
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.