Related papers: GNNHLS: Evaluating Graph Neural Network Inference via High-Level Synthesis

GNNHLS: Evaluating Graph Neural Network Inference via High-Level Synthesis

URL: http://arxiv.org/abs/2309.16022v1
Date: Wed, 27 Sep 2023 20:58:33 GMT
Title: GNNHLS: Evaluating Graph Neural Network Inference via High-Level Synthesis
Authors: Chenfeng Zhao, Zehao Dong, Yixin Chen, Xuan Zhang, Roger D. Chamberlain
Abstract summary: We propose GNNHLS, an open-source framework to comprehensively evaluate GNN inference acceleration on FPGAs via HLS. We evaluate GNNHLS on 4 graph datasets with distinct topologies and scales. GNNHLS achieves up to 50.8x speedup and 423x energy reduction relative to the CPU baselines.
Score: 8.036399595635034
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the ever-growing popularity of Graph Neural Networks (GNNs), efficient GNN inference is gaining tremendous attention. Field-Programming Gate Arrays (FPGAs) are a promising execution platform due to their fine-grained parallelism, low-power consumption, reconfigurability, and concurrent execution. Even better, High-Level Synthesis (HLS) tools bridge the gap between the non-trivial FPGA development efforts and rapid emergence of new GNN models. In this paper, we propose GNNHLS, an open-source framework to comprehensively evaluate GNN inference acceleration on FPGAs via HLS, containing a software stack for data generation and baseline deployment, and FPGA implementations of 6 well-tuned GNN HLS kernels. We evaluate GNNHLS on 4 graph datasets with distinct topologies and scales. The results show that GNNHLS achieves up to 50.8x speedup and 423x energy reduction relative to the CPU baselines. Compared with the GPU baselines, GNNHLS achieves up to 5.16x speedup and 74.5x energy reduction.

Related papers

Spatio-Spectral Graph Neural Networks [50.277959544420455]
We propose Spatio-Spectral Graph Networks (S$2$GNNs) S$2$GNNs combine spatially and spectrally parametrized graph filters. We show that S$2$GNNs vanquish over-squashing and yield strictly tighter approximation-theoretic error bounds than MPGNNs.
arXiv Detail & Related papers (2024-05-29T14:28:08Z)
T-GAE: Transferable Graph Autoencoder for Network Alignment [79.89704126746204]
T-GAE is a graph autoencoder framework that leverages transferability and stability of GNNs to achieve efficient network alignment without retraining. Our experiments demonstrate that T-GAE outperforms the state-of-the-art optimization method and the best GNN approach by up to 38.7% and 50.8%, respectively.
arXiv Detail & Related papers (2023-10-05T02:58:29Z)
Low Latency Edge Classification GNN for Particle Trajectory Tracking on FPGAs [10.146819379097249]
This paper introduces a resource-efficient GNN architecture on FPGAs for low latency particle tracking. Our results on Xilinx UltraScale+ VU9P demonstrate 1625x and 1574x performance improvement over CPU and GPU respectively.
arXiv Detail & Related papers (2023-06-20T06:57:24Z)
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs. scaling GNNs either by deepening or widening suffers from prevalent issues of unhealthy gradients, over-smoothening, information squashing. We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs.
arXiv Detail & Related papers (2023-06-18T03:33:46Z)
DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference [2.2721856484014373]
We propose DGNN-Booster, which is a novel Field-Programmable Gate Array (FPGA) accelerator framework for real-time DGNN inference. We show that DGNN-Booster can achieve a speedup of up to 5.6x compared to the CPU baseline (6226R), 8.4x compared to the GPU baseline (A6000) and 2.1x compared to the FPGA baseline.
arXiv Detail & Related papers (2023-04-13T21:50:23Z)
A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs) We present a new ensembling training manner, named EnGCN, to address the existing issues. Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs [8.698995648930806]
Graph neural networks (GNNs) process large-scale graphs consisting of a hundred billion edges. We propose a novel deep learning framework on large graphs, HolisticGNN, that provides an easy-to-use, near-storage inference infrastructure for fast, energy-efficient GNN processing.
arXiv Detail & Related papers (2022-01-23T06:08:18Z)
GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration [1.460161657933122]
We propose a generic GNN acceleration framework using High-Level Synthesis (HLS), named GenGNN. We aim to deliver ultra-fast GNN inference without any graph pre-processing for real-time requirements. We verify our implementation on-board on the Xilinx Alveo U50 FPGA and observe a speed-up of up to 25x against CPU (6226R) baseline and 13x against GPU (A6000) baseline.
arXiv Detail & Related papers (2022-01-20T22:30:59Z)
DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible. In this paper, we presentGNN that optimize the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters. Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices [9.406007544032848]
Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-euclidean graph data. How to inference GNNs in real time has become a challenging problem for some resource-limited edge-computing platforms. We propose BlockGNN, a software- hardware co-design approach to realize efficient GNN acceleration.
arXiv Detail & Related papers (2021-04-13T14:09:22Z)
A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights. We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.