Related papers: GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA

URL: http://arxiv.org/abs/2404.07188v1
Date: Wed, 10 Apr 2024 17:41:41 GMT
Title: GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA
Authors: Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna
Abstract summary: Graph neural networks (GNNs) have recently empowered various novel computer vision (CV) tasks. This paper introduces GCV-Turbo, a domain-specific accelerator on FPGA for end-to-end acceleration of GNN-based CV tasks.
Score: 3.2507129535290926
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Graph neural networks (GNNs) have recently empowered various novel computer vision (CV) tasks. GNN-based CV tasks employ either a combination of CNN layers and GNN layers or GNN layers alone. This paper introduces GCV-Turbo, a domain-specific accelerator on FPGA for end-to-end acceleration of GNN-based CV tasks. GCV-Turbo consists of two key components: (1) a novel hardware architecture optimized for the computation kernels in both CNNs and GNNs using the same set of computation resources, and (2) a PyTorch-compatible compiler that takes a user-defined model as input, performs end-to-end optimization for the computation graph of a given GNN-based CV task, and produces optimized code for hardware execution. The hardware architecture and the compiler work synergistically to support a variety of GNN-based CV tasks. We implement GCV-Turbo on a state-of-the-art FPGA and evaluate its performance across six representative GNN-based CV tasks with diverse input data modalities (e.g., image, human skeleton, point cloud). Compared with state-of-the-art CPU (GPU) implementations, GCV-Turbo achieves an average latency reduction of $68.4\times$ ($4.1\times$) on these six GNN-based CV tasks. Moreover, GCV-Turbo supports the execution of standalone CNNs or GNNs, achieving performance comparable to that of state-of-the-art CNN (GNN) accelerators for widely used CNN-only (GNN-only) models.

Related papers

Accelerating Sparse Graph Neural Networks with Tensor Core Optimization [0.0]
Graph neural networks (GNNs) have seen extensive application in domains such as social networks, bioinformatics, computation and recommendation systems. Traditional computing methods are insufficient to meet the performance demands of GNNs. Recent research has explored parallel acceleration using CUDA Cores and Tensor Cores, but significant challenges persist.
arXiv Detail & Related papers (2024-12-16T01:57:53Z)
DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs [10.766922709869831]
We propose a dynamic kernel fusion framework, DF-GNN, for the Attention Graph Neural Networks (AT-GNNs) family. DF-GNN introduces a dynamic bi-level thread scheduling strategy, enabling flexible adjustments to thread scheduling. It surpasses existing GNN kernel optimization works like cuGraph and dgNN, with speedups up to $7.0\times$ over the state-of-the-art non-fusion DGL sparse library.
arXiv Detail & Related papers (2024-11-25T06:26:58Z)
MAG-GNN: Reinforcement Learning Boosted Graph Neural Network [68.60884768323739]
A particular line of work proposed subgraph GNNs that use subgraph information to improve GNNs' expressivity and achieved great success. This effectiveness comes at the cost of efficiency, because all possible subgraphs must be enumerated. We propose Magnetic Graph Neural Network (MAG-GNN), a reinforcement learning (RL) boosted GNN, to solve the problem.
arXiv Detail & Related papers (2023-10-29T20:32:21Z)
T-GAE: Transferable Graph Autoencoder for Network Alignment [79.89704126746204]
T-GAE is a graph autoencoder framework that leverages transferability and stability of GNNs to achieve efficient network alignment without retraining. Our experiments demonstrate that T-GAE outperforms the state-of-the-art optimization method and the best GNN approach by up to 38.7% and 50.8%, respectively.
arXiv Detail & Related papers (2023-10-05T02:58:29Z)
DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference [2.2721856484014373]
We propose DGNN-Booster, which is a novel Field-Programmable Gate Array (FPGA) accelerator framework for real-time DGNN inference. We show that DGNN-Booster can achieve a speedup of up to 5.6x compared to the CPU baseline (6226R), 8.4x compared to the GPU baseline (A6000) and 2.1x compared to the FPGA baseline.
arXiv Detail & Related papers (2023-04-13T21:50:23Z)
GenGNN: A Generic FPGA Framework for Graph Neural Network Acceleration [1.460161657933122]
We propose a generic GNN acceleration framework using High-Level Synthesis (HLS), named GenGNN. We aim to deliver ultra-fast GNN inference without any graph pre-processing for real-time requirements. We verify our implementation on-board on the Xilinx Alveo U50 FPGA and observe a speed-up of up to 25x against CPU (6226R) baseline and 13x against GPU (A6000) baseline.
arXiv Detail & Related papers (2022-01-20T22:30:59Z)
TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs [21.63854538768414]
We propose TC-GNN, the first GNN framework based on GPU Tensor Core Units (TCUs). The core idea is to reconcile the "Sparse" GNN with the high-performance "Dense" TCUs. Rigorous experiments show an average speedup of 1.70x over the state-of-the-art DGL framework.
arXiv Detail & Related papers (2021-12-03T18:06:23Z)
BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices [9.406007544032848]
Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-Euclidean graph data. Running GNN inference in real time has become a challenging problem for some resource-limited edge-computing platforms. We propose BlockGNN, a software-hardware co-design approach to realize efficient GNN acceleration.
arXiv Detail & Related papers (2021-04-13T14:09:22Z)
A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights. We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
Identity-aware Graph Neural Networks [63.6952975763946]
We develop a class of message passing Graph Neural Networks (ID-GNNs) with greater expressive power than the 1-WL test. ID-GNN extends existing GNN architectures by inductively considering nodes' identities during message passing. We show that transforming existing GNNs to ID-GNNs yields on average 40% accuracy improvement on challenging node, edge, and graph property prediction tasks.
arXiv Detail & Related papers (2021-01-25T18:59:01Z)
GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. We present the GPT-GNN framework to initialize GNNs by generative pre-training. We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)
Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs [95.63153473559865]
Graph Neural Networks (GNNs) are emerging machine learning models on graphs. Most existing GNN models in practice are shallow and essentially feature-centric. We show empirically and analytically that the existing shallow GNNs cannot preserve graph structures well. We propose Eigen-GNN, a plug-in module to boost GNNs' ability in preserving graph structures.
arXiv Detail & Related papers (2020-06-08T02:47:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.