BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant
Weight Matrices
- URL: http://arxiv.org/abs/2104.06214v1
- Date: Tue, 13 Apr 2021 14:09:22 GMT
- Title: BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant
Weight Matrices
- Authors: Zhe Zhou, Bizhao Shi, Zhe Zhang, Yijin Guan, Guangyu Sun, Guojie Luo
- Abstract summary: Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-Euclidean graph data.
Performing GNN inference in real time has become a challenging problem for resource-limited edge-computing platforms.
We propose BlockGNN, a software-hardware co-design approach to realize efficient GNN acceleration.
- Score: 9.406007544032848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, Graph Neural Networks (GNNs) appear to be state-of-the-art
algorithms for analyzing non-Euclidean graph data. By applying deep learning to
extract high-level representations from graph structures, GNNs achieve
extraordinary accuracy and great generalization ability in various tasks.
However, with the ever-increasing graph sizes, more and more complicated GNN
layers, and higher feature dimensions, the computational complexity of GNNs
grows exponentially. Performing GNN inference in real time has become a
challenging problem, especially on resource-limited edge-computing platforms.
To tackle this challenge, we propose BlockGNN, a software-hardware co-design
approach to realize efficient GNN acceleration. At the algorithm level, we
propose to leverage block-circulant weight matrices to greatly reduce the
complexity of various GNN models. At the hardware design level, we propose a
pipelined CirCore architecture, which supports efficient computation on
block-circulant matrices. Based on CirCore, we present a novel BlockGNN
accelerator to compute various GNNs with low latency. Moreover, to determine
the optimal configurations for diverse deployed tasks, we also introduce a
performance and resource model that helps choose the optimal hardware
parameters automatically. Comprehensive experiments on the ZC706 FPGA platform
demonstrate that on various GNN tasks, BlockGNN achieves up to $8.3\times$
speedup compared to the baseline HyGCN architecture and $111.9\times$ energy
reduction compared to the Intel Xeon CPU platform.
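As a rough illustration of the algorithm-level idea, the NumPy sketch below is a minimal example under assumed block sizes and shapes, not the paper's implementation or the CirCore pipeline: it multiplies a block-circulant weight matrix by a vector via the FFT, so each $k \times k$ circulant block is stored as a single length-$k$ vector (storage drops from $k^2$ to $k$) and each block product costs $O(k \log k)$ instead of $O(k^2)$.
```python
import numpy as np

def block_circulant_matvec(blocks, x):
    """Multiply an implied block-circulant matrix W by a vector x using the FFT.

    blocks: array of shape (p, q, k); blocks[i, j] is the first column of the
            (i, j)-th k x k circulant block of W.
    x:      input vector of length q * k.
    Returns W @ x as a vector of length p * k.
    """
    p, q, k = blocks.shape
    x_fft = np.fft.fft(x.reshape(q, k), axis=1)   # FFT of every input sub-vector
    y_fft = np.zeros((p, k), dtype=complex)
    for i in range(p):
        for j in range(q):
            # circulant(w) @ v == IFFT(FFT(w) * FFT(v))  (circular convolution)
            y_fft[i] += np.fft.fft(blocks[i, j]) * x_fft[j]
    return np.fft.ifft(y_fft, axis=1).real.reshape(p * k)

def circulant(w):
    """Expand a defining vector into its dense circulant matrix (reference only)."""
    return np.stack([np.roll(w, s) for s in range(len(w))], axis=1)

# Sanity check against the dense product; the shapes here are arbitrary assumptions.
rng = np.random.default_rng(0)
p, q, k = 2, 3, 4
blocks = rng.standard_normal((p, q, k))
x = rng.standard_normal(q * k)
dense = np.block([[circulant(blocks[i, j]) for j in range(q)] for i in range(p)])
assert np.allclose(block_circulant_matvec(blocks, x), dense @ x)
```
In a GNN layer, this substitution would target the dense weight matrices of the feature-transformation steps; the CirCore pipeline described above is the hardware counterpart for this kind of block-circulant computation.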
Related papers
- Spatio-Spectral Graph Neural Networks [50.277959544420455]
We propose Spatio-Spectral Graph Neural Networks (S$^2$GNNs).
S$^2$GNNs combine spatially and spectrally parametrized graph filters.
We show that S$^2$GNNs overcome over-squashing and yield strictly tighter approximation-theoretic error bounds than MPGNNs.
arXiv Detail & Related papers (2024-05-29T14:28:08Z)
- MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training [7.193336207798203]
We present MaxK-GNN, an advanced high-performance GPU training system integrating algorithm and system innovation.
Experiments show that the MaxK-GNN system can approach the theoretical speedup limit given by Amdahl's law.
We achieve accuracy comparable to SOTA GNNs at a significantly higher speed: a 3.22/4.24 times speedup on Reddit (versus theoretical limits of 5.52/7.27 times); a toy Amdahl's-law calculation is sketched below.
arXiv Detail & Related papers (2023-12-14T05:00:49Z)
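For context on the Amdahl's-law claim above, the overall speedup from accelerating only part of a workload is bounded by the fraction of time that part occupies. The snippet below is a toy calculation with made-up numbers (not figures from the MaxK-GNN paper) showing how such a theoretical limit is computed.
```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the runtime is accelerated by factor s:
    S = 1 / ((1 - f) + f / s); as s grows without bound, S approaches 1 / (1 - f)."""
    return 1.0 / ((1.0 - f) + f / s)

# Hypothetical example: if 85% of training time sits in the accelerated kernels,
# the ceiling is 1 / (1 - 0.85) ~ 6.7x, no matter how fast those kernels become.
print(amdahl_speedup(0.85, 10.0))  # ~4.26x with a 10x kernel speedup
print(amdahl_speedup(0.85, 1e9))   # ~6.67x, the Amdahl limit
```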
- T-GAE: Transferable Graph Autoencoder for Network Alignment [79.89704126746204]
T-GAE is a graph autoencoder framework that leverages the transferability and stability of GNNs to achieve efficient network alignment without retraining.
Our experiments demonstrate that T-GAE outperforms the state-of-the-art optimization method and the best GNN approach by up to 38.7% and 50.8%, respectively.
arXiv Detail & Related papers (2023-10-05T02:58:29Z)
- Cached Operator Reordering: A Unified View for Fast GNN Training [24.917363701638607]
Graph Neural Networks (GNNs) are a powerful tool for handling structured graph data and addressing tasks such as node classification, graph classification, and clustering.
However, the sparse nature of GNN computation poses new challenges for performance optimization compared to traditional deep neural networks.
We address these challenges by providing a unified view of GNN computation, I/O, and memory.
arXiv Detail & Related papers (2023-08-23T12:27:55Z)
- Hardware-Aware Graph Neural Network Automated Design for Edge Computing Platforms [9.345807588929734]
HGNAS is proposed as the first Hardware-aware Graph Neural Architecture Search framework targeting resource-constrained edge devices.
Results show that HGNAS can achieve about a $10.6\times$ speedup and $88.2\%$ peak memory reduction with negligible accuracy loss compared to DGCNN on various edge devices.
arXiv Detail & Related papers (2023-03-20T05:18:31Z)
- Distributed Graph Neural Network Training: A Survey [51.77035975191926]
Graph neural networks (GNNs) are a type of deep learning model trained on graphs and have been successfully applied in various domains.
Despite the effectiveness of GNNs, it is still challenging for GNNs to efficiently scale to large graphs.
As a remedy, distributed computing has become a promising solution for training large-scale GNNs.
arXiv Detail & Related papers (2022-11-01T01:57:00Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis [28.464210819376593]
Graph neural networks (GNNs) are among the most powerful tools in deep learning.
They routinely solve complex problems on unstructured networks, such as node classification, graph classification, or link prediction, with high accuracy.
However, both inference and training of GNNs are complex, and they uniquely combine the features of irregular graph processing with dense and regular computations.
This complexity makes it very challenging to execute GNNs efficiently on modern massively parallel architectures.
arXiv Detail & Related papers (2022-05-19T17:11:45Z)
- TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs [21.63854538768414]
We propose TC-GNN, the first GNN acceleration framework based on GPU Tensor Core Units (TCUs).
The core idea is to reconcile the "Sparse" GNN with the high-performance "Dense" TCUs.
Rigorous experiments show an average $1.70\times$ speedup over the state-of-the-art DGL framework.
arXiv Detail & Related papers (2021-12-03T18:06:23Z)
- Training Graph Neural Networks with 1000 Layers [133.84813995275988]
We study reversible connections, group convolutions, weight tying, and equilibrium models to advance the memory and parameter efficiency of GNNs.
To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude.
arXiv Detail & Related papers (2021-06-14T15:03:00Z)
- Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs [95.63153473559865]
Graph Neural Networks (GNNs) are emerging machine learning models on graphs.
Most existing GNN models in practice are shallow and essentially feature-centric.
We show empirically and analytically that the existing shallow GNNs cannot preserve graph structures well.
We propose Eigen-GNN, a plug-in module that boosts the ability of GNNs to preserve graph structures (a generic sketch of the idea follows below).
arXiv Detail & Related papers (2020-06-08T02:47:38Z)
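In the spirit of the structure-preserving plug-in above, a generic way to expose graph structure to a shallow GNN is to append leading eigenvectors of a structure matrix to the node features. The sketch below illustrates this idea only; the toy graph, the dimensions, and the choice of the plain adjacency matrix are assumptions rather than the exact Eigen-GNN formulation.
```python
import numpy as np

def eigen_augment(adj: np.ndarray, features: np.ndarray, d: int) -> np.ndarray:
    """Concatenate node features with the d eigenvectors of the symmetric
    adjacency matrix that have the largest-magnitude eigenvalues, giving
    downstream GNN layers explicit structural coordinates."""
    eigvals, eigvecs = np.linalg.eigh(adj)                 # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(-np.abs(eigvals))[:d]]     # d leading eigenvectors
    return np.concatenate([features, top], axis=1)

# Tiny example: a 4-node path graph with 2-dimensional node features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = np.ones((4, 2))
print(eigen_augment(adj, feats, d=2).shape)  # (4, 4): 2 raw + 2 structural dimensions
```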