LW-GCN: A Lightweight FPGA-based Graph Convolutional Network Accelerator
- URL: http://arxiv.org/abs/2111.03184v1
- Date: Thu, 4 Nov 2021 22:29:53 GMT
- Title: LW-GCN: A Lightweight FPGA-based Graph Convolutional Network Accelerator
- Authors: Zhuofu Tao, Chen Wu, Yuan Liang, and Lei He
- Abstract summary: Graph convolutional networks (GCNs) have been introduced to effectively process non-Euclidean graph data.
LW-GCN decomposes the main GCN operations into sparse-dense matrix multiplication (SDMM) and dense matrix multiplication (DMM).
Compared to existing CPU, GPU, and state-of-the-art FPGA-based accelerators, LW-GCN reduces latency by up to 60x, 12x, and 1.7x, and increases power efficiency by up to 912x, 511x, and 3.87x, respectively.
- Score: 14.145707219377917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks (GCNs) have been introduced to effectively
process non-Euclidean graph data. However, GCNs incur large amounts of
irregularity in computation and memory access, which prevents efficient use of
traditional neural network accelerators. Moreover, existing dedicated GCN
accelerators demand high memory volumes and are difficult to implement on
resource-limited edge devices. In this work, we propose LW-GCN, a lightweight
FPGA-based accelerator with a software-hardware co-designed process to tackle
irregularity in computation and memory access in GCN inference. LW-GCN
decomposes the main GCN operations into sparse-dense matrix multiplication
(SDMM) and dense matrix multiplication (DMM). We propose a novel compression
format to balance workload across PEs and prevent data hazards. Moreover, we
apply data quantization and workload tiling, and map both SDMM and DMM of GCN
inference onto a uniform architecture on resource-limited hardware. Evaluations
of GCN and GraphSAGE are performed on a Xilinx Kintex-7 FPGA with three popular
datasets. Compared to existing CPU, GPU, and state-of-the-art FPGA-based
accelerators, LW-GCN reduces latency by up to 60x, 12x, and 1.7x, and increases
power efficiency by up to 912x, 511x, and 3.87x, respectively. Furthermore,
compared with NVIDIA's latest edge GPU, the Jetson Xavier NX, LW-GCN achieves
speedups and energy savings of 32x and 84x, respectively.
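For intuition, a single GCN layer H' = σ(Â·H·W) factors exactly into the two kernels the abstract names: the dense feature transform H·W is a DMM, while multiplying by the highly sparse normalized adjacency Â is an SDMM. Below is a minimal NumPy/SciPy sketch of this decomposition; the dimensions, normalization, and ReLU activation are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy import sparse

def gcn_layer(A_hat, H, W):
    """One GCN layer sigma(A_hat @ H @ W), split into the two kernels
    that LW-GCN maps onto one uniform architecture."""
    X = H @ W                  # DMM: dense feature transform (N x F_out)
    Z = A_hat @ X              # SDMM: sparse aggregation over neighbors
    return np.maximum(Z, 0.0)  # ReLU activation (illustrative choice)

# Tiny example with a randomly generated sparse graph
N, F_in, F_out = 6, 4, 3
rng = np.random.default_rng(0)
A = sparse.random(N, N, density=0.3, format="csr", random_state=0)
A_hat = A + sparse.eye(N, format="csr")        # add self-loops
deg = np.asarray(A_hat.sum(axis=1)).ravel()
D_inv_sqrt = sparse.diags(1.0 / np.sqrt(deg))
A_hat = D_inv_sqrt @ A_hat @ D_inv_sqrt        # symmetric normalization
H = rng.standard_normal((N, F_in))
W = rng.standard_normal((F_in, F_out))
print(gcn_layer(A_hat, H, W).shape)            # (6, 3)
```

Computing H·W before the sparse multiply is the standard ordering, since it shrinks the dense operand before the irregular SDMM touches it.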
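The abstract does not describe the compression format itself, so the snippet below only illustrates the load-balancing goal under an assumed scheme: distribute the nonzeros of the sparse operand round-robin across PEs so that every PE performs a near-equal number of multiply-accumulates. The paper's actual format and its data-hazard handling are more involved.

```python
import numpy as np
from scipy import sparse

def balance_nonzeros(A, num_pes):
    """Hypothetical round-robin assignment of nonzeros to PEs.
    This is NOT LW-GCN's compression format, only the balancing idea:
    PE workloads differ by at most one nonzero."""
    coo = sparse.coo_matrix(A)
    buckets = [[] for _ in range(num_pes)]
    for i, (r, c, v) in enumerate(zip(coo.row, coo.col, coo.data)):
        buckets[i % num_pes].append((int(r), int(c), float(v)))
    return buckets

A = sparse.random(8, 8, density=0.25, random_state=1)
work = balance_nonzeros(A, num_pes=4)
print([len(b) for b in work])  # near-equal nonzero counts per PE
```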
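Similarly, the abstract mentions data quantization but not the precision used; a generic uniform fixed-point quantizer, with bit widths chosen arbitrarily here, looks like this:

```python
import numpy as np

def quantize_fixed_point(x, total_bits=16, frac_bits=8):
    """Uniform fixed-point quantization. The bit widths are illustrative
    assumptions; the abstract does not state LW-GCN's chosen precision."""
    scale = 2.0 ** frac_bits
    lo = -(2 ** (total_bits - 1))      # most negative representable code
    hi = 2 ** (total_bits - 1) - 1     # most positive representable code
    codes = np.clip(np.round(x * scale), lo, hi)
    return codes / scale               # dequantized values

x = np.array([0.1234, -1.5, 3.14159])
print(quantize_fixed_point(x))         # values snapped to the 1/256 grid
```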
Related papers
- Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution
Networks [12.181052673940465]
Graph Convolutional Networks (GCNs) are pivotal in extracting latent information from graph data across various domains.
We present Accel-GCN, a GPU accelerator architecture for GCNs.
Evaluation of Accel-GCN across 18 benchmark graphs reveals that it outperforms cuSPARSE, GNNAdvisor, and graph-BLAST by 1.17x, 1.86x, and 2.94x, respectively.
arXiv Detail & Related papers (2023-08-22T23:12:17Z)
- SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional
Network Accelerators [6.582242235154822]
Graph convolutional networks (GCNs) are becoming increasingly popular as they overcome the limited applicability of prior neural networks.
In this paper, we propose SGCN, a fast and energy-efficient GCN accelerator.
We show that SGCN achieves 1.71x speedup and 43.9% higher energy efficiency compared to the existing accelerators.
arXiv Detail & Related papers (2023-01-25T02:34:01Z)
- GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm
and Accelerator Co-Design [27.311994997480745]
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art graph learning model.
It can be notoriously challenging to perform GCN inference over large graph datasets.
This paper proposes a GCN algorithm and accelerator co-design framework dubbed GCoD, which can largely alleviate the irregularity of GCN workloads.
arXiv Detail & Related papers (2021-12-22T00:30:50Z)
- MG-GCN: Scalable Multi-GPU GCN Training Framework [1.7188280334580197]
Full batch training of Graph Convolutional Network (GCN) models is not feasible on a single GPU for large graphs.
MG-GCN employs multiple high-performance computing optimizations, including the efficient re-use of memory buffers.
MG-GCN achieves super-linear speedup with respect to DGL, on the Reddit graph on both DGX-1 (V100) and DGX-A100.
arXiv Detail & Related papers (2021-10-17T00:41:43Z)
- GNNIE: GNN Inference Engine with Load-balancing and Graph-Specific
Caching [2.654276707313136]
GNNIE is an accelerator designed to run a broad range of Graph Neural Networks (GNNs).
It tackles workload imbalance by (i) splitting node feature operands into blocks, (ii) reordering and redistributing computations, and (iii) using a flexible MAC architecture with low communication overheads among the processing elements.
GNNIE achieves average speedups of over 8890x over a CPU and 295x over a GPU across multiple datasets on graph attention networks (GATs), graph convolutional networks (GCNs), GraphSAGE, GINConv, and DiffPool.
arXiv Detail & Related papers (2021-05-21T20:07:14Z)
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a 3712x speedup with 1301.25x energy reduction over a CPU, and a 35.4x speedup with 17.66x energy reduction over a GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural
Networks [58.48833325238537]
Full-batch training of Graph Neural Network (GNN) models to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- Bi-GCN: Binary Graph Convolutional Network [57.733849700089955]
We propose a Binary Graph Convolutional Network (Bi-GCN), which binarizes both the network parameters and input node features.
Our Bi-GCN can reduce memory consumption by an average of 30x for both the network parameters and input data, and accelerate inference by an average of 47x (a generic binarization sketch appears after this list).
arXiv Detail & Related papers (2020-10-15T07:26:23Z)
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradients, over-smoothing, and over-fitting when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
- Graph Highway Networks [77.38665506495553]
Graph Convolution Networks (GCN) are widely used in learning graph representations due to their effectiveness and efficiency.
They suffer from the notorious over-smoothing problem, in which the learned representations converge to nearly identical vectors as many layers are stacked.
We propose Graph Highway Networks (GHNet), which utilize gating units to balance the trade-off between homogeneity and heterogeneity in the GCN learning process (a generic gating sketch appears after this list).
arXiv Detail & Related papers (2020-04-09T16:26:43Z)
- L$^2$-GCN: Layer-Wise and Learned Efficient Training of Graph
Convolutional Networks [118.37805042816784]
Graph convolution networks (GCN) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets.
We propose a novel efficient layer-wise training framework for GCN (L-GCN), that disentangles feature aggregation and feature transformation during training.
Experiments show that L-GCN is faster than the state of the art by at least an order of magnitude, with consistent memory usage that does not depend on dataset size.
arXiv Detail & Related papers (2020-03-30T16:37:56Z)
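The Bi-GCN entry above binarizes both weights and features. As a rough illustration, XNOR-Net-style binarization replaces a real-valued tensor with its sign pattern plus a single scaling factor; Bi-GCN's exact scaling scheme may differ, so treat this as an assumption:

```python
import numpy as np

def binarize(x):
    """XNOR-Net-style binarization (an assumption here, not necessarily
    Bi-GCN's exact formulation): keep only sign(x) plus one scale alpha,
    so the tensor stores 1 bit per element instead of 32."""
    alpha = np.abs(x).mean()        # scalar scale for the sign pattern
    return np.sign(x), alpha        # approximates x ~= alpha * sign(x)

W = np.random.default_rng(0).standard_normal((4, 3))
signs, alpha = binarize(W)
W_approx = alpha * signs
print(np.abs(W - W_approx).mean())  # average approximation error
```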
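Likewise, for the GHNet entry, a generic highway-style gate is sketched below under the usual highway-network formulation (GHNet's exact gating may differ): a learned gate T blends each node's transformed representation with its untransformed input, which counteracts over-smoothing.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_gate(H_in, H_gcn, W_t, b_t):
    """Generic highway-style gate (illustrative, not GHNet's exact form):
    a per-feature gate T in (0, 1) blends the GCN output H_gcn with the
    untransformed input H_in, preserving heterogeneity across layers."""
    T = sigmoid(H_in @ W_t + b_t)
    return T * H_gcn + (1.0 - T) * H_in

rng = np.random.default_rng(0)
N, F = 5, 4
H_in = rng.standard_normal((N, F))
H_gcn = rng.standard_normal((N, F))   # stand-in for a GCN layer's output
W_t = rng.standard_normal((F, F))
b_t = np.zeros(F)
print(highway_gate(H_in, H_gcn, W_t, b_t).shape)  # (5, 4)
```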
This list is automatically generated from the titles and abstracts of the papers on this site.