Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution
Networks
- URL: http://arxiv.org/abs/2308.11825v1
- Date: Tue, 22 Aug 2023 23:12:17 GMT
- Title: Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution
Networks
- Authors: Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen
Fang, Wei Zhang, Tong Geng, Omer Khan, and Caiwen Ding
- Abstract summary: Graph Convolutional Networks (GCNs) are pivotal in extracting latent information from graph data across various domains.
We present Accel-GCN, a GPU accelerator architecture for GCNs.
Evaluation of Accel-GCN across 18 benchmark graphs reveals that it outperforms cuSPARSE, GNNAdvisor, and graph-BLAST by factors of 1.17x, 1.86x, and 2.94x, respectively.
- Score: 12.181052673940465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Convolutional Networks (GCNs) are pivotal in extracting latent
information from graph data across various domains, yet their acceleration on
mainstream GPUs is challenged by workload imbalance and memory access
irregularity. To address these challenges, we present Accel-GCN, a GPU
accelerator architecture for GCNs. The design of Accel-GCN encompasses: (i) a
lightweight degree-sorting stage that groups nodes with similar degrees; (ii) a
block-level partition strategy that dynamically adjusts warp workload sizes,
enhancing shared memory locality and workload balance, and reducing metadata
overhead compared to designs like GNNAdvisor; (iii) a combined warp strategy
that improves memory coalescing and computational parallelism in the column
dimension of dense matrices.
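As a rough illustration of components (i) and (ii), the host-side sketch below sorts CSR rows by degree and then greedily packs consecutive rows into blocks with a bounded nonzero budget. This is a minimal sketch under stated assumptions, not the authors' implementation; the names `RowBlock`, `build_row_blocks`, and `kMaxNnzPerBlock`, and the specific budget value, are all illustrative.

```cuda
// Host-side sketch (assumed, not the released Accel-GCN code) of
// (i) lightweight degree sorting and (ii) block-level partitioning
// of CSR rows into balanced per-block workloads.
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

struct RowBlock {
    int32_t first_row;  // first position (in the sorted order) handled by this block
    int32_t num_rows;   // number of rows grouped into the block
};

constexpr int32_t kMaxNnzPerBlock = 256;  // assumed per-block nonzero budget

// Sort row indices by degree so rows with similar degrees are adjacent, then
// pack consecutive rows into blocks whose total nonzero count stays under a
// fixed budget, which keeps warp workloads balanced.
std::vector<RowBlock> build_row_blocks(const std::vector<int32_t>& row_ptr,
                                       std::vector<int32_t>& perm) {
    const int32_t n = static_cast<int32_t>(row_ptr.size()) - 1;
    perm.resize(n);
    std::iota(perm.begin(), perm.end(), 0);
    std::stable_sort(perm.begin(), perm.end(), [&](int32_t a, int32_t b) {
        return (row_ptr[a + 1] - row_ptr[a]) < (row_ptr[b + 1] - row_ptr[b]);
    });

    std::vector<RowBlock> blocks;
    int32_t start = 0, nnz_in_block = 0;
    for (int32_t i = 0; i < n; ++i) {
        int32_t deg = row_ptr[perm[i] + 1] - row_ptr[perm[i]];
        if (i > start && nnz_in_block + deg > kMaxNnzPerBlock) {
            blocks.push_back({start, i - start});
            start = i;
            nnz_in_block = 0;
        }
        nnz_in_block += deg;
    }
    if (start < n) blocks.push_back({start, n - start});
    return blocks;
}
```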
Utilizing these principles, we formulated a kernel for sparse matrix
multiplication (SpMM) in GCNs that employs block-level partitioning and
combined warp strategy. This approach improves performance and multi-level
memory efficiency, and optimizes memory bandwidth utilization by exploiting memory
coalescing and alignment. Evaluation of Accel-GCN across 18 benchmark graphs
reveals that it outperforms cuSPARSE, GNNAdvisor, and graph-BLAST by factors of
1.17x, 1.86x, and 2.94x, respectively. The results underscore
Accel-GCN as an effective solution for enhancing GCN computational efficiency.
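The kernel sketch below is a hedged approximation of the access pattern the abstract describes, not the released Accel-GCN kernel: each thread block consumes one `RowBlock` from the partition sketched above, and adjacent threads handle adjacent columns of the dense feature matrix, so loads from X and stores to Y in the column dimension are coalesced. Shared-memory staging of the sparse row and the full combined warp strategy are omitted for brevity.

```cuda
// Minimal SpMM kernel sketch (assumed structure): Y = A * X, with A in CSR
// form, rows permuted by degree, and work partitioned by RowBlock.
__global__ void spmm_row_block_kernel(
    const int32_t*  __restrict__ row_ptr,  // CSR row pointers of A (length n+1)
    const int32_t*  __restrict__ col_idx,  // CSR column indices of A
    const float*    __restrict__ vals,     // CSR values of A
    const int32_t*  __restrict__ perm,     // degree-sorted row permutation
    const RowBlock* __restrict__ blocks,   // block-level partition (see sketch above)
    const float*    __restrict__ X,        // dense input,  n x f, row-major
    float*          __restrict__ Y,        // dense output, n x f, row-major
    int32_t f)                             // feature (column) dimension
{
    RowBlock blk = blocks[blockIdx.x];
    for (int32_t r = 0; r < blk.num_rows; ++r) {
        int32_t row = perm[blk.first_row + r];
        // Adjacent threads read adjacent columns of X and write adjacent
        // columns of Y, giving coalesced global-memory accesses.
        for (int32_t c = threadIdx.x; c < f; c += blockDim.x) {
            float acc = 0.0f;
            for (int32_t e = row_ptr[row]; e < row_ptr[row + 1]; ++e)
                acc += vals[e] * X[col_idx[e] * f + c];
            Y[row * f + c] = acc;
        }
    }
}
```

A launch such as `spmm_row_block_kernel<<<num_blocks, 128>>>(...)` would assign one partition block per CUDA thread block; in the actual design the column-dimension work is further spread across combined warps, which this sketch does not attempt to reproduce.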
Related papers
- Efficient Message Passing Architecture for GCN Training on HBM-based FPGAs with Orthogonal Topology On-Chip Networks [0.0]
Graph Convolutional Networks (GCNs) are state-of-the-art deep learning models for representation learning on graphs.
We propose a message-passing architecture that leverages NUMA-based memory access properties.
We also re-engineered the backpropagation algorithm specific to GCNs within our proposed accelerator.
arXiv Detail & Related papers (2024-11-06T12:00:51Z) - Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [56.26113670151363]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC).
CGC achieves state-of-the-art performance with a more efficient condensation process.
arXiv Detail & Related papers (2024-05-22T14:57:09Z) - Cached Operator Reordering: A Unified View for Fast GNN Training [24.917363701638607]
Graph Neural Networks (GNNs) are a powerful tool for handling structured graph data and addressing tasks such as node classification, graph classification, and clustering.
However, the sparse nature of GNN computation poses new challenges for performance optimization compared to traditional deep neural networks.
We address these challenges by providing a unified view of GNN computation, I/O, and memory.
arXiv Detail & Related papers (2023-08-23T12:27:55Z) - SPA-GCN: Efficient and Flexible GCN Accelerator with an Application for
Graph Similarity Computation [7.54579279348595]
We propose a flexible architecture called SPA-GCN for accelerating Graph Convolutional Networks (GCN) on graphs.
We show that SPA-GCN can deliver a high speedup compared to a multi-core CPU implementation and a GPU implementation.
arXiv Detail & Related papers (2021-11-10T20:47:57Z) - GNNIE: GNN Inference Engine with Load-balancing and Graph-Specific
Caching [2.654276707313136]
GNNIE is an accelerator designed to run a broad range of Graph Neural Networks (GNNs).
It tackles workload imbalance by (i) splitting node feature operands into blocks, (ii) reordering and redistributing computations, and (iii) using a flexible MAC architecture with low communication overheads among the processing elements.
GNNIE achieves average speedups of over 8890x over a CPU and 295x over a GPU across multiple datasets on graph attention networks (GATs), graph convolutional networks (GCNs), GraphSAGE, GINConv, and DiffPool.
arXiv Detail & Related papers (2021-05-21T20:07:14Z) - DistGNN: Scalable Distributed Training for Large-Scale Graph Neural
Networks [58.48833325238537]
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z) - Towards Efficient Graph Convolutional Networks for Point Cloud Handling [181.59146413326056]
We aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds.
A series of experiments show that optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed.
arXiv Detail & Related papers (2021-04-12T17:59:16Z) - Bi-GCN: Binary Graph Convolutional Network [57.733849700089955]
We propose a Binary Graph Convolutional Network (Bi-GCN), which binarizes both the network parameters and input node features.
Our Bi-GCN can reduce the memory consumption by an average of 30x for both the network parameters and input data, and accelerate the inference speed by an average of 47x.
arXiv Detail & Related papers (2020-10-15T07:26:23Z) - Graph Highway Networks [77.38665506495553]
Graph Convolution Networks (GCN) are widely used in learning graph representations due to their effectiveness and efficiency.
They suffer from the notorious over-smoothing problem, in which the learned representations converge to similar vectors when many layers are stacked.
We propose Graph Highway Networks (GHNet) which utilize gating units to balance the trade-off between homogeneity and heterogeneity in the GCN learning process.
arXiv Detail & Related papers (2020-04-09T16:26:43Z) - L$^2$-GCN: Layer-Wise and Learned Efficient Training of Graph
Convolutional Networks [118.37805042816784]
Graph convolution networks (GCN) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets.
We propose a novel, efficient layer-wise training framework for GCN (L-GCN) that disentangles feature aggregation and feature transformation during training.
Experiments show that L-GCN is faster than the state of the art by at least an order of magnitude, with consistent memory usage that does not depend on dataset size.
arXiv Detail & Related papers (2020-03-30T16:37:56Z) - GraphACT: Accelerating GCN Training on CPU-FPGA Heterogeneous Platforms [1.2183405753834562]
Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art deep learning model for representation learning on graphs.
It is challenging to accelerate training of GCNs due to substantial and irregular data communication.
We design a novel accelerator for training GCNs on CPU-FPGA heterogeneous systems.
arXiv Detail & Related papers (2019-12-31T21:19:01Z)