DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs
- URL: http://arxiv.org/abs/2411.16127v1
- Date: Mon, 25 Nov 2024 06:26:58 GMT
- Title: DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs
- Authors: Jiahui Liu, Zhenkun Cai, Zhiyong Chen, Minjie Wang
- Abstract summary: We propose a dynamic kernel fusion framework, DF-GNN, for the Attention Graph Neural Networks (AT-GNNs) family.
DF-GNN introduces a dynamic bi-level thread scheduling strategy, enabling flexible adjustments to thread scheduling.
It surpasses existing GNN kernel optimization works like cuGraph and dgNN, with speedups up to $7.0\times$ over the state-of-the-art non-fusion DGL sparse library.
- Score: 10.766922709869831
- Abstract: Attention Graph Neural Networks (AT-GNNs), such as GAT and Graph Transformer, have demonstrated superior performance compared to other GNNs. However, existing GNN systems struggle to efficiently train AT-GNNs on GPUs due to their intricate computation patterns. The execution of AT-GNN operations without kernel fusion results in heavy data movement and significant kernel launch overhead, while fixed thread scheduling in existing GNN kernel fusion strategies leads to sub-optimal performance, redundant computation and unbalanced workload. To address these challenges, we propose a dynamic kernel fusion framework, DF-GNN, for the AT-GNN family. DF-GNN introduces a dynamic bi-level thread scheduling strategy, enabling flexible adjustments to thread scheduling while retaining the benefits of shared memory within the fused kernel. DF-GNN tailors specific thread scheduling for operations in AT-GNNs and considers the performance bottleneck shift caused by the presence of super nodes. Additionally, DF-GNN is integrated with the PyTorch framework for high programmability. Evaluations across diverse GNN models and multiple datasets reveal that DF-GNN surpasses existing GNN kernel optimization works like cuGraph and dgNN, with speedups up to $7.0\times$ over the state-of-the-art non-fusion DGL sparse library. Moreover, it achieves an average speedup of $2.16\times$ in end-to-end training compared to the popular GNN computing framework DGL.
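To see why fusion matters here, consider the per-edge attention pipeline of a GAT layer. Run op-by-op, every step is its own kernel launch, and every per-edge intermediate round-trips through global memory; a fused kernel keeps those intermediates in shared memory and registers. Below is a minimal PyTorch sketch of the unfused pipeline (plain tensor ops for illustration only, not DF-GNN's API; the function and tensor names are ours):

```python
import torch

def gat_attention_unfused(src, dst, h, a_src, a_dst):
    """One GAT attention head over an edge list (edge i runs src[i] -> dst[i]).

    Executed op-by-op, each line below is at least one separate CUDA kernel,
    and every |E|-sized intermediate is materialized in global memory.
    """
    n = h.size(0)
    # Per-edge unnormalized scores: project, gather to edges, activate.
    e_score = (h @ a_src)[src] + (h @ a_dst)[dst]
    e_score = torch.nn.functional.leaky_relu(e_score, 0.2)
    # Edge softmax, normalized over each destination node's incoming edges.
    e_max = torch.full((n,), float("-inf"), device=h.device)
    e_max = e_max.scatter_reduce(0, dst, e_score, reduce="amax")
    e_exp = torch.exp(e_score - e_max[dst])
    denom = torch.zeros(n, device=h.device).index_add(0, dst, e_exp)
    alpha = e_exp / denom[dst]
    # Weighted aggregation of source features into destination nodes.
    return torch.zeros_like(h).index_add(0, dst, alpha.unsqueeze(1) * h[src])
```

Counting the launches and the |E|-sized temporaries (`e_score`, `e_exp`, `alpha`) makes the fusion payoff concrete; per the abstract, the bi-level scheduling lets each of these stages keep its own thread-to-edge mapping inside the single fused kernel.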
Related papers
- T-GAE: Transferable Graph Autoencoder for Network Alignment [79.89704126746204]
T-GAE is a graph autoencoder framework that leverages the transferability and stability of GNNs to achieve efficient network alignment without retraining.
Our experiments demonstrate that T-GAE outperforms the state-of-the-art optimization method and the best GNN approach by up to 38.7% and 50.8%, respectively.
arXiv Detail & Related papers (2023-10-05T02:58:29Z)
- Cached Operator Reordering: A Unified View for Fast GNN Training [24.917363701638607]
Graph Neural Networks (GNNs) are a powerful tool for handling structured graph data and addressing tasks such as node classification, graph classification, and clustering.
However, the sparse nature of GNN computation poses new challenges for performance optimization compared to traditional deep neural networks.
We address these challenges by providing a unified view of GNN computation, I/O, and memory.
arXiv Detail & Related papers (2023-08-23T12:27:55Z)
- Distributed Graph Neural Network Training: A Survey [51.77035975191926]
Graph neural networks (GNNs) are deep learning models that are trained on graphs and have been successfully applied in various domains.
Despite the effectiveness of GNNs, it is still challenging for GNNs to efficiently scale to large graphs.
As a remedy, distributed computing has become a promising solution for training large-scale GNNs.
arXiv Detail & Related papers (2022-11-01T01:57:00Z)
- Robust Graph Neural Networks using Weighted Graph Laplacian [1.8292714902548342]
Graph neural networks (GNNs) are vulnerable to noise and adversarial attacks in input data.
We propose a generic framework for robustifying GNNs, known as Weighted Laplacian GNN (RWL-GNN).
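For background on the weighted-Laplacian construction the title refers to (standard graph-signal-processing machinery; the paper's exact objective may differ): given a symmetric, nonnegative edge-weight matrix W, the Laplacian is L = D - W, and tr(X^T L X) penalizes feature differences across strongly weighted edges, so learning W to down-weight noisy edges robustifies the downstream GNN. A tiny dense sketch:

```python
import torch

def weighted_laplacian(W):
    """L = D - W for a symmetric, nonnegative edge-weight matrix W."""
    return torch.diag(W.sum(dim=1)) - W

def smoothness(X, W):
    """tr(X^T L X): small when strongly connected nodes have similar features.

    Minimizing this jointly over W (with constraints keeping W close to the
    observed graph) pushes weight away from edges joining dissimilar nodes.
    """
    return torch.trace(X.T @ weighted_laplacian(W) @ X)
```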
arXiv Detail & Related papers (2022-08-03T05:36:35Z)
- Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis [28.464210819376593]
Graph neural networks (GNNs) are among the most powerful tools in deep learning.
They routinely solve complex problems on unstructured networks, such as node classification, graph classification, or link prediction, with high accuracy.
However, both inference and training of GNNs are complex, and they uniquely combine the features of irregular graph processing with dense and regular computations.
This complexity makes it very challenging to execute GNNs efficiently on modern massively parallel architectures.
arXiv Detail & Related papers (2022-05-19T17:11:45Z)
- TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs [21.63854538768414]
We propose TC-GNN, the first GNN framework based on GPU Tensor Core Units (TCUs).
The core idea is to reconcile the "Sparse" GNN with the high-performance "Dense" TCUs.
Rigorous experiments show an average speedup of $1.70\times$ over the state-of-the-art DGL framework.
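A schematic picture of that reconciliation: condense the scattered nonzero columns of a block of adjacency rows into a compact dense tile, gather the matching feature rows, and hand the small dense product to the Tensor Cores. A dense PyTorch mock-up of the idea (TC-GNN itself does this in CUDA with TCU intrinsics; shapes and names here are illustrative):

```python
import torch

def condensed_spmm_rowblock(A_block, X):
    """Multiply one sparse row block of A by dense X via a condensed dense tile.

    A_block: (tile_rows, N) 0/1 adjacency slice (kept dense here for clarity).
    X:       (N, F) node features.
    """
    # Union of active columns across the row block: the "condensation" step.
    cols = A_block.sum(dim=0).nonzero(as_tuple=True)[0]
    A_tile = A_block[:, cols]   # (tile_rows, k) dense tile, k << N for sparse graphs
    X_tile = X[cols]            # (k, F) gathered feature rows
    # On a GPU, this small dense product maps onto Tensor Core MMA instructions.
    return A_tile @ X_tile      # (tile_rows, F)
```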
arXiv Detail & Related papers (2021-12-03T18:06:23Z)
- BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices [9.406007544032848]
Graph Neural Networks (GNNs) are state-of-the-art algorithms for analyzing non-Euclidean graph data.
Performing GNN inference in real time has become a challenging problem on resource-limited edge-computing platforms.
We propose BlockGNN, a software-hardware co-design approach to realize efficient GNN acceleration.
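The block-circulant compression itself is standard: each b×b circulant block is determined by a single length-b vector, and multiplying by it is a circular convolution, i.e. an elementwise product in the FFT domain, so per-block storage drops from O(b^2) to O(b) and compute to O(b log b). A self-checking PyTorch sketch of one block (BlockGNN realizes this in hardware; the sketch only shows the math):

```python
import torch

def circulant_matvec(c, x):
    """y = C @ x, where C is the circulant matrix whose first column is c.

    Uses the identity C @ x = IFFT(FFT(c) * FFT(x)): O(b log b) work, and only
    the length-b vector c is stored instead of b*b weights.
    """
    return torch.fft.irfft(torch.fft.rfft(c) * torch.fft.rfft(x), n=x.numel())

# Sanity check against the explicit circulant matrix.
b = 8
c, x = torch.randn(b), torch.randn(b)
C = torch.stack([torch.roll(c, i) for i in range(b)], dim=1)  # column j = roll(c, j)
assert torch.allclose(C @ x, circulant_matvec(c, x), atol=1e-5)
```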
arXiv Detail & Related papers (2021-04-13T14:09:22Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
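In caricature, UGS attaches learnable masks to both the adjacency matrix and the weights, trains them with sparsity penalties, and then hard-prunes the lowest-magnitude entries of each; the surviving (sub-graph, sub-network) pair is the graph lottery ticket. A toy sketch under those assumptions (random tensors stand in for trained mask logits; this is not the paper's training loop):

```python
import torch

def prune_by_magnitude(mask, sparsity):
    """Zero out the lowest-magnitude fraction of a trained mask (hard prune)."""
    scores = mask.abs().flatten()
    k = int(sparsity * scores.numel())
    if k == 0:
        return torch.ones_like(mask)
    threshold = scores.kthvalue(k).values
    return (mask.abs() > threshold).float()

# Toy shapes; torch.rand stands in for mask logits learned under L1 penalties.
adj = (torch.rand(100, 100) < 0.05).float()   # graph adjacency
weight = torch.randn(64, 32)                  # one GNN weight matrix
m_graph, m_weight = torch.rand_like(adj), torch.rand_like(weight)

# Hard-prune both; the surviving pair is the "graph lottery ticket".
adj_ticket = adj * prune_by_magnitude(m_graph, sparsity=0.2)
weight_ticket = weight * prune_by_magnitude(m_weight, sparsity=0.5)
```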
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data.
We present the GPT-GNN framework to initialize GNNs by generative pre-training.
We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)
- Towards Deeper Graph Neural Networks with Differentiable Group Normalization [61.20639338417576]
Graph neural networks (GNNs) learn the representation of a node by aggregating its neighbors.
Over-smoothing is one of the key issues which limit the performance of GNNs as the number of layers increases.
We introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN).
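Concretely, the DGN mechanism can be approximated as: softly assign each node to one of G groups, batch-normalize each group's assignment-weighted embeddings separately, and add the result back with a small residual factor, so clusters stay statistically distinct instead of smoothing together. A sketch along those lines (group count and the residual scale are illustrative):

```python
import torch

class DiffGroupNorm(torch.nn.Module):
    """Sketch of differentiable group normalization for node embeddings."""

    def __init__(self, dim, groups, lam=0.01):
        super().__init__()
        self.assign = torch.nn.Linear(dim, groups)   # soft cluster assignment
        self.norms = torch.nn.ModuleList(
            torch.nn.BatchNorm1d(dim) for _ in range(groups))
        self.lam = lam

    def forward(self, x):                            # x: (num_nodes, dim)
        s = torch.softmax(self.assign(x), dim=1)     # (num_nodes, groups)
        # Normalize each group's weighted slice of the embeddings separately.
        out = sum(self.norms[g](s[:, g:g + 1] * x) for g in range(len(self.norms)))
        return x + self.lam * out                    # small residual correction
```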
arXiv Detail & Related papers (2020-06-12T07:18:02Z)
- Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs [95.63153473559865]
Graph Neural Networks (GNNs) are emerging machine learning models on graphs.
Most existing GNN models in practice are shallow and essentially feature-centric.
We show empirically and analytically that the existing shallow GNNs cannot preserve graph structures well.
We propose Eigen-GNN, a plug-in module to boost GNNs' ability to preserve graph structures.
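The plug-in idea is to concatenate the top-d eigenvectors of the (normalized) adjacency matrix onto the node features before the first layer, so even a shallow GNN sees global structure. A dense-matrix sketch (the normalization and eigen-basis choice follow common practice and may differ in detail from the paper):

```python
import torch

def eigen_features(adj, d):
    """Top-d eigenvectors of the symmetrically normalized adjacency (dense sketch)."""
    deg = adj.sum(dim=1).clamp(min=1.0)              # avoid div-by-zero on isolated nodes
    d_inv_sqrt = deg.pow(-0.5)
    adj_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = torch.linalg.eigh(adj_norm)   # eigenvalues in ascending order
    return eigvecs[:, -d:]                           # the d largest

# Plug-in usage: enrich node features with structure before the first GNN layer.
adj = (torch.rand(50, 50) < 0.1).float()
adj = ((adj + adj.T) > 0).float()                    # symmetrize the toy graph
x = torch.randn(50, 16)
x_aug = torch.cat([x, eigen_features(adj, d=8)], dim=1)  # (50, 24)
```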
arXiv Detail & Related papers (2020-06-08T02:47:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.