Operation-Level Performance Benchmarking of Graph Neural Networks for
Scientific Applications
- URL: http://arxiv.org/abs/2207.09955v1
- Date: Wed, 20 Jul 2022 15:01:12 GMT
- Title: Operation-Level Performance Benchmarking of Graph Neural Networks for
Scientific Applications
- Authors: Ryien Hosseini, Filippo Simini, Venkatram Vishwanath
- Abstract summary: We profile and select low-level operations pertinent to Graph Neural Networks (GNNs) for scientific computing, as implemented in the PyTorch Geometric software framework.
These are then rigorously benchmarked on NVIDIA A100 GPUs for various combinations of input values, including tensor sparsity.
At a high level, we conclude that on NVIDIA systems, confounding bottlenecks such as memory inefficiency often dominate runtime costs more than data sparsity alone.
We hope that these results serve as a baseline for those developing these operations on specialized hardware and that our subsequent analysis helps to facilitate future software- and hardware-based optimizations of these operations, and thus scalable GNN performance as a whole.
- Score: 0.15469452301122172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As Graph Neural Networks (GNNs) increase in popularity for scientific machine
learning, their training and inference efficiency is becoming increasingly
critical. Additionally, the deep learning field as a whole is trending towards
wider and deeper networks, and ever-increasing data sizes, to the point where
hardware bottlenecks are often encountered. Emerging specialty hardware
platforms provide an exciting solution to this problem. In this paper, we
systematically profile and select low-level operations pertinent to GNNs for
scientific computing, as implemented in the PyTorch Geometric software framework.
These are then rigorously benchmarked on NVIDIA A100 GPUs for various
combinations of input values, including tensor sparsity. We then analyze these
results for each operation. At a high level, we conclude that on NVIDIA
systems: (1) confounding bottlenecks such as memory inefficiency often dominate
runtime costs more than data sparsity alone, (2) native PyTorch operations
are often as competitive as, or more competitive than, their PyTorch Geometric
equivalents, especially at low to moderate levels of input data sparsity, and (3) many
operations central to state-of-the-art GNN architectures have little to no
optimization for sparsity. We hope that these results serve as a baseline for
those developing these operations on specialized hardware and that our
subsequent analysis helps to facilitate future software- and hardware-based
optimizations of these operations and thus scalable GNN performance as a whole.
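To make the operation-level benchmarking concrete, the sketch below is a minimal, hypothetical harness (not the authors' actual benchmark code) that times a dense matmul against a sparse (COO) matrix-matrix product at several sparsity levels using CUDA events, the kind of comparison behind conclusion (1). The tensor shapes, sparsity levels, and iteration counts are illustrative assumptions.

```python
# Hypothetical sketch of an operation-level micro-benchmark in the spirit of the
# paper: dense matmul vs. sparse (COO) matmul at several sparsity levels.
# Shapes, sparsity levels, and iteration counts are illustrative assumptions.
import torch

def cuda_time_ms(fn, warmup=10, iters=50):
    """Return mean milliseconds per call, measured with CUDA events."""
    for _ in range(warmup):
        fn()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

device = "cuda"
n, d = 4096, 256                           # illustrative problem size
feats = torch.randn(n, d, device=device)   # node-feature-like dense operand

for sparsity in (0.5, 0.9, 0.99):          # target fraction of zero entries
    dense = torch.rand(n, n, device=device)
    dense[dense < sparsity] = 0.0          # zero out entries to reach the target
    sparse = dense.to_sparse()             # COO representation of the same matrix

    t_dense = cuda_time_ms(lambda: dense @ feats)
    t_sparse = cuda_time_ms(lambda: torch.sparse.mm(sparse, feats))
    print(f"sparsity={sparsity:.2f}  dense={t_dense:.3f} ms  sparse={t_sparse:.3f} ms")
```

A similar harness can compare the scatter-style aggregation used by PyTorch Geometric (via the torch_scatter package) against native torch.Tensor.index_add_ or scatter_add_, which relates to conclusion (2); timing with CUDA events rather than Python wall-clock time avoids under-counting asynchronous kernel launches.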
Related papers
- AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs [26.607519045805745]
Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features.
Prior works have proposed exploiting sparsity in the input graph to accelerate GNNs, using full-graph-level or block-level sparsity formats.
We show that these approaches fail to balance the benefit of sparsity against kernel execution efficiency.
We propose a novel system, referred to as AdaptGear, that addresses the challenge of optimizing GNN performance.
arXiv Detail & Related papers (2023-05-27T08:22:12Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Hardware/Software Co-Programmable Framework for Computational SSDs to Accelerate Deep Learning Service on Large-Scale Graphs [8.698995648930806]
Graph neural networks (GNNs) process large-scale graphs that can consist of a hundred billion edges.
We propose a novel deep learning framework on large graphs, HolisticGNN, that provides an easy-to-use, near-storage inference infrastructure for fast, energy-efficient GNN processing.
arXiv Detail & Related papers (2022-01-23T06:08:18Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study [100.27567794045045]
Training deep graph neural networks (GNNs) is notoriously hard.
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
arXiv Detail & Related papers (2021-08-24T05:00:37Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors [0.0]
Deep neural networks (DNNs) have the advantage that they can take into account a large number of parameters, which enables them to solve complex tasks.
In computer vision and speech recognition, they achieve better accuracy than conventional algorithms, and in some tasks they even surpass human experts.
With the progress of DNNs in recent years, many other fields of application such as diagnosis of diseases and autonomous driving are taking advantage of them.
arXiv Detail & Related papers (2021-04-19T12:50:27Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Analyzing the Performance of Graph Neural Networks with Pipe Parallelism [2.269587850533721]
We focus on Graph Neural Networks (GNNs) that have found great success in tasks such as node or edge classification and link prediction.
New approaches for processing larger networks are needed to advance graph techniques.
We study how GNNs could be parallelized using existing tools and frameworks that are known to be successful in the deep learning community.
arXiv Detail & Related papers (2020-12-20T04:20:38Z)
- Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks [8.460826851547294]
Efficient graph analysis using modern machine learning is receiving a growing level of attention.
Deep learning approaches often operate over the entire adjacency matrix.
It is desirable to identify efficient measures to reduce both run-time and memory requirements; a hedged half-precision sketch follows this entry.
arXiv Detail & Related papers (2020-10-23T19:47:42Z)
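To make the half-precision idea concrete, here is a minimal, hypothetical sketch (not code from that paper) of a GCN-style aggregation, X' = A_hat @ (X @ W), run under PyTorch's autocast in float16; the random adjacency, its normalization, and all sizes are illustrative assumptions.

```python
# Hedged illustration of a GCN-style aggregation executed under mixed precision.
# The random adjacency, its normalization, and all sizes are assumptions only.
import torch

device = "cuda"
n, d_in, d_out = 2048, 128, 64

adj = (torch.rand(n, n, device=device) < 0.01).float()       # sparse-ish random adjacency
adj = adj + torch.eye(n, device=device)                       # add self-loops
deg_inv_sqrt = adj.sum(dim=1).clamp(min=1.0).pow(-0.5)
a_hat = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]   # symmetric normalization

x = torch.randn(n, d_in, device=device)
lin = torch.nn.Linear(d_in, d_out, bias=False).to(device)

with torch.autocast(device_type="cuda", dtype=torch.float16):
    out = a_hat @ lin(x)   # matmuls run in float16 inside the autocast region

print(out.dtype)           # torch.float16
```

Keeping the adjacency and feature matmuls in half precision can roughly halve their memory traffic, which is the kind of run-time and memory saving the entry above alludes to.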