Benchmarking GPU and TPU Performance with Graph Neural Networks
- URL: http://arxiv.org/abs/2210.12247v1
- Date: Fri, 21 Oct 2022 21:03:40 GMT
- Title: Benchmarking GPU and TPU Performance with Graph Neural Networks
- Authors: Xiangyang Ju, Yunsong Wang, Daniel Murnane, Nicholas Choma, Steven
Farrell, Paolo Calafiura
- Abstract summary: This work analyzes and compares GPU and TPU performance when training a Graph Neural Network (GNN) developed to solve a real-life pattern recognition problem.
Characterizing the new class of models acting on sparse data may prove helpful in optimizing the design of deep learning libraries and future AI accelerators.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many artificial intelligence (AI) devices have been developed to accelerate
the training and inference of neural network models. The most common ones are
the Graphics Processing Unit (GPU) and Tensor Processing Unit (TPU). They are
highly optimized for dense data representations. However, sparse
representations such as graphs are prevalent in many domains, including
science. It is therefore important to characterize the performance of available
AI accelerators on sparse data. This work analyzes and compares GPU and TPU
performance when training a Graph Neural Network (GNN) developed to solve a
real-life pattern recognition problem. Characterizing the new class of models
acting on sparse data may prove helpful in optimizing the design of deep
learning libraries and future AI accelerators.
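As a rough illustration of the kind of measurement the abstract describes, the sketch below times one training step of a toy message-passing GNN on a GPU with PyTorch. This is not the paper's code: the model, tensor sizes, and helper names (ToyGNN, time_training_step) are illustrative assumptions only, and a TPU measurement would typically go through TensorFlow or torch_xla with a different synchronization strategy.

```python
# Hedged sketch, not the benchmarked tracking GNN: time one training step of a
# toy message-passing network on a GPU. All sizes and names are assumptions.
import time
import torch
import torch.nn as nn

class ToyGNN(nn.Module):
    """One round of edge-wise message passing followed by a node update."""
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)   # message from (src, dst) features
        self.upd = nn.Linear(2 * dim, dim)   # node update from (node, aggregate)

    def forward(self, x, edge_index):
        src, dst = edge_index                # sparse connectivity as index vectors
        m = torch.relu(self.msg(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros_like(x).index_add_(0, dst, m)   # scatter-sum aggregation
        return self.upd(torch.cat([x, agg], dim=-1))

def time_training_step(device="cuda", n_nodes=100_000, n_edges=1_000_000,
                       dim=64, iters=20):
    x = torch.randn(n_nodes, dim, device=device)
    edge_index = torch.randint(0, n_nodes, (2, n_edges), device=device)
    y = torch.randn(n_nodes, dim, device=device)
    model = ToyGNN(dim).to(device)
    opt = torch.optim.Adam(model.parameters())

    for _ in range(3):                       # warm-up: kernel selection, caching
        loss = (model(x, edge_index) - y).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        loss = (model(x, edge_index) - y).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    torch.cuda.synchronize()                 # GPU kernels are asynchronous
    print(f"{(time.perf_counter() - start) / iters * 1e3:.1f} ms / training step")

if __name__ == "__main__":
    time_training_step()
```

The explicit torch.cuda.synchronize() calls matter because GPU kernels launch asynchronously; without them the timer would largely measure kernel launches rather than the work itself.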
Related papers
- Harnessing Manycore Processors with Distributed Memory for Accelerated
Training of Sparse and Recurrent Models [43.1773057439246]
Current AI training infrastructure is dominated by single instruction multiple data (SIMD) and systolic array architectures.
We explore sparse and recurrent model training on a massively parallel multiple instruction multiple data architecture with distributed local memory.
arXiv Detail & Related papers (2023-11-07T23:18:35Z)
- RGCVAE: Relational Graph Conditioned Variational Autoencoder for Molecule Design [70.59828655929194]
Deep Graph Variational Autoencoders are among the most powerful machine learning tools for addressing this problem.
We propose RGCVAE, an efficient and effective Graph Variational Autoencoder based on: (i) an encoding network exploiting a new powerful Graph Isomorphism Network; (ii) a novel probabilistic decoding component.
arXiv Detail & Related papers (2023-05-19T14:23:48Z)
- Architectural Implications of Embedding Dimension during GCN on CPU and GPU [6.650945912906685]
Graph Convolutional Networks (GCNs) are a widely used type of GNN for transductive graph learning problems.
GCN is a challenging algorithm from an architecture perspective due to inherent sparsity, low data reuse, and massive memory capacity requirements (see the sparse-propagation sketch after this list).
arXiv Detail & Related papers (2022-12-01T19:23:12Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training method, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA and uses in total around 40% of the available hardware resources.
Compared to its full-precision software counterpart, it reduces the classification time by three orders of magnitude with only a small 4.5% impact on accuracy.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors [0.0]
Deep neural networks (DNNs) have the advantage that they can exploit a very large number of parameters, which enables them to solve complex tasks.
In computer vision and speech recognition, they achieve better accuracy than conventional algorithms and, in some tasks, even surpass human experts.
With the progress of DNNs in recent years, many other application fields, such as disease diagnosis and autonomous driving, are taking advantage of them.
arXiv Detail & Related papers (2021-04-19T12:50:27Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks [8.460826851547294]
Efficient graph analysis using modern machine learning is receiving a growing level of attention.
Deep learning approaches often operate over the entire adjacency matrix.
It is desirable to identify efficient measures to reduce both run-time and memory requirements.
arXiv Detail & Related papers (2020-10-23T19:47:42Z)
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
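The "Architectural Implications of Embedding Dimension during GCN on CPU and GPU" entry above attributes the difficulty to inherent sparsity and low data reuse, which also underlies the main paper's focus on sparse data. The sketch below is purely illustrative (arbitrary sizes, hypothetical variable names, and the usual degree normalization of the adjacency omitted); it separates GCN propagation into a dense GEMM and a sparse-dense matrix product to show where the irregular memory access enters.

```python
# Hedged illustration, not code from any listed paper: GCN propagation
# H' = A_hat @ (H @ W), with a sparse adjacency standing in for the normalized A_hat.
import torch

n_nodes, in_dim, out_dim, n_edges = 10_000, 128, 64, 200_000

# Random sparse connectivity (COO format) standing in for a real graph.
edge_index = torch.randint(0, n_nodes, (2, n_edges))
values = torch.ones(n_edges)
adj = torch.sparse_coo_tensor(edge_index, values, (n_nodes, n_nodes)).coalesce()

H = torch.randn(n_nodes, in_dim)   # node features
W = torch.randn(in_dim, out_dim)   # layer weights

# Dense part: a regular GEMM that GPUs and TPUs handle very well.
HW = H @ W                                   # (n_nodes, out_dim)

# Sparse part: SpMM with irregular memory access and little data reuse;
# this is the step that stresses dense-oriented accelerators on graph workloads.
H_next = torch.sparse.mm(adj, HW)            # (n_nodes, out_dim)
print(H_next.shape)
```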
This list is automatically generated from the titles and abstracts of the papers on this site.