Brief Announcement: On the Limits of Parallelizing Convolutional Neural
Networks on GPUs
- URL: http://arxiv.org/abs/2005.13823v1
- Date: Thu, 28 May 2020 07:51:22 GMT
- Title: Brief Announcement: On the Limits of Parallelizing Convolutional Neural
Networks on GPUs
- Authors: Behnam Pourghassemi (1), Chenghao Zhang (1), Joo Hwan Lee (2), Aparna
Chandramowlishwaran (1) ((1) University of California, Irvine, (2) Samsung
Semiconductor)
- Abstract summary: Training a deep neural network (DNN) is a time-consuming process even on GPUs because of the massive number of parameters that have to be learned.
We make a case for the need for, and potential benefit of, exploiting the rich inter-operation parallelism in state-of-the-art non-linear networks to reduce training time.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GPUs are currently the platform of choice for training neural networks.
However, training a deep neural network (DNN) is a time-consuming process even
on GPUs because of the massive number of parameters that have to be learned. As
a result, accelerating DNN training has been an area of significant research in
the last couple of years.
While earlier networks such as AlexNet had a linear dependency between layers
and operations, state-of-the-art networks such as ResNet, PathNet, and
GoogLeNet have a non-linear structure that exhibits a higher level of
inter-operation parallelism. However, popular deep learning (DL) frameworks
such as TensorFlow and PyTorch launch the majority of neural network
operations, especially convolutions, serially on GPUs and do not exploit this
inter-op parallelism. In this brief announcement, we make a case for the need
and potential benefit of exploiting this rich parallelism in state-of-the-art
non-linear networks for reducing the training time. We identify the challenges
and limitations in enabling concurrent layer execution on GPU backends (such as
cuDNN) of DL frameworks and propose potential solutions.
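To make the opportunity concrete, here is a minimal PyTorch sketch of the kind of inter-op parallelism the abstract describes: two independent convolution branches launched on separate CUDA streams so their cuDNN kernels can overlap, instead of the default serial launch order. The branch shapes, stream scheduling, and merge are our own illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: overlap two independent convolution branches on separate
# CUDA streams (illustrative; not the paper's system).
import torch
import torch.nn as nn

assert torch.cuda.is_available()
device = torch.device("cuda")

# Two independent branches, as in an Inception-style block (sizes assumed).
branch_a = nn.Conv2d(64, 64, kernel_size=3, padding=1).to(device)
branch_b = nn.Conv2d(64, 64, kernel_size=5, padding=2).to(device)
x = torch.randn(8, 64, 56, 56, device=device)

stream_a, stream_b = torch.cuda.Stream(), torch.cuda.Stream()
torch.cuda.synchronize()  # ensure x and weights are ready before forking

with torch.cuda.stream(stream_a):   # branch A on its own stream
    out_a = branch_a(x)
with torch.cuda.stream(stream_b):   # branch B on another stream, concurrently
    out_b = branch_b(x)

# Re-join: the default stream must wait for both branches before merging.
torch.cuda.current_stream().wait_stream(stream_a)
torch.cuda.current_stream().wait_stream(stream_b)
out = out_a + out_b
torch.cuda.synchronize()
```

Whether the two convolutions actually overlap depends on kernel occupancy and the cuDNN algorithms selected, which is exactly the kind of limitation the paper examines.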
Related papers
- Algebraic Representations for Faster Predictions in Convolutional Neural Networks [0.0]
Convolutional neural networks (CNNs) are a popular choice of model for tasks in computer vision.
Skip connections may be added to create an easier gradient optimization problem.
We show that arbitrarily complex, trained, linear CNNs with skip connections can be simplified into a single-layer model (a minimal numerical sketch of this folding idea appears after this list).
arXiv Detail & Related papers (2024-08-14T21:10:05Z)
- Spyx: A Library for Just-In-Time Compiled Optimization of Spiking Neural Networks [0.08965418284317034]
Spiking Neural Networks (SNNs) promise improved energy efficiency through a reduced, low-power hardware footprint.
This paper introduces Spyx, a new and lightweight SNN simulation and optimization library designed in JAX.
arXiv Detail & Related papers (2024-02-29T09:46:44Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Parareal Neural Networks Emulating a Parallel-in-time Algorithm [1.988145627448243]
As deep neural networks (DNNs) become deeper, the training time increases.
In this paper, we introduce a novel methodology to construct a parallel neural network.
arXiv Detail & Related papers (2021-03-16T02:03:39Z)
- ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs.
We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks (a generic binarization sketch appears after this list).
We show that through careful design of the models and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient, multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracies (a toy power-of-two weight sketch appears after this list).
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
- Wide and Deep Graph Neural Networks with Distributed Online Learning [175.96910854433574]
Graph neural networks (GNNs) learn representations from network data with naturally distributed architectures.
Online learning can be used to retrain GNNs at testing time, overcoming the mismatch between the graphs on which they were trained and those on which they are evaluated.
This paper proposes the Wide and Deep GNN (WD-GNN), a novel architecture that can be easily updated with distributed online learning mechanisms.
arXiv Detail & Related papers (2020-06-11T12:48:03Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address the open problems that remain, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
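For the Algebraic Representations entry above, here is a minimal numerical sketch of the single-layer folding idea under our own simplifying assumptions (one channel, stride 1, same padding, no nonlinearity); it is not the paper's general algorithm:

```python
# Fold a linear residual block  y = x + conv(x, k)  into a single convolution
# by adding an identity (Dirac) kernel to k.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 32, 32)   # N, C, H, W
k = torch.randn(1, 1, 3, 3)     # 3x3 kernel

y_residual = x + F.conv2d(x, k, padding=1)

identity = torch.zeros(1, 1, 3, 3)
identity[0, 0, 1, 1] = 1.0      # delta at the kernel center acts as identity
y_folded = F.conv2d(x, k + identity, padding=1)

print(torch.allclose(y_residual, y_folded, atol=1e-6))  # True
```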
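For the Binary Graph Neural Networks entry, a generic binarization sketch (sign in the forward pass, straight-through estimator in the backward pass); the paper evaluates several strategies, and this is only the common baseline pattern, not its specific scheme:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Binarize weights with a straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)  # values in {-1, +1} (0 only where w == 0)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Straight-through: pass gradients where |w| <= 1, block elsewhere.
        return grad_out * (w.abs() <= 1).to(grad_out.dtype)

w = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(w).sum().backward()
print(w.grad)  # 1.0 where |w| <= 1, else 0.0
```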
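And for the ShiftAddNet entry, a toy illustration of the multiplication-less idea: rounding weights to signed powers of two so that each multiply w * x can, in hardware, become a bit shift of x. The helper below is our own illustration, not the paper's method:

```python
import torch

def to_power_of_two(w: torch.Tensor, eps: float = 1e-8):
    """Round each weight to sign(w) * 2**round(log2|w|)."""
    exponent = torch.round(torch.log2(w.abs().clamp_min(eps)))
    return torch.sign(w) * torch.pow(2.0, exponent), exponent

w = torch.tensor([0.37, -1.9, 0.06])
w_q, exponent = to_power_of_two(w)
print(w_q)        # tensor([ 0.5000, -2.0000,  0.0625])
print(exponent)   # tensor([-1.,  1., -4.]) -> shift amounts; sign kept apart
```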
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.