Accelerating Neural Network Training with Distributed Asynchronous and
Selective Optimization (DASO)
- URL: http://arxiv.org/abs/2104.05588v2
- Date: Thu, 15 Apr 2021 09:37:04 GMT
- Title: Accelerating Neural Network Training with Distributed Asynchronous and
Selective Optimization (DASO)
- Authors: Daniel Coquelin, Charlotte Debus, Markus Götz, Fabrice von der Lehr,
James Kahn, Martin Siggel, and Achim Streit
- Abstract summary: We introduce the Distributed Asynchronous and Selective Optimization (DASO) method to accelerate network training.
DASO uses a hierarchical and asynchronous communication scheme comprised of node-local and global networks.
We show that DASO yields a reduction in training time of up to 34% on classical and state-of-the-art networks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With increasing data and model complexities, the time required to train
neural networks has become prohibitively large. To address the exponential rise
in training time, users are turning to data parallel neural networks (DPNN) to
utilize large-scale distributed resources on computer clusters. Current DPNN
approaches implement the network parameter updates by synchronizing and
averaging gradients across all processes with blocking communication
operations. This synchronization is the central algorithmic bottleneck. To
combat this, we introduce the Distributed Asynchronous and Selective
Optimization (DASO) method which leverages multi-GPU compute node architectures
to accelerate network training. DASO uses a hierarchical and asynchronous
communication scheme comprised of node-local and global networks while
adjusting the global synchronization rate during the learning process. We show
that DASO yields a reduction in training time of up to 34% on classical and
state-of-the-art networks, as compared to other existing data parallel training
methods.
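The abstract's hierarchical scheme can be illustrated with a small, self-contained simulation. This is a minimal sketch of the idea, not the authors' implementation: each node holds several GPU replicas of a (scalar) parameter, replicas on the same node are averaged after every step over fast intra-node links, and the expensive global average across nodes runs only every few steps. The names `daso_train` and `global_sync_every` are illustrative, not part of any published API.

```python
def average(values):
    """Stand-in for a blocking all-reduce: the mean of a list."""
    return sum(values) / len(values)

def daso_train(params, grads_per_step, lr=0.1, global_sync_every=4):
    """Simulate hierarchical synchronization.

    params: list of nodes, each node a list of per-GPU parameter values.
    grads_per_step: one entry per step, mirroring the shape of `params`
    with one gradient value per GPU.
    """
    for step, grads in enumerate(grads_per_step, start=1):
        # Local SGD update on every GPU replica.
        for n, node in enumerate(params):
            for g in range(len(node)):
                node[g] -= lr * grads[n][g]
        # Node-local synchronization every step (cheap intra-node links).
        for n, node in enumerate(params):
            local_avg = average(node)
            params[n] = [local_avg] * len(node)
        # Global synchronization only every few steps (slow inter-node links).
        if step % global_sync_every == 0:
            global_avg = average([node[0] for node in params])
            params = [[global_avg] * len(node) for node in params]
    return params
```

After any step that triggers the global phase, all replicas on all nodes hold the same value; between global steps, nodes are allowed to drift, which is what removes the per-step blocking all-reduce from the critical path.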
Related papers
- SpikePipe: Accelerated Training of Spiking Neural Networks via Inter-Layer Pipelining and Multiprocessor Scheduling [5.2831841848274985]
Training Spiking Neural Networks (SNNs) is computationally expensive compared to their conventional counterparts.
This is the first paper to propose inter-layer pipelining to accelerate training in SNNs using systolic array-based processors and multiprocessor scheduling.
arXiv Detail & Related papers (2024-06-11T01:43:45Z)
- Going Forward-Forward in Distributed Deep Learning [0.0]
We introduce a new approach in distributed deep learning, utilizing Geoffrey Hinton's Forward-Forward (FF) algorithm.
Unlike traditional methods that rely on forward and backward passes, the FF algorithm employs a dual forward pass strategy.
Our evaluation shows a 3.75 times speed up on MNIST dataset without compromising accuracy when training a four-layer network with four compute nodes.
arXiv Detail & Related papers (2024-03-30T16:02:53Z)
- Ravnest: Decentralized Asynchronous Training on Heterogeneous Devices [0.0]
Ravnest facilitates decentralized training by efficiently organizing compute nodes into clusters.
We have framed our asynchronous SGD loss function as a block structured optimization problem with delayed updates.
arXiv Detail & Related papers (2024-01-03T13:07:07Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Locally Asynchronous Stochastic Gradient Descent for Decentralised Deep Learning [0.0]
Local Asynchronous SGD (LASGD) is an asynchronous decentralized algorithm that relies on All Reduce for model synchronization.
We empirically validate LASGD's performance on image classification tasks on the ImageNet dataset.
arXiv Detail & Related papers (2022-03-24T14:25:15Z)
- Unsupervised Learning for Asynchronous Resource Allocation in Ad-hoc Wireless Networks [122.42812336946756]
We design an unsupervised learning method based on Aggregation Graph Neural Networks (Agg-GNNs)
We capture the asynchrony by modeling the activation pattern as a characteristic of each node and train a policy-based resource allocation method.
arXiv Detail & Related papers (2020-11-05T03:38:36Z)
- A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning [49.15799302636519]
We design a low complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers)
In our setup, the training data is distributed among the workers but is not shared in the training process due to privacy and security concerns.
We show that it is possible to achieve equivalent learning performance as if the data is available in a single place.
arXiv Detail & Related papers (2020-09-29T13:08:12Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study large-scale distributed stochastic AUC maximization for deep neural networks.
Our method provably requires far fewer communication rounds than naive parallel approaches.
Our experiments on several datasets demonstrate the method's effectiveness and confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.