DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep
Learning
- URL: http://arxiv.org/abs/2102.03112v1
- Date: Fri, 5 Feb 2021 11:31:24 GMT
- Title: DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep
Learning
- Authors: Kelly Kostopoulou, Hang Xu, Aritra Dutta, Xin Li, Alexandros Ntoulas,
Panos Kalnis
- Abstract summary: This paper introduces DeepReduce, a versatile framework for the compressed communication of sparse tensors.
DeepReduce decomposes tensors into two sets, values and indices, and allows both independent and combined compression of these sets.
Our experiments with large real models demonstrate that DeepReduce transmits less data and imposes lower computational overhead than existing methods.
- Score: 79.89085533866071
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sparse tensors appear frequently in distributed deep learning, either as a
direct artifact of the deep neural network's gradients, or as a result of an
explicit sparsification process. Existing communication primitives are agnostic
to the peculiarities of deep learning; consequently, they impose unnecessary
communication overhead. This paper introduces DeepReduce, a versatile framework
for the compressed communication of sparse tensors, tailored for distributed
deep learning. DeepReduce decomposes sparse tensors into two sets, values and
indices, and allows both independent and combined compression of these sets. We
support a variety of common compressors, such as Deflate for values, or
run-length encoding for indices. We also propose two novel compression schemes
that achieve superior results: curve fitting-based for values and bloom
filter-based for indices. DeepReduce is orthogonal to existing gradient
sparsifiers and can be applied in conjunction with them, transparently to the
end-user, to significantly lower the communication overhead. As proof of
concept, we implement our approach on TensorFlow and PyTorch. Our experiments
with large real models demonstrate that DeepReduce transmits less data and
imposes lower computational overhead than existing methods, without affecting
the training accuracy.
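The decomposition described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it only shows the idea of splitting a sparsified gradient into an index set and a value set and compressing them independently, using Deflate (zlib) for the values and a simple delta encoding plus Deflate for the indices as a stand-in for the run-length index coding mentioned above. All function names and the sparsification threshold are illustrative assumptions.

```python
# Minimal sketch (not the DeepReduce implementation) of sending a sparse
# gradient as separately compressed (indices, values) sets.
import zlib
import numpy as np

def decompose(flat_grad, threshold=0.0):
    """Split a (sparsified) flat gradient into index and value sets."""
    idx = np.flatnonzero(np.abs(flat_grad) > threshold).astype(np.int64)
    val = flat_grad[idx].astype(np.float32)
    return idx, val

def compress_values(val):
    # Deflate-compress the raw float32 bytes of the value set.
    return zlib.compress(val.tobytes())

def compress_indices(idx):
    # Delta-encode the sorted indices, then Deflate the (mostly small) gaps;
    # a simple stand-in for run-length style index coding.
    gaps = np.diff(idx, prepend=0).astype(np.int64)
    return zlib.compress(gaps.tobytes())

def decompress(idx_blob, val_blob, size):
    # Rebuild the dense tensor on the receiving side.
    gaps = np.frombuffer(zlib.decompress(idx_blob), dtype=np.int64)
    idx = np.cumsum(gaps)
    val = np.frombuffer(zlib.decompress(val_blob), dtype=np.float32)
    out = np.zeros(size, dtype=np.float32)
    out[idx] = val
    return out

# Usage sketch: sparsify a gradient, ship the two compressed sets,
# and reconstruct the dense tensor at the receiver.
grad = np.random.randn(1_000_000).astype(np.float32)
grad[np.abs(grad) < 2.5] = 0.0          # explicit sparsification step
idx, val = decompose(grad)
idx_blob, val_blob = compress_indices(idx), compress_values(val)
restored = decompress(idx_blob, val_blob, grad.size)
assert np.allclose(restored, grad)
```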
Related papers
- Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression.
For ResNet18 with ImageNet2012, the reduced model achieves more than a twofold speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function (a numerical check appears in the sketch after this list).
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- Unfolding Neural Networks for Compressive Multichannel Blind Deconvolution [71.29848468762789]
We propose a learned-structured unfolding neural network for the problem of compressive sparse multichannel blind-deconvolution.
In this problem, each channel's measurements are given as the convolution of a common source signal with a sparse filter.
We demonstrate that our method is superior to classical structured compressive sparse multichannel blind-deconvolution methods in terms of accuracy and speed of sparse filter recovery.
arXiv Detail & Related papers (2020-10-22T02:34:33Z)
- A Partial Regularization Method for Network Compression [0.0]
We propose partial regularization, rather than the original form that penalizes all parameters (full regularization), to conduct model compression at a higher speed.
Experimental results show that, as expected, the computational complexity is reduced, with lower running time observed in almost all situations.
Surprisingly, it also improves important metrics such as regression fitting results and classification accuracy in both the training and test phases on multiple datasets.
arXiv Detail & Related papers (2020-09-03T00:38:27Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relationship between a network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
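The observation in the "Permute, Quantize, and Fine-tune" entry above can be checked numerically. The snippet below is a minimal sketch, not that paper's permutation search or quantizer: it only verifies, for an assumed two-layer ReLU network with illustrative sizes, that permuting the hidden units of one layer together with the matching columns of the next layer leaves the output unchanged, which is what makes the weight pair free to reorganize before compression.

```python
# Sketch: permutation invariance of adjacent layers (illustration only).
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((64, 32))   # layer 1: 32 inputs -> 64 hidden units
W2 = rng.standard_normal((16, 64))   # layer 2: 64 hidden units -> 16 outputs
relu = lambda z: np.maximum(z, 0.0)

def net(x, A, B):
    # Two-layer network with an elementwise nonlinearity.
    return B @ relu(A @ x)

perm = rng.permutation(64)           # reorder the hidden units
x = rng.standard_normal(32)

# Permuting the rows of W1 and the matching columns of W2 leaves the
# function unchanged, so the pair can be reorganized before quantization.
assert np.allclose(net(x, W1, W2), net(x, W1[perm], W2[:, perm]))
```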
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.