GPU Acceleration of Sparse Neural Networks
- URL: http://arxiv.org/abs/2005.04347v1
- Date: Sat, 9 May 2020 02:18:31 GMT
- Title: GPU Acceleration of Sparse Neural Networks
- Authors: Aavaas Gajurel, Sushil J. Louis, Frederick C Harris
- Abstract summary: We show that we can gain significant speedup for full activation of sparse neural networks using graphical processing units.
Our results show that the activation of sparse neural networks lends very well to GPU acceleration and can help speed up machine learning strategies.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we use graphics processing units (GPUs) to accelerate sparse and arbitrarily structured neural networks. Sparse networks have nodes that are not fully connected to the nodes in the preceding and following layers, and arbitrarily structured neural networks have a different number of nodes in each layer. Sparse neural networks with arbitrary structures are commonly produced by processes such as neural network pruning and evolutionary machine learning strategies. We show that we can gain significant speedup for the full activation of such neural networks using graphics processing units. We perform a preprocessing step to determine dependency groups for all the nodes in a network and use that information to guide the progression of activation through the network. We then compute the activation of each node in its own separate GPU thread, which allows for massive parallelization. We use the CUDA framework to implement our approach and compare the results of the sequential and GPU implementations. Our results show that the activation of sparse neural networks lends itself very well to GPU acceleration and can help speed up machine learning strategies that generate such networks, as well as other processes with a similar structure.
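The abstract does not include code; the block below is a minimal CUDA sketch of the scheme it describes, not the authors' implementation. It assumes the incoming edges of every node are stored in CSR form, that the preprocessing step has already produced dependency groups in topological order (each node in a group depends only on nodes from earlier groups), and that a sigmoid activation is used; all names and layouts are illustrative.

```cuda
#include <cuda_runtime.h>
#include <math.h>

// One thread per node in the current dependency group. All nodes in a group
// have their inputs fully computed by earlier groups, so they can be
// activated in parallel. Incoming edges are stored CSR-style per node.
__global__ void activate_group(const int *group_nodes, int group_size,
                               const int *in_offsets,   // per-node offsets into in_edges/in_weights
                               const int *in_edges,     // source node index of each incoming edge
                               const float *in_weights, // weight of each incoming edge
                               const float *bias,
                               float *activation)       // activations of all nodes in the network
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= group_size) return;

    int node = group_nodes[i];
    float sum = bias[node];
    // Accumulate weighted inputs from the node's incoming edges.
    for (int e = in_offsets[node]; e < in_offsets[node + 1]; ++e)
        sum += in_weights[e] * activation[in_edges[e]];
    activation[node] = 1.0f / (1.0f + expf(-sum));  // assumed sigmoid activation
}

// Host side: launch one kernel per dependency group, in topological order.
void activate_network(int *const *d_groups, const int *group_sizes, int num_groups,
                      const int *d_in_offsets, const int *d_in_edges,
                      const float *d_in_weights, const float *d_bias,
                      float *d_activation)
{
    for (int g = 0; g < num_groups; ++g) {
        int threads = 256;
        int blocks = (group_sizes[g] + threads - 1) / threads;
        activate_group<<<blocks, threads>>>(d_groups[g], group_sizes[g],
                                            d_in_offsets, d_in_edges,
                                            d_in_weights, d_bias, d_activation);
    }
    cudaDeviceSynchronize();
}
```

Groups have to be launched one after another because a node can only be activated once all of its inputs are available; the parallel speedup comes from giving every node inside a group its own thread.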
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- DEFER: Distributed Edge Inference for Deep Neural Networks [5.672898304129217]
We present DEFER, a framework for distributed edge inference.
It partitions deep neural networks into layers that can be spread across multiple compute nodes.
We find that for the ResNet50 model, the inference throughput of DEFER with 8 compute nodes is 53% higher and per node energy consumption is 63% lower than single device inference.
arXiv Detail & Related papers (2022-01-18T06:50:45Z)
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality.
A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through gradient descent.
We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
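A rough, single-level sketch of the hash-encoding idea summarized above (an assumption, not the authors' implementation): the grid corners surrounding a query point are hashed into a table of trainable feature vectors, which are then trilinearly interpolated. Table size, hash primes, feature width, and the coordinate range [0, 1) are all assumed here.

```cuda
#include <cuda_runtime.h>

// Spatial hash from integer grid coordinates to a table slot.
// table_size is assumed to be a power of two; the primes are a common choice
// for spatial hashing.
__device__ unsigned int hash3(unsigned int x, unsigned int y, unsigned int z,
                              unsigned int table_size)
{
    return (x * 1u ^ y * 2654435761u ^ z * 805459861u) & (table_size - 1u);
}

// One thread per query point: encode a 3D position at one resolution level by
// hashing the 8 surrounding grid corners and trilinearly interpolating their
// trainable feature vectors.
__global__ void hash_encode_level(const float3 *points, int num_points,
                                  const float *features,    // [table_size * FEAT_DIM]
                                  unsigned int table_size,
                                  unsigned int resolution,  // grid resolution of this level
                                  float *out)               // [num_points * FEAT_DIM]
{
    const int FEAT_DIM = 2;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= num_points) return;

    float3 p = points[i];  // coordinates assumed in [0, 1)
    float fx = p.x * resolution, fy = p.y * resolution, fz = p.z * resolution;
    unsigned int x0 = (unsigned int)fx, y0 = (unsigned int)fy, z0 = (unsigned int)fz;
    float tx = fx - x0, ty = fy - y0, tz = fz - z0;

    for (int d = 0; d < FEAT_DIM; ++d) out[i * FEAT_DIM + d] = 0.0f;

    // Trilinear interpolation over the 8 corners of the enclosing grid cell.
    for (int c = 0; c < 8; ++c) {
        unsigned int cx = x0 + (c & 1), cy = y0 + ((c >> 1) & 1), cz = z0 + ((c >> 2) & 1);
        float w = ((c & 1) ? tx : 1.0f - tx) *
                  (((c >> 1) & 1) ? ty : 1.0f - ty) *
                  (((c >> 2) & 1) ? tz : 1.0f - tz);
        unsigned int slot = hash3(cx, cy, cz, table_size);
        for (int d = 0; d < FEAT_DIM; ++d)
            out[i * FEAT_DIM + d] += w * features[slot * FEAT_DIM + d];
    }
}
```

The concatenated outputs of several such levels at increasing resolutions would then be fed to the small neural network, and the feature tables are trained by backpropagating through this interpolation.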
arXiv Detail & Related papers (2022-01-16T07:22:47Z)
- Multirate Training of Neural Networks [0.0]
We show that for various transfer learning applications in vision and NLP we can fine-tune deep neural networks in almost half the time.
We propose an additional multirate technique which can learn different features present in the data by training the full network on different time scales simultaneously.
arXiv Detail & Related papers (2021-06-20T22:44:55Z)
- ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs.
We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z)
- SparseDNN: Fast Sparse Deep Learning Inference on CPUs [1.6244541005112747]
We present SparseDNN, a sparse deep learning inference engine targeting CPUs.
We show that our sparse code generator can achieve significant speedups over state-of-the-art sparse and dense libraries.
arXiv Detail & Related papers (2021-01-20T03:27:35Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same fixed path through the network for every input, DG-Net aggregates features dynamically in each node, which gives the network greater representational ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
- Brief Announcement: On the Limits of Parallelizing Convolutional Neural Networks on GPUs [0.45740558095423056]
Training a deep neural network (DNN) is a time-consuming process even on GPUs because of the massive number of parameters that have to be learned.
We make a case for the need and potential benefit of exploiting the rich parallelism available in state-of-the-art non-linear networks to reduce training time.
arXiv Detail & Related papers (2020-05-28T07:51:22Z)
- EdgeNets: Edge Varying Graph Neural Networks [179.99395949679547]
This paper puts forth a general framework that unifies state-of-the-art graph neural networks (GNNs) through the concept of EdgeNet.
An EdgeNet is a GNN architecture that allows different nodes to use different parameters to weigh the information of different neighbors.
This is a general linear and local operation that a node can perform, and it encompasses under one formulation all existing graph convolutional neural networks (GCNNs) as well as graph attention networks (GATs).
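As a minimal sketch of the edge-varying operation described in this entry (a simplification under assumed CSR storage and scalar per-edge weights, not the full EdgeNet formulation, which also mixes feature dimensions and stacks several such operations): each node aggregates its neighbors' features with a learnable weight specific to that particular edge, so different nodes can weigh different neighbors differently.

```cuda
#include <cuda_runtime.h>

// One thread per node. Node i aggregates its neighbors' feature vectors, and
// the weight applied to neighbor j is specific to the edge (i, j), which is
// the "edge varying" idea. Graph stored in CSR form.
__global__ void edge_varying_aggregate(const int *row_ptr,       // [num_nodes + 1]
                                       const int *col_idx,       // [num_edges] neighbor indices
                                       const float *edge_weight, // [num_edges] per-edge parameters
                                       const float *x,           // [num_nodes * feat_dim] input features
                                       float *y,                 // [num_nodes * feat_dim] output features
                                       int num_nodes, int feat_dim)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= num_nodes) return;

    for (int d = 0; d < feat_dim; ++d) {
        float acc = 0.0f;
        for (int e = row_ptr[i]; e < row_ptr[i + 1]; ++e)
            acc += edge_weight[e] * x[col_idx[e] * feat_dim + d];
        y[i * feat_dim + d] = acc;
    }
}
```

Constraining the per-edge weights to shared, adjacency-derived values recovers a plain graph-convolution aggregation, while computing them from the endpoint features recovers an attention-style weighting, which is roughly the sense in which the formulation is said to unify GCNNs and GATs.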
arXiv Detail & Related papers (2020-01-21T15:51:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.