Partitioning sparse deep neural networks for scalable training and
inference
- URL: http://arxiv.org/abs/2104.11805v1
- Date: Fri, 23 Apr 2021 20:05:52 GMT
- Title: Partitioning sparse deep neural networks for scalable training and
inference
- Authors: Gunduz Vehbi Demirci, Hakan Ferhatosmanoglu
- Abstract summary: State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements.
Sparsification and pruning methods are shown to be effective in removing a large fraction of connections in DNNs.
The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning.
- Score: 8.282177703075453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art deep neural networks (DNNs) have significant
computational and data management requirements. The size of both training data
and models continues to increase. Sparsification and pruning methods are shown
to be effective in removing a large fraction of connections in DNNs. The
resulting sparse networks present unique challenges to further improve the
computational efficiency of training and inference in deep learning. Both the
feedforward (inference) and backpropagation steps in the stochastic gradient
descent (SGD) algorithm for training sparse DNNs involve consecutive sparse
matrix-vector multiplications (SpMVs). We first introduce a distributed-memory
parallel SpMV-based solution for the SGD algorithm to improve its scalability.
The parallelization approach is based on row-wise partitioning of weight
matrices that represent neuron connections between consecutive layers. We then
propose a novel hypergraph model for partitioning weight matrices to reduce the
total communication volume and ensure computational load-balance among
processors. Experiments performed on sparse DNNs demonstrate that the proposed
solution is highly efficient and scalable. By utilizing the proposed matrix
partitioning scheme, the performance of our solution is further improved
significantly.
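The row-wise partitioning idea in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the function names (`block_partition`, `spmv_rows`, `feedforward_layer`, `comm_volume`), the ReLU activation, the toy matrix, and the assumption that input entry j is owned by a known processor are all hypothetical choices made for illustration. The sketch simulates each processor computing only its own rows of a sparse weight matrix, and counts how many input-vector entries would have to be received from other processors, which is the kind of communication volume a partitioning model aims to reduce. The paper's hypergraph model itself is not reproduced here.

```python
# Sketch only: simulates row-wise partitioned sparse feedforward in one process.
# All names, the ReLU activation, and the toy data are illustrative assumptions.

def block_partition(n_rows, n_procs):
    """Assign row i to a processor using contiguous, near-equal blocks."""
    return [i * n_procs // n_rows for i in range(n_rows)]

def spmv_rows(W, x, rows):
    """Compute (W x)[i] for the given rows; W is a list of {column: value} dicts."""
    return {i: sum(v * x[j] for j, v in W[i].items()) for i in rows}

def feedforward_layer(W, x, row_owner, n_procs):
    """Each simulated processor computes ReLU(W x) for the rows it owns."""
    y = [0.0] * len(W)
    for p in range(n_procs):
        local_rows = [i for i, owner in enumerate(row_owner) if owner == p]
        for i, value in spmv_rows(W, x, local_rows).items():
            y[i] = max(0.0, value)
    return y

def comm_volume(W, row_owner, x_owner):
    """Count distinct (receiver, x index) pairs where a processor needs an
    x entry that is owned by a different processor."""
    needed = set()
    for i, row in enumerate(W):
        for j in row:
            if x_owner[j] != row_owner[i]:
                needed.add((row_owner[i], j))
    return len(needed)

# Toy 4x4 sparse weight matrix and input activations (made-up values).
W = [
    {0: 1.0, 1: 2.0},
    {1: -1.0},
    {2: 3.0, 3: 1.0},
    {0: 1.0, 3: -2.0},
]
x = [1.0, 2.0, 3.0, 4.0]

blocked = block_partition(len(W), 2)   # [0, 0, 1, 1]
interleaved = [0, 1, 0, 1]

y_blocked = feedforward_layer(W, x, blocked, 2)
y_interleaved = feedforward_layer(W, x, interleaved, 2)

# The result does not depend on the partition; only the communication does.
vol_blocked = comm_volume(W, blocked, blocked)
vol_interleaved = comm_volume(W, interleaved, interleaved)
print(y_blocked, vol_blocked, vol_interleaved)
```

In this toy case both partitions give each processor two rows, yet the blocked assignment needs one remote input entry while the interleaved one needs three. A real distributed-memory version would exchange those entries with message passing rather than reading a shared list.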
Related papers
- Event-based backpropagation on the neuromorphic platform SpiNNaker2 [1.0597501054401728]
EventProp is an algorithm for event-based backpropagation in spiking neural networks (SNNs).
Our implementation computes multi-layer networks of leaky integrate-and-fire neurons using discretized versions of the differential equations and their adjoints.
We demonstrate a proof-of-concept of batch-parallelized, on-chip training of SNNs using the Yin Yang dataset.
arXiv Detail & Related papers (2024-12-19T16:31:42Z)
- GDSG: Graph Diffusion-based Solution Generator for Optimization Problems in MEC Networks [109.17835015018532]
We present a Graph Diffusion-based Solution Generation (GDSG) method.
This approach is designed to work with suboptimal datasets while converging to the optimal solution with high probability.
We build GDSG as a multi-task diffusion model utilizing a Graph Neural Network (GNN) to acquire the distribution of high-quality solutions.
arXiv Detail & Related papers (2024-12-11T11:13:43Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- A Low-Complexity Approach to Rate-Distortion Optimized Variable Bit-Rate
Compression for Split DNN Computing [5.3221129103999125]
Split computing has emerged as a recent paradigm for implementation of DNN-based AI workloads.
We present an approach that addresses the challenge of optimizing the rate-accuracy-complexity trade-off.
Our approach is remarkably lightweight, both during training and inference, highly effective and achieves excellent rate-distortion performance.
arXiv Detail & Related papers (2022-08-24T15:02:11Z)
- Efficient and Robust Mixed-Integer Optimization Methods for Training
Binarized Deep Neural Networks [0.07614628596146598]
We study deep neural networks with binary activation functions and continuous or integer weights (BDNN).
We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space which can be solved to global optimality by classical mixed-integer programming solvers.
For the first time a robust model is presented which enforces robustness of the BDNN during training.
arXiv Detail & Related papers (2021-10-21T18:02:58Z)
- Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed large-scale AUC maximization with a deep neural network.
Our approach requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our approach and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.