Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for
Temporal Convolutional Networks
- URL: http://arxiv.org/abs/2203.14768v1
- Date: Mon, 28 Mar 2022 14:03:16 GMT
- Title: Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for
Temporal Convolutional Networks
- Authors: Matteo Risso, Alessio Burrello, Daniele Jahier Pagliari, Francesco
Conti, Lorenzo Lamberti, Enrico Macii, Luca Benini, Massimo Poncino
- Abstract summary: Temporal Convolutional Networks (TCNs) are promising Deep Learning models for time-series processing tasks.
We propose an automatic dilation optimizer, which casts the problem as weight pruning along the time axis and learns dilation factors together with the weights in a single training run.
- Score: 20.943095081056857
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Temporal Convolutional Networks (TCNs) are promising Deep Learning models for
time-series processing tasks. One key feature of TCNs is time-dilated
convolution, whose optimization requires extensive experimentation. We propose
an automatic dilation optimizer, which casts the problem as weight pruning along
the time axis and learns dilation factors together with the weights in a single
training run. Our method reduces the model size and inference latency on a real
SoC hardware target by up to 7.4x and 3x, respectively, with no accuracy
drop compared to a network without dilation. It also yields a rich set of
Pareto-optimal TCNs starting from a single model, outperforming hand-designed
solutions in both size and accuracy.
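As a rough illustration of the core idea, the sketch below masks the time taps of a dense 1D convolution with trainable gates, so that pruning gates amounts to selecting a dilation pattern. The gate parameterization, straight-through binarization, and L1 regularizer are illustrative assumptions, not PIT's exact formulation.

```python
# Dilation-as-pruning sketch: trainable per-tap gates on a dense kernel.
import torch
import torch.nn as nn

class PrunableDilationConv1d(nn.Module):
    def __init__(self, channels, max_kernel_size=9):
        super().__init__()
        # Dense kernel spanning the full receptive field.
        self.weight = nn.Parameter(torch.randn(channels, channels, max_kernel_size) * 0.1)
        # One trainable gate per time tap; zeroing a regular subset of
        # gates is equivalent to increasing the effective dilation.
        self.gate = nn.Parameter(torch.ones(max_kernel_size))

    def forward(self, x):
        # Hard 0/1 mask in the forward pass; gradients pass straight
        # through to the underlying gate values.
        hard = (self.gate > 0.5).float()
        mask = hard.detach() + self.gate - self.gate.detach()
        return nn.functional.conv1d(x, self.weight * mask, padding="same")

def tap_regularizer(layer, strength=1e-3):
    # L1 penalty pushing time taps toward zero, letting training decide
    # which taps (and hence which dilation factor) survive.
    return strength * layer.gate.abs().sum()
```

Training would minimize the task loss plus the tap regularizer; after training, the surviving taps define the effective dilation, and the kernel can be re-indexed into a genuinely dilated convolution.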
Related papers
- Accelerating Deep Neural Networks via Semi-Structured Activation
Sparsity [0.0]
Exploiting sparsity in a network's feature maps is one way to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
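The abstract does not spell out the sparsification mechanism, but one generic way to induce structured activation sparsity is sketched below: keep only the most active channels per sample so a runtime can skip whole channels. This is a hypothetical illustration, not the paper's specific semi-structured scheme.

```python
# Generic structured activation sparsity: zero out low-activity channels.
import torch
import torch.nn.functional as F

def channel_sparse_relu(x, keep_ratio=0.5):
    # x: (batch, channels, height, width). Keep only the most active
    # channels per sample; the zeroed channels form a structured
    # sparsity pattern a runtime could exploit.
    x = F.relu(x)
    scores = x.mean(dim=(2, 3))                      # (batch, channels)
    k = max(1, int(keep_ratio * x.shape[1]))
    top = scores.topk(k, dim=1).indices
    mask = torch.zeros_like(scores).scatter_(1, top, 1.0)
    return x * mask[:, :, None, None]
```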
arXiv Detail & Related papers (2023-09-12T22:28:53Z) - A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to the magnitude scale on-the-fly.
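A minimal sketch of that soft-shrinkage step, assuming a per-iteration schedule with a fixed pruned fraction and shrink rate (both assumptions, not the paper's exact hyperparameters):

```python
# Soft shrinkage: shrink the smallest-magnitude weights instead of
# hard-zeroing them, so the sparse structure stays re-optimizable.
import torch

@torch.no_grad()
def soft_shrink_step(weight, prune_fraction=0.3, shrink=0.1):
    flat = weight.abs().flatten()
    k = max(1, int(prune_fraction * flat.numel()))
    # Magnitude threshold below which weights count as unimportant.
    thresh = flat.kthvalue(k).values
    unimportant = weight.abs() <= thresh
    # Shrink proportionally to magnitude rather than zeroing outright.
    weight[unimportant] *= (1.0 - shrink)
```

Because weights are shrunk rather than zeroed, the sparse structure can be revisited at every iteration instead of being frozen early.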
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Fast as CHITA: Neural Network Pruning with Combinatorial Optimization [9.440450886684603]
We propose a novel optimization-based pruning framework that considers the combined effect of pruning (and updating) multiple weights subject to a sparsity constraint.
Our approach, CHITA, extends the classical Optimal Brain Surgeon framework and results in significant improvements in speed, memory, and performance.
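For context, the classical Optimal Brain Surgeon step that CHITA generalizes can be sketched as below, using a diagonal Hessian purely for readability; CHITA's actual contribution is a combinatorial solver that prunes and updates many weights jointly, which this sketch does not implement.

```python
# Classical OBS step: saliency s_q = w_q^2 / (2 [H^-1]_qq).
import numpy as np

def obs_prune_one(w, h_diag):
    h_inv = 1.0 / h_diag                         # [H^-1]_qq, diagonal assumption
    saliency = w**2 / (2.0 * h_inv)
    q = int(np.argmin(saliency))                 # cheapest weight to remove
    # OBS correction: delta_w = -(w_q / [H^-1]_qq) * H^-1 e_q. With a
    # diagonal Hessian this only zeroes w[q]; a full inverse Hessian
    # would also adjust the surviving weights.
    delta = -(w[q] / h_inv[q]) * h_inv * (np.arange(w.size) == q)
    return w + delta, q

w = np.array([0.5, -0.1, 0.8, 0.05])
h = np.array([2.0, 1.0, 0.5, 4.0])               # diagonal Hessian estimate
w_pruned, removed = obs_prune_one(w, h)
```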
arXiv Detail & Related papers (2023-02-28T15:03:18Z) - Teal: Learning-Accelerated Optimization of WAN Traffic Engineering [68.7863363109948]
We present Teal, a learning-based TE algorithm that leverages the parallel processing power of GPUs to accelerate TE control.
To reduce the problem scale and make learning tractable, Teal employs a multi-agent reinforcement learning (RL) algorithm to independently allocate each traffic demand.
Compared with other TE acceleration schemes, Teal satisfies 6--32% more traffic demand and yields 197--625x speedups.
arXiv Detail & Related papers (2022-10-25T04:46:30Z) - Joint inference and input optimization in equilibrium networks [68.63726855991052]
The deep equilibrium model is a class of models that forgoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer.
We show that there is a natural synergy between these two settings.
We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
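A minimal sketch of the fixed-point view, with a toy layer and plain iteration standing in for the faster root-finding solvers real DEQs use:

```python
# Deep-equilibrium sketch: output = fixed point z* = f(z*, x) of one layer.
import torch
import torch.nn as nn

class TinyDEQ(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin_z = nn.Linear(dim, dim)
        self.lin_x = nn.Linear(dim, dim)

    def f(self, z, x):
        return torch.tanh(self.lin_z(z) + self.lin_x(x))

    def forward(self, x, iters=50, tol=1e-4):
        z = torch.zeros_like(x)
        for _ in range(iters):          # plain fixed-point iteration;
            z_next = self.f(z, x)       # real DEQs use Broyden/Anderson solvers
            if (z_next - z).norm() < tol:
                break
            z = z_next
        return z

model = TinyDEQ(dim=32)
z_star = model(torch.randn(4, 32))      # output is the layer's equilibrium
```

Input optimization then treats x itself as a variable and differentiates through the same fixed point, which is the synergy the paper exploits.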
arXiv Detail & Related papers (2021-11-25T19:59:33Z) - LCS: Learning Compressible Subspaces for Adaptive Network Compression at
Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
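A rough sketch of the subspace idea, assuming a two-endpoint linear parameterization with magnitude pruning tied to the interpolation coefficient (details the abstract does not specify):

```python
# Compressible-subspace sketch: any point on a line between two weight
# endpoints, sparsified at a matching level, yields a usable model.
import torch

def subspace_weights(w_a, w_b, alpha):
    # Linear interpolation between the two learned endpoints.
    return (1.0 - alpha) * w_a + alpha * w_b

def sparsify(w, sparsity):
    # Unstructured magnitude pruning at the requested sparsity level.
    k = int(sparsity * w.numel())
    if k == 0:
        return w
    thresh = w.abs().flatten().kthvalue(k).values
    return w * (w.abs() > thresh)

# Training samples alpha ~ U(0, 1) per step; inference picks alpha
# (and thus sparsity) to meet the deployment budget.
w_a, w_b = torch.randn(64, 64), torch.randn(64, 64)
alpha = torch.rand(()).item()
w = sparsify(subspace_weights(w_a, w_b, alpha), sparsity=alpha)
```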
arXiv Detail & Related papers (2021-10-08T17:03:34Z) - Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments.
In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network.
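The N:M pattern itself is easy to state in code: in every group of M consecutive weights, keep only the N of largest magnitude. The projection below illustrates the pattern (for 2:4), not the paper's training-from-scratch recipe.

```python
# Project a weight tensor onto an N:M sparsity pattern (here 2:4).
import torch

def project_nm(weight, n=2, m=4):
    flat = weight.reshape(-1, m)                     # groups of M weights
    idx = flat.abs().topk(n, dim=1).indices          # N largest per group
    mask = torch.zeros_like(flat).scatter_(1, idx, 1.0)
    return (flat * mask).reshape(weight.shape)

w = torch.randn(8, 16)          # numel divisible by m is assumed
w_sparse = project_nm(w)        # exactly 2 nonzeros in every 4 weights
```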
arXiv Detail & Related papers (2021-02-08T05:55:47Z) - How Not to Give a FLOP: Combining Regularization and Pruning for
Efficient Inference [0.0]
In this paper, we examine the use of both regularization and pruning for reduced computational complexity and more efficient inference in Deep Neural Networks (DNNs).
By using regularization in conjunction with network pruning, we show that such a combination makes a substantial improvement over each of the two techniques individually.
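A minimal sketch of that combination, with the penalty strength and pruning threshold as illustrative assumptions:

```python
# Regularize-then-prune: an L1 penalty drives weights toward zero during
# training; magnitude pruning then removes them.
import torch
import torch.nn as nn

def l1_penalty(model, strength=1e-4):
    # Added to the task loss at every training step.
    return strength * sum(p.abs().sum() for p in model.parameters())

@torch.no_grad()
def magnitude_prune(model, threshold=1e-3):
    # After convergence, zero the weights the penalty pushed near zero.
    for p in model.parameters():
        p *= (p.abs() > threshold).float()

model = nn.Linear(16, 4)
# During training: loss = task_loss + l1_penalty(model)
# After convergence: magnitude_prune(model)
```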
arXiv Detail & Related papers (2020-03-30T16:20:46Z) - Toward fast and accurate human pose estimation via soft-gated skip
connections [97.06882200076096]
This paper is on highly accurate and highly efficient human pose estimation.
We re-analyze the design of skip connections in the context of improving both accuracy and efficiency over the state of the art.
Our model achieves state-of-the-art results on the MPII and LSP datasets.
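The abstract does not describe the gating mechanism, but a soft-gated skip connection can be sketched generically as a residual branch scaled by a learnable per-channel gate; this is an assumption based on the title, not the paper's stated architecture.

```python
# Generic soft-gated residual block: the skip is modulated by a gate.
import torch
import torch.nn as nn

class SoftGatedResidual(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # One learnable gate per channel, initialized to pass-through.
        self.alpha = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x):
        return x + self.alpha * self.body(x)
```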
arXiv Detail & Related papers (2020-02-25T18:51:51Z)