Learned Threshold Pruning
- URL: http://arxiv.org/abs/2003.00075v2
- Date: Fri, 19 Mar 2021 02:36:29 GMT
- Title: Learned Threshold Pruning
- Authors: Kambiz Azarian, Yash Bhalgat, Jinwon Lee and Tijmen Blankevoort
- Abstract summary: Our method learns per-layer thresholds via gradient descent, unlike conventional methods where they are set as input.
It takes $30$ epochs for LTP to prune ResNet50 on ImageNet by a factor of $9.1$.
We also show that LTP effectively prunes modern compact architectures such as EfficientNet, MobileNetV2 and MixNet.
- Score: 15.394473766381518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel differentiable method for unstructured weight
pruning of deep neural networks. Our learned-threshold pruning (LTP) method
learns per-layer thresholds via gradient descent, unlike conventional methods
where they are set as input. Making thresholds trainable also makes LTP
computationally efficient, hence scalable to deeper networks. For example, it
takes $30$ epochs for LTP to prune ResNet50 on ImageNet by a factor of $9.1$.
This is in contrast to other methods that search for per-layer thresholds via a
computationally intensive iterative pruning and fine-tuning process.
Additionally, with a novel differentiable $L_0$ regularization, LTP is able to
operate effectively on architectures with batch-normalization. This is
important since $L_1$ and $L_2$ penalties lose their regularizing effect in
networks with batch-normalization. Finally, LTP generates a trail of
progressively sparser networks from which the desired pruned network can be
picked based on sparsity and performance requirements. These features allow LTP
to achieve competitive compression rates on ImageNet networks such as AlexNet
($26.4\times$ compression with $79.1\%$ Top-5 accuracy) and ResNet50
($9.1\times$ compression with $92.0\%$ Top-5 accuracy). We also show that LTP
effectively prunes modern \textit{compact} architectures, such as EfficientNet,
MobileNetV2 and MixNet.
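The mechanism described in the abstract admits a compact sketch: each layer keeps a learnable threshold on squared weight magnitude, a sigmoid-relaxed mask softly zeroes weights below it, and the sum of that mask serves as a differentiable $L_0$ surrogate added to the task loss. The PyTorch sketch below is a minimal illustration under those assumptions; the class name, the `temperature`, the initialization, and the $10^{-5}$ penalty weight are illustrative choices of ours, not the paper's exact formulation.

```python
# Minimal sketch of learned-threshold soft pruning for a single layer
# (illustrative, not the paper's exact formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftThresholdLinear(nn.Module):
    """Linear layer whose weights are softly pruned by a learnable per-layer threshold."""

    def __init__(self, in_features, out_features, temperature=1e-3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Per-layer threshold on squared weight magnitude, trained by gradient descent.
        self.log_threshold = nn.Parameter(torch.tensor(-9.0))
        self.temperature = temperature

    def soft_mask(self):
        # Sigmoid relaxation: ~1 for weights well above the threshold, ~0 well below it.
        threshold = self.log_threshold.exp()
        return torch.sigmoid((self.weight ** 2 - threshold) / self.temperature)

    def l0_surrogate(self):
        # Differentiable soft count of surviving weights; gradients reach
        # both the weights and the per-layer threshold.
        return self.soft_mask().sum()

    def forward(self, x):
        return F.linear(x, self.weight * self.soft_mask(), self.bias)


# Usage: add the soft L0 count of every pruned layer to the task loss.
layer = SoftThresholdLinear(256, 128)
x = torch.randn(32, 256)
out = layer(x)
loss = out.pow(2).mean() + 1e-5 * layer.l0_surrogate()  # dummy task loss + sparsity penalty
loss.backward()  # gradients flow into weights and threshold jointly
```

In this sketch, checkpoints saved during training yield progressively sparser networks, matching the trail described in the abstract; at the end, weights whose soft mask is near zero can be hard-pruned.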
Related papers
- End-to-End Neural Network Compression via $\frac{\ell_1}{\ell_2}$ Regularized Latency Surrogates [20.31383698391339]
Our algorithm is versatile and can be used with many popular compression methods including pruning, low-rank factorization, and quantization.
It is fast and runs in almost the same amount of time as single model training.
arXiv Detail & Related papers (2023-06-09T09:57:17Z)
- Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net) that novelly expands the network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
- Trainability Preserving Neural Structured Pruning [64.65659982877891]
We present trainability preserving pruning (TPP), a regularization-based structured pruning method that can effectively maintain trainability during sparsification.
TPP can compete with the ground-truth dynamical isometry recovery method on linear networks.
It delivers encouraging performance in comparison to many top-performing filter pruning methods.
arXiv Detail & Related papers (2022-07-25T21:15:47Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We present a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs of diverse difficulty levels.
We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++) by input-dependently adjusting filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Dep-$L_0$: Improving $L_0$-based Network Sparsification via Dependency Modeling [6.081082481356211]
Training deep neural networks with an $L_0$ regularization is one of the prominent approaches for network pruning or sparsification.
We show that this method performs inconsistently on large-scale learning tasks, such as ResNet50 on ImageNet.
We propose to model the dependencies among binary gates, which can be captured effectively by a multi-layer perceptron (a minimal gate sketch follows the list below).
arXiv Detail & Related papers (2021-06-30T19:33:35Z)
- Network Pruning via Resource Reallocation [75.85066435085595]
We propose a simple yet effective channel pruning technique, termed network Pruning via rEsource rEalLocation (PEEL).
PEEL first constructs a predefined backbone and then conducts resource reallocation on it to shift parameters from less informative layers to more important layers in one round.
Experimental results show that structures uncovered by PEEL exhibit competitive performance with state-of-the-art pruning algorithms under various pruning settings.
arXiv Detail & Related papers (2021-03-02T16:28:10Z)
- Network Automatic Pruning: Start NAP and Take a Nap [94.14675930881366]
We propose NAP, a unified and automatic pruning framework for both fine-grained and structured pruning.
It can find out unimportant components of a network and automatically decide appropriate compression ratios for different layers.
Despite its simplicity of use, NAP outperforms previous pruning methods by large margins.
arXiv Detail & Related papers (2021-01-17T07:09:19Z)
- Single-path Bit Sharing for Automatic Loss-aware Model Compression [126.98903867768732]
Single-path Bit Sharing (SBS) is able to significantly reduce computational cost while achieving promising performance.
Our SBS compressed MobileNetV2 achieves 22.6x Bit-Operation (BOP) reduction with only 0.1% drop in the Top-1 accuracy.
arXiv Detail & Related papers (2021-01-13T08:28:21Z)
- Knapsack Pruning with Inner Distillation [11.04321604965426]
We propose a novel pruning method that optimizes the final accuracy of the pruned network.
We prune the network channels while maintaining the high-level structure of the network.
Our method leads to state-of-the-art pruning results on ImageNet, CIFAR-10 and CIFAR-100 using ResNet backbones.
arXiv Detail & Related papers (2020-02-19T16:04:48Z)
- Activation Density driven Energy-Efficient Pruning in Training [2.222917681321253]
We propose a novel pruning method that prunes a network in real time during training.
We obtain exceedingly sparse networks with accuracy comparable to the baseline network.
arXiv Detail & Related papers (2020-02-07T18:34:31Z)
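For context on the $L_0$-gate approach that the Dep-$L_0$ entry above builds on, a minimal gate in the hard-concrete style can be sketched as follows. The constants ($\beta$, $\gamma$, $\zeta$) follow the commonly used hard-concrete parameterization, the module name is illustrative, and Dep-$L_0$'s dependency-modeling MLP is omitted; this is a sketch, not that paper's implementation.

```python
# Minimal hard-concrete-style L0 gate (illustrative sketch).
import math
import torch
import torch.nn as nn


class L0Gate(nn.Module):
    """Stochastic gates with a differentiable expected-L0 penalty."""

    def __init__(self, num_gates, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(num_gates))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # Sample a stretched, clamped concrete variable per gate.
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / self.beta)
        else:
            s = torch.sigmoid(self.log_alpha)
        return (s * (self.zeta - self.gamma) + self.gamma).clamp(0.0, 1.0)

    def expected_l0(self):
        # Probability that each gate is non-zero; its sum is the L0 penalty term.
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()


gates = L0Gate(num_gates=64)     # e.g. one gate per channel
z = gates()                      # multiply channel outputs by z to sparsify them
penalty = gates.expected_l0()    # add a weighted penalty to the task loss
```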