A "Network Pruning Network" Approach to Deep Model Compression
- URL: http://arxiv.org/abs/2001.05545v1
- Date: Wed, 15 Jan 2020 20:38:23 GMT
- Title: A "Network Pruning Network" Approach to Deep Model Compression
- Authors: Vinay Kumar Verma, Pravendra Singh, Vinay P. Namboodiri, Piyush Rai
- Abstract summary: We present a filter pruning approach for deep model compression using a multitask network.
Our approach is based on learning a pruner network to prune a pre-trained target network.
The compressed model produced by our approach is generic and does not need any special hardware/software support.
- Score: 62.68120664998911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a filter pruning approach for deep model compression, using a
multitask network. Our approach is based on learning a pruner network to
prune a pre-trained target network. The pruner is essentially a multitask deep
neural network with binary outputs that help identify the filters from each
layer of the original network that do not have any significant contribution to
the model and can therefore be pruned. The pruner network has the same
architecture as the original network except that it has a
multitask/multi-output last layer containing binary-valued outputs (one per
filter), which indicate which filters have to be pruned. The pruner's goal is
to minimize the number of filters from the original network by assigning zero
weights to the corresponding output feature-maps. In contrast to most existing
methods, our approach does not rely on iterative pruning: it prunes the original
network in one go and, moreover, does not require specifying the degree of
pruning for each layer (it can learn this instead). The
compressed model produced by our approach is generic and does not need any
special hardware/software support. Moreover, augmenting with other methods such
as knowledge distillation, quantization, and connection pruning can increase
the degree of compression for the proposed approach. We show the efficacy of
our proposed approach for classification and object detection tasks.
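The pruner described in the abstract can be viewed as a set of gates, one per filter of the target network, trained so that as many gates as possible go to zero. Below is a minimal PyTorch sketch of that gating idea; the names PrunerHead, sparsity_loss, apply_gate, and the hyperparameter sparsity_weight are illustrative assumptions rather than the authors' implementation (in the paper the pruner shares the target network's architecture up to its multitask last layer).

```python
# Minimal sketch of a multitask pruner head with per-filter gates (assumed names).
import torch
import torch.nn as nn

class PrunerHead(nn.Module):
    """Multitask/multi-output last layer: one output head per target layer,
    one (approximately) binary gate per filter of that layer."""
    def __init__(self, feature_dim, filters_per_layer):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(feature_dim, n) for n in filters_per_layer]
        )

    def forward(self, features):
        # Sigmoid gives soft gates in (0, 1); at pruning time they can be
        # binarized with a threshold or a straight-through estimator.
        return [torch.sigmoid(head(features)) for head in self.heads]

def sparsity_loss(gates, sparsity_weight=1e-3):
    # Push gates toward zero, i.e. minimize the number of filters that are kept.
    return sparsity_weight * sum(g.mean() for g in gates)

def apply_gate(feature_map, gate):
    # Zero out the output feature maps of pruned filters.
    # feature_map: (batch, C, H, W), gate: (batch, C)
    return feature_map * gate.unsqueeze(-1).unsqueeze(-1)
```

During training, the task loss of the masked target network plus sparsity_loss would drive gates toward zero; filters whose gates end at zero can then be removed outright, which is why the compressed model is generic and needs no special hardware/software support.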
Related papers
- Group channel pruning and spatial attention distilling for object
detection [2.8675002818821542]
We introduce a three-stage model compression method: dynamic sparse training, group channel pruning, and spatial attention distilling.
Our method reduces the model's parameters by 64.7% and its computation by 34.9%.
arXiv Detail & Related papers (2023-06-02T13:26:23Z)
- Convolutional Neural Network Pruning with Structural Redundancy Reduction [11.381864384054824]
We claim that identifying structural redundancy plays a more essential role than finding unimportant filters.
We propose a network pruning approach that identifies structural redundancy of a CNN and prunes filters in the selected layer(s) with the most redundancy.
arXiv Detail & Related papers (2021-04-08T00:16:24Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function (a small numerical check of this appears after this list).
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
- SCOP: Scientific Control for Reliable Neural Network Pruning [127.20073865874636]
This paper proposes a reliable neural network pruning algorithm by setting up a scientific control.
Redundant filters can be discovered in the adversarial process of different features.
Our method can reduce the parameters of ResNet-101 by 57.8% and its FLOPs by 60.2% with only a 0.01% top-1 accuracy loss on ImageNet.
arXiv Detail & Related papers (2020-10-21T03:02:01Z)
- Comprehensive Online Network Pruning via Learnable Scaling Factors [3.274290296343038]
Deep CNNs can either be pruned width-wise by removing filters based on their importance or depth-wise by removing layers and blocks.
We propose a comprehensive pruning strategy that can perform both width-wise as well as depth-wise pruning.
arXiv Detail & Related papers (2020-10-06T11:04:17Z)
- MTP: Multi-Task Pruning for Efficient Semantic Segmentation Networks [32.84644563020912]
We present a multi-task channel pruning approach for semantic segmentation networks.
The importance of each convolution filter with respect to the channels of an arbitrary layer is determined simultaneously by the classification and segmentation tasks.
Experimental results on several benchmarks illustrate the superiority of the proposed algorithm over the state-of-the-art pruning methods.
arXiv Detail & Related papers (2020-07-16T15:03:01Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression [145.04742985050808]
We analyze two popular network compression techniques, filter pruning and low-rank decomposition, from a unified perspective.
By changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly.
Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
arXiv Detail & Related papers (2020-03-19T17:57:26Z)
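The observation in "Permute, Quantize, and Fine-tune" above can be checked numerically: reordering the output channels of one layer and the matching input channels of the next leaves the composed function unchanged. The NumPy sketch below uses fully connected layers as a stand-in for convolutions; the shapes and random seed are arbitrary example values.

```python
# Numerical check: permuting the channel dimension shared by two adjacent
# layers preserves the function they compute.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))      # batch of 4 inputs with 8 features
W1 = rng.standard_normal((16, 8))    # layer 1: 8 -> 16 hidden channels
W2 = rng.standard_normal((5, 16))    # layer 2: 16 -> 5 outputs

perm = rng.permutation(16)           # permutation of the 16 hidden channels
W1_p = W1[perm, :]                   # reorder layer-1 output channels
W2_p = W2[:, perm]                   # reorder layer-2 input channels to match

y = x @ W1.T @ W2.T                  # original two-layer map
y_p = x @ W1_p.T @ W2_p.T            # permuted weights, same map

assert np.allclose(y, y_p)
```

An elementwise nonlinearity such as ReLU between the two layers does not change this, since permutation commutes with elementwise operations; the paper uses exactly this freedom to search for permutations that make the weights easier to vector-quantize.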