Layer Pruning via Fusible Residual Convolutional Block for Deep Neural
Networks
- URL: http://arxiv.org/abs/2011.14356v1
- Date: Sun, 29 Nov 2020 12:51:16 GMT
- Title: Layer Pruning via Fusible Residual Convolutional Block for Deep Neural
Networks
- Authors: Pengtao Xu, Jian Cao, Fanhua Shang, Wenyu Sun, Pu Li
- Abstract summary: Compared with filter and weight pruning, layer pruning yields lower inference time and run-time memory usage when the same FLOPs and number of parameters are pruned.
We propose a simple layer pruning method using a fusible residual convolutional block (ResConv).
Our pruning method achieves excellent compression and acceleration performance over the state of the art on different datasets.
- Score: 15.64167076052513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to deploy deep convolutional neural networks (CNNs) on
resource-limited devices, many model pruning methods for filters and weights
have been developed, while only a few focus on layer pruning. However, compared with
filter pruning and weight pruning, the compact model obtained by layer pruning
has lower inference time and run-time memory usage when the same FLOPs and
number of parameters are pruned, because less data is moved in memory. In this
paper, we propose a simple layer pruning method using fusible residual
convolutional block (ResConv), which is implemented by inserting shortcut
connection with a trainable information control parameter into a single
convolutional layer. Using ResConv structures during training improves network
accuracy and makes it possible to train deep plain networks, while adding no extra
computation during inference because ResConv is fused into an ordinary
convolutional layer after training. For layer pruning, we convert the convolutional
layers of the network into ResConv blocks with a layer scaling factor. During
training, L1 regularization is adopted to make the scaling factors sparse, so that
unimportant layers are automatically identified and then removed, resulting in a
model with fewer layers. Our pruning method achieves excellent compression and
acceleration performance over state-of-the-art methods on different datasets, and
needs no retraining at low pruning rates.
For example, with ResNet-110, we achieve a 65.5%-FLOPs reduction by removing
55.5% of the parameters, with only a small loss of 0.13% in top-1 accuracy on
CIFAR-10.
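Below is a minimal PyTorch sketch of one plausible reading of the abstract above, assuming the trainable information control parameter scales the convolutional branch (y = x + s * conv(x)); after training the shortcut and the factor are folded into a single ordinary convolution, and layers whose factor is driven toward zero by the L1 penalty collapse to identity and can be removed. All names (ResConvBlock, fuse, scale_l1_penalty) are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn


class ResConvBlock(nn.Module):
    """Single convolution with a gated identity shortcut: y = x + s * conv(x).
    Assumes in_channels == out_channels, stride 1 and an odd kernel size so
    the shortcut can later be folded into the kernel."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=kernel_size // 2)
        self.scale = nn.Parameter(torch.ones(1))  # trainable control parameter

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale * self.conv(x)

    def fuse(self) -> nn.Conv2d:
        """Fold the scale and the shortcut into one ordinary convolution."""
        fused = nn.Conv2d(self.conv.in_channels, self.conv.out_channels,
                          self.conv.kernel_size, padding=self.conv.padding)
        with torch.no_grad():
            w = self.scale * self.conv.weight            # scaled conv branch
            c = self.conv.kernel_size[0] // 2            # kernel centre
            for i in range(self.conv.out_channels):
                w[i, i, c, c] += 1.0                     # add identity shortcut
            fused.weight.copy_(w)
            fused.bias.copy_(self.scale * self.conv.bias)
        return fused


def scale_l1_penalty(model: nn.Module, weight: float = 1e-4):
    """L1 sparsity penalty on the layer scaling factors, added to the task loss.
    Layers whose factor ends up near zero reduce to identity and can be pruned."""
    return weight * sum(m.scale.abs().sum()
                        for m in model.modules()
                        if isinstance(m, ResConvBlock))
```

In use, scale_l1_penalty(model) would be added to the task loss during training; after training, blocks whose scale stays above a threshold are replaced by block.fuse(), and the remaining ones are dropped.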
Related papers
- LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from
Scratch [14.911305800463285]
We propose a novel framework named Layer Adaptive Progressive Pruning (LAPP).
LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network.
Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures.
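As a rough illustration of the learnable per-layer threshold mentioned above (LAPP's exact formulation is not reproduced here), the sketch below softly gates channels whose importance score, e.g. the absolute BatchNorm scale, falls below a trainable threshold; the temperature and all names are assumptions.

```python
import torch
import torch.nn as nn


class LearnableThresholdGate(nn.Module):
    """Per-layer trainable threshold that softly switches channels off when
    their importance score falls below it (illustrative only)."""

    def __init__(self, temperature: float = 0.05):
        super().__init__()
        self.threshold = nn.Parameter(torch.zeros(1))
        self.temperature = temperature

    def forward(self, x: torch.Tensor, scores: torch.Tensor) -> torch.Tensor:
        # scores: per-channel importance, e.g. |gamma| of the preceding BatchNorm
        gate = torch.sigmoid((scores - self.threshold) / self.temperature)
        return x * gate.view(1, -1, 1, 1)
```

A FLOPs constraint could then be imposed by penalizing the expected number of surviving channels per layer, weighted by that layer's per-channel cost.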
arXiv Detail & Related papers (2023-09-25T14:08:45Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and shrinks unimportant weights on the fly by a small amount proportional to their magnitude.
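A minimal sketch of the soft-shrinkage step summarized above: instead of hard-zeroing pruned weights, the smallest-magnitude fraction is multiplicatively shrunk a little at each iteration, so weights can still recover; the function name, ratio and shrink factor are illustrative assumptions.

```python
import torch


@torch.no_grad()
def soft_shrink_step(weight: torch.Tensor,
                     prune_ratio: float = 0.5,
                     shrink: float = 0.02) -> None:
    """Shrink the smallest-magnitude `prune_ratio` fraction of weights in place,
    by an amount proportional to their current magnitude (no hard zeroing)."""
    k = max(1, int(weight.numel() * prune_ratio))
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() <= threshold
    factor = 1.0 - shrink * mask.to(weight.dtype)  # 1 where kept, 1-shrink where shrunk
    weight.mul_(factor)
```

Such a step would typically be applied to each layer's weight tensor after every optimizer update.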
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Boosting Pruned Networks with Linear Over-parameterization [8.796518772724955]
Structured pruning compresses neural networks by reducing channels (filters) for fast inference and low footprint at run-time.
To restore accuracy after pruning, fine-tuning is usually applied to pruned networks.
We propose a novel method that first linearly over-parameterizes the compact layers in pruned networks to enlarge the number of fine-tuning parameters.
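A sketch of what linear over-parameterization can look like for a single fully connected layer, as an illustration rather than the paper's exact construction: the compact layer is temporarily replaced by two stacked linear layers for fine-tuning and afterwards contracted back by multiplying the two weight matrices, so inference cost is unchanged.

```python
import torch
import torch.nn as nn


class OverParamLinear(nn.Module):
    """Replaces one linear layer by two whose product has the same shape."""

    def __init__(self, layer: nn.Linear, expansion: int = 4):
        super().__init__()
        hidden = layer.out_features * expansion
        self.a = nn.Linear(layer.in_features, hidden, bias=False)
        self.b = nn.Linear(hidden, layer.out_features, bias=layer.bias is not None)
        with torch.no_grad():
            if layer.bias is not None:
                self.b.bias.copy_(layer.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.b(self.a(x))

    def contract(self) -> nn.Linear:
        """Merge the two layers back into one: W = W_b @ W_a."""
        merged = nn.Linear(self.a.in_features, self.b.out_features,
                           bias=self.b.bias is not None)
        with torch.no_grad():
            merged.weight.copy_(self.b.weight @ self.a.weight)
            if self.b.bias is not None:
                merged.bias.copy_(self.b.bias)
        return merged
```

In practice `a` and `b` would be initialized (e.g. via an SVD split) so that `b.weight @ a.weight` reproduces the original pruned weight before fine-tuning starts.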
arXiv Detail & Related papers (2022-04-25T05:30:26Z)
- End-to-End Sensitivity-Based Filter Pruning [49.61707925611295]
We present a sensitivity-based filter pruning algorithm (SbF-Pruner) to learn the importance scores of filters of each layer end-to-end.
Our method learns the scores from the filter weights, enabling it to account for the correlations between the filters of each layer.
arXiv Detail & Related papers (2022-04-15T10:21:05Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as well as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Basis Scaling and Double Pruning for Efficient Inference in Network-Based Transfer Learning [1.3467579878240454]
We decompose a convolutional layer into two layers: a convolutional layer with the orthonormal basis vectors as the filters, and a "BasisScalingConv" layer which is responsible for rescaling the features.
We can achieve pruning ratios up to 74.6% for CIFAR-10 and 98.9% for MNIST in model parameters.
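A sketch of such a decomposition under the assumption that the orthonormal basis comes from an SVD of the flattened filters: the first convolution uses the right singular vectors as filters, and a following 1x1 scaling convolution recombines the responses; dropping basis filters with small scaling factors then prunes the layer. The helper name is illustrative.

```python
import torch
import torch.nn as nn


def decompose_conv(conv: nn.Conv2d, keep=None):
    """Split `conv` into an orthonormal-basis conv followed by a 1x1 scaling conv.
    With keep=None the pair reproduces the original layer exactly; a smaller
    `keep` drops the least important basis filters."""
    c_out, c_in, kh, kw = conv.weight.shape
    w = conv.weight.detach().reshape(c_out, -1)          # (c_out, c_in*kh*kw)
    u, s, vh = torch.linalg.svd(w, full_matrices=False)  # w = u @ diag(s) @ vh
    r = keep or s.numel()

    basis = nn.Conv2d(c_in, r, (kh, kw), stride=conv.stride,
                      padding=conv.padding, bias=False)
    scaling = nn.Conv2d(r, c_out, 1, bias=conv.bias is not None)
    with torch.no_grad():
        basis.weight.copy_(vh[:r].reshape(r, c_in, kh, kw))          # orthonormal filters
        scaling.weight.copy_((u[:, :r] * s[:r]).reshape(c_out, r, 1, 1))
        if conv.bias is not None:
            scaling.bias.copy_(conv.bias)
    return nn.Sequential(basis, scaling)
```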
arXiv Detail & Related papers (2021-08-06T00:04:02Z)
- Pruning Neural Networks with Interpolative Decompositions [5.377278489623063]
We introduce a principled approach to neural network pruning that casts the problem as a structured low-rank matrix approximation.
We demonstrate how to prune a neural network by first building a set of primitives to prune a single fully connected or convolution layer.
We achieve an accuracy of 93.62 $\pm$ 0.36% using VGG-16 on CIFAR-10, with a 51% FLOPs reduction.
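The snippet below is a simplified illustration of pruning a fully connected layer with an interpolative decomposition (not the paper's exact primitives): column-pivoted QR selects which neurons to keep, and the discarded neurons are approximated as linear combinations of the kept ones, which are folded into the next layer's weights. Bias terms and the nonlinearity are ignored for brevity.

```python
import numpy as np
from scipy.linalg import qr


def id_prune_hidden(w1: np.ndarray, w2: np.ndarray, k: int):
    """Keep k hidden neurons between two linear maps z = w2 @ (w1 @ x).
    Rows of w1 (neurons) are approximated by an interpolative decomposition,
    w1 ~ t @ w1[keep], so that w2 @ w1 ~ (w2 @ t) @ w1[keep]."""
    _, _, piv = qr(w1.T, mode='economic', pivoting=True)  # pivoted QR picks rows of w1
    keep = np.sort(piv[:k])
    t = w1 @ np.linalg.pinv(w1[keep])                     # interpolation matrix (h, k)
    return w1[keep], w2 @ t                               # pruned layer 1, adjusted layer 2
```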
arXiv Detail & Related papers (2021-07-30T20:13:49Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- UCP: Uniform Channel Pruning for Deep Convolutional Neural Networks Compression and Acceleration [24.42067007684169]
We propose a novel uniform channel pruning (UCP) method to prune deep CNNs.
Unimportant channels, together with the convolutional kernels related to them, are pruned directly.
We verify our method on CIFAR-10, CIFAR-100 and ILSVRC-2012 for image classification.
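A minimal sketch of direct channel pruning in the spirit summarized above: output channels of one convolution are ranked by a simple L1 criterion and removed, together with the kernels of the following convolution that consume them. The criterion, the keep ratio and the function name are assumptions, and any BatchNorm between the two layers is ignored.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def prune_channels(conv: nn.Conv2d, next_conv: nn.Conv2d, keep_ratio: float = 0.5):
    """Drop the least important output channels of `conv` and the matching
    input kernels of `next_conv` (assumes the two are directly connected)."""
    importance = conv.weight.abs().sum(dim=(1, 2, 3))        # L1 norm per filter
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    keep = torch.argsort(importance, descending=True)[:n_keep].sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.copy_(conv.weight[keep])
    if conv.bias is not None:
        pruned.bias.copy_(conv.bias[keep])

    pruned_next = nn.Conv2d(n_keep, next_conv.out_channels, next_conv.kernel_size,
                            stride=next_conv.stride, padding=next_conv.padding,
                            bias=next_conv.bias is not None)
    pruned_next.weight.copy_(next_conv.weight[:, keep])      # drop matching input kernels
    if next_conv.bias is not None:
        pruned_next.bias.copy_(next_conv.bias)
    return pruned, pruned_next
```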
arXiv Detail & Related papers (2020-10-03T01:51:06Z)
- ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting [105.97936163854693]
We propose ResRep, which slims down a CNN by reducing the width (number of output channels) of convolutional layers.
Inspired by neurobiological research on the independence of remembering and forgetting, we propose to re-parameterize a CNN into remembering parts and forgetting parts.
We equivalently merge the remembering and forgetting parts into the original architecture with narrower layers.
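One hedged reading of the re-parameterization above, assuming the forgetting part is a 1x1 "compactor" convolution appended after an ordinary convolution: rows of the compactor that a sparsity penalty drives to zero are dropped, and the surviving compactor is merged back into the convolution, giving an equivalent but narrower layer. Names are illustrative, the compactor is assumed to have no bias, and BatchNorm folding is omitted.

```python
import torch
import torch.nn as nn


@torch.no_grad()
def merge_compactor(conv: nn.Conv2d, compactor: nn.Conv2d,
                    keep: torch.Tensor) -> nn.Conv2d:
    """Fold a 1x1 `compactor` (the 'forgetting' part) into the preceding `conv`,
    keeping only the compactor rows listed in `keep` (indices of rows whose norm
    survived the sparsity penalty) -> a narrower, equivalent convolution."""
    cw = compactor.weight[keep, :, 0, 0]                      # (n_keep, c_mid)
    merged = nn.Conv2d(conv.in_channels, keep.numel(), conv.kernel_size,
                       stride=conv.stride, padding=conv.padding, bias=True)
    merged.weight.copy_(torch.einsum('om,mikl->oikl', cw, conv.weight))
    bias = torch.zeros(conv.out_channels) if conv.bias is None else conv.bias
    merged.bias.copy_(cw @ bias)
    return merged
```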
arXiv Detail & Related papers (2020-07-07T07:56:45Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
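A toy sketch of the hypernetwork idea, with assumed names and sizes rather than DHP's actual parameterization: each layer holds latent vectors for its output (and input) channels, and a small generator maps each latent pair to a k x k kernel, so removing a latent removes a channel and the pruning stays differentiable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HyperConv2d(nn.Module):
    """Conv layer whose weights are generated from per-channel latent vectors."""

    def __init__(self, c_in: int, c_out: int, k: int = 3, latent_dim: int = 8):
        super().__init__()
        self.k = k
        self.z_in = nn.Parameter(torch.randn(c_in, latent_dim))    # input-channel latents
        self.z_out = nn.Parameter(torch.randn(c_out, latent_dim))  # output-channel latents
        self.generator = nn.Linear(2 * latent_dim, k * k)          # latent pair -> kernel

    def weight(self) -> torch.Tensor:
        c_out, c_in = self.z_out.shape[0], self.z_in.shape[0]
        zo = self.z_out.unsqueeze(1).expand(c_out, c_in, -1)
        zi = self.z_in.unsqueeze(0).expand(c_out, c_in, -1)
        w = self.generator(torch.cat([zo, zi], dim=-1))            # (c_out, c_in, k*k)
        return w.view(c_out, c_in, self.k, self.k)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.conv2d(x, self.weight(), padding=self.k // 2)
```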
arXiv Detail & Related papers (2020-03-30T17:59:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.