LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from
Scratch
- URL: http://arxiv.org/abs/2309.14157v1
- Date: Mon, 25 Sep 2023 14:08:45 GMT
- Title: LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from
Scratch
- Authors: Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu
- Abstract summary: We propose a novel framework named Layer Adaptive Progressive Pruning (LAPP).
LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and a FLOPs constraint for the network.
Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures.
- Score: 14.911305800463285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Structured pruning is a commonly used convolutional neural network (CNN)
compression approach. Pruning rate setting is a fundamental problem in
structured pruning. Most existing works either introduce too many additional
learnable parameters to assign different pruning rates across the layers of a
CNN, or cannot control the compression rate explicitly. Moreover, since an
overly narrow network blocks the information flow needed for training,
automatic pruning rate setting cannot explore a high pruning rate for a
specific layer. To overcome these limitations, we
propose a novel framework named Layer Adaptive Progressive Pruning (LAPP),
which gradually compresses the network during the first few epochs of training
from scratch. In particular, LAPP designs an effective and efficient pruning
strategy that introduces a learnable threshold for each layer and a FLOPs
constraint for the network. Guided by both the task loss and the FLOPs
constraint, the learnable thresholds are dynamically and gradually updated to
accommodate changes in importance scores during training. Therefore, the
pruning strategy can gradually prune the network and automatically determine
the appropriate pruning rate for each layer. Moreover, to maintain the
expressive power of each pruned layer, we introduce, before training starts, an
additional lightweight bypass for each convolutional layer to be pruned, which
adds only a small overhead. Our method demonstrates superior performance
gains over previous compression methods on various datasets and backbone
architectures. For example, on CIFAR-10, our method compresses ResNet-20 to
40.3% without an accuracy drop. On ImageNet, it reduces the FLOPs of ResNet-18
by 55.6% with a 0.21% top-1 and a 0.40% top-5 accuracy increase.
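To make the mechanism concrete, here is a minimal sketch of how a learnable per-layer threshold can be trained jointly with a FLOPs constraint. This illustrates the general idea rather than the authors' code; the sigmoid soft mask and the names (importance, flops_cost, budget) are assumptions.

import torch

def soft_masks(importance, thresholds, temperature=10.0):
    # importance[l]: per-filter importance scores of layer l (e.g., BN scales)
    # thresholds[l]: learnable scalar threshold of layer l
    return [torch.sigmoid(temperature * (s - t))
            for s, t in zip(importance, thresholds)]

def flops_penalty(masks, flops_per_filter, budget):
    # Expected FLOPs = sum over layers of kept filters times per-filter cost.
    expected = sum((m * f).sum() for m, f in zip(masks, flops_per_filter))
    return torch.relu(expected / budget - 1.0)  # penalize only overshoot

# Toy example: two layers with random importance scores.
importance = [torch.rand(16), torch.rand(32)]
thresholds = [torch.zeros(1, requires_grad=True) for _ in importance]
flops_cost = [torch.full((16,), 3e6), torch.full((32,), 1e6)]

masks = soft_masks(importance, thresholds)
loss = flops_penalty(masks, flops_cost, budget=2e7)  # plus task loss in practice
loss.backward()  # gradients flow into the per-layer thresholds

In a full training loop the penalty would be added to the task loss, and filters whose mask falls to zero would be removed once training stabilizes.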
Related papers
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights with a small amount proportional to their magnitude scale on-the-fly.
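A minimal sketch of what such an on-the-fly soft-shrinkage step could look like (my reading of the summary, not the authors' implementation; the fraction p and decay factor are illustrative):

import torch

def soft_shrink_(weight, p=0.3, decay=0.99):
    flat = weight.abs().flatten()
    k = max(1, int(p * flat.numel()))
    threshold = flat.kthvalue(k).values          # magnitude of the k-th smallest
    unimportant = weight.abs() <= threshold
    weight[unimportant] *= decay                 # shrink proportionally, in place
    return weight

w = torch.randn(64, 3, 3, 3)
for _ in range(100):                             # called once per training step
    soft_shrink_(w)

Unlike hard zeroing, the shrunken weights keep a nonzero magnitude, so a weight judged unimportant early on can still recover later in training.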
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Learning a Consensus Sub-Network with Polarization Regularization and
One Pass Training [3.2214522506924093]
Pruning schemes create extra overhead either by iterative training and fine-tuning for static pruning or repeated computation of a dynamic pruning graph.
We propose a new parameter pruning strategy for learning a lighter-weight sub-network that minimizes the energy cost while maintaining comparable performance to the fully parameterised network on given downstream tasks.
Our results on CIFAR-10 and CIFAR-100 suggest that our scheme can remove 50% of connections in deep networks with less than 1% reduction in classification accuracy.
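For context, one standard form of a polarization penalty from prior pruning work looks as follows (the paper's exact regularizer may differ): the L1 term pulls channel gates toward zero while the deviation term pushes them apart, separating gates into a prunable near-zero group and an active group.

import torch

def polarization_penalty(gamma, t=1.2):
    # gamma: per-channel scaling factors (e.g., BatchNorm weights)
    return t * gamma.abs().sum() - (gamma - gamma.mean()).abs().sum()

gamma = torch.rand(64, requires_grad=True)
loss = polarization_penalty(gamma)   # added to the task loss during training
loss.backward()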
arXiv Detail & Related papers (2023-02-17T09:37:17Z) - Boosting Pruned Networks with Linear Over-parameterization [8.796518772724955]
Structured pruning compresses neural networks by reducing channels (filters) for fast inference and low footprint at run-time.
To restore accuracy after pruning, fine-tuning is usually applied to pruned networks.
We propose a novel method that first linearly over-parameterizes the compact layers in pruned networks to enlarge the number of fine-tuning parameters.
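A hedged sketch of the underlying reparameterization trick (the paper's exact construction may differ): one conv is expanded into two stacked convs with no nonlinearity in between, fine-tuned with the extra parameters, and then folded back into a single conv.

import torch
import torch.nn as nn

c_in, c_out, k, m = 16, 32, 3, 64           # m > c_out over-parameterizes

expand = nn.Sequential(                      # trained in place of one conv
    nn.Conv2d(c_in, m, k, padding=1, bias=False),
    nn.Conv2d(m, c_out, 1, bias=False),      # 1x1 conv, still purely linear
)

def fold(seq):
    w1 = seq[0].weight                          # (m, c_in, k, k)
    w2 = seq[1].weight.squeeze(-1).squeeze(-1)  # (c_out, m)
    fused = nn.Conv2d(c_in, c_out, k, padding=1, bias=False)
    fused.weight.data = torch.einsum("om,mikl->oikl", w2, w1)
    return fused

x = torch.randn(1, c_in, 8, 8)
assert torch.allclose(expand(x), fold(expand)(x), atol=1e-5)

Because both layers are linear, the composition is mathematically a single conv, so the fold loses nothing at inference time.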
arXiv Detail & Related papers (2022-04-25T05:30:26Z) - End-to-End Sensitivity-Based Filter Pruning [49.61707925611295]
We present a sensitivity-based filter pruning algorithm (SbF-Pruner) to learn the importance scores of filters of each layer end-to-end.
Our method learns the scores from the filter weights, enabling it to account for the correlations between the filters of each layer.
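A speculative sketch of scoring filters from their own weights (the scorer architecture here is an assumption; only the general idea of weight-derived, end-to-end-trained scores comes from the summary):

import torch
import torch.nn as nn

class ScoredConv(nn.Module):
    def __init__(self, c_in, c_out, k):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, padding=k // 2)
        self.scorer = nn.Linear(c_in * k * k, 1)   # shared across filters

    def forward(self, x):
        flat = self.conv.weight.flatten(1)          # (c_out, c_in*k*k)
        gates = torch.sigmoid(self.scorer(flat))    # one learned gate per filter
        return self.conv(x) * gates.view(1, -1, 1, 1)

y = ScoredConv(3, 16, 3)(torch.randn(2, 3, 32, 32))
print(y.shape)  # torch.Size([2, 16, 32, 32])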
arXiv Detail & Related papers (2022-04-15T10:21:05Z) - Basis Scaling and Double Pruning for Efficient Inference in
Network-Based Transfer Learning [1.3467579878240454]
We decompose a convolutional layer into two layers: a convolutional layer with the orthonormal basis vectors as the filters, and a "BasisScalingConv" layer which is responsible for rescaling the features.
We can achieve pruning ratios up to 74.6% for CIFAR-10 and 98.9% for MNIST in model parameters.
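An illustrative sketch of such a decomposition via SVD (the use of SVD and the pruning criterion here are my assumptions): the first conv holds orthonormal basis filters, the second rescales and recombines them, and basis vectors with small singular values can be pruned from both layers.

import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, 3, padding=1, bias=False)
w = conv.weight.data.flatten(1)                      # (32, 16*3*3)
u, s, vh = torch.linalg.svd(w, full_matrices=False)  # vh rows: orthonormal basis

keep = s > 0.1 * s.max()                             # prune weak basis vectors
r = int(keep.sum())

basis = nn.Conv2d(16, r, 3, padding=1, bias=False)   # orthonormal filters
basis.weight.data = vh[keep].view(r, 16, 3, 3)
rescale = nn.Conv2d(r, 32, 1, bias=False)            # rescales and recombines
rescale.weight.data = (u[:, keep] * s[keep]).view(32, r, 1, 1)

x = torch.randn(1, 16, 8, 8)
err = (conv(x) - rescale(basis(x))).abs().max()      # small if few bases dropped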
arXiv Detail & Related papers (2021-08-06T00:04:02Z) - Dynamic Probabilistic Pruning: A general framework for
hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps)
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP)
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
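One common way to make such stochastic masks differentiable is the Gumbel trick with a straight-through estimator; the sketch below (filter granularity, sampler details assumed) illustrates the flavor rather than DPP's exact algorithm.

import torch

def sample_mask(logits, k, tau=1.0):
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))
    soft = torch.softmax((logits + gumbel) / tau, dim=0)
    hard = torch.zeros_like(soft)
    hard[soft.topk(k).indices] = 1.0
    return hard + soft - soft.detach()   # straight-through: hard fwd, soft bwd

logits = torch.zeros(32, requires_grad=True)   # one logit per filter
mask = sample_mask(logits, k=16)               # k-hot mask over 32 filters
mask.sum().backward()                          # gradients reach the logits

The same construction applies at weight or kernel granularity by changing what each logit indexes.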
arXiv Detail & Related papers (2021-05-26T17:01:52Z) - Layer Pruning via Fusible Residual Convolutional Block for Deep Neural
Networks [15.64167076052513]
Layer pruning has lower inference time and runtime memory usage than channel pruning when the same FLOPs and number of parameters are pruned.
We propose a simple layer pruning method using a residual convolutional block (ResConv).
Our pruning method achieves excellent compression and acceleration performance over the state-of-the-art methods on different datasets.
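The "fusible" property can be illustrated with a standard structural reparameterization (a generic example; ResConv's exact block design may differ): an identity shortcut around a 3x3 conv collapses into the conv weights at inference time.

import torch
import torch.nn as nn

conv = nn.Conv2d(8, 8, 3, padding=1, bias=False)   # needs c_in == c_out, stride 1

fused = nn.Conv2d(8, 8, 3, padding=1, bias=False)
fused.weight.data = conv.weight.data.clone()
for c in range(8):
    fused.weight.data[c, c, 1, 1] += 1.0           # identity as a center tap

x = torch.randn(1, 8, 16, 16)
assert torch.allclose(conv(x) + x, fused(x), atol=1e-5)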
arXiv Detail & Related papers (2020-11-29T12:51:16Z) - Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive
Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
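A highly simplified sketch of generating a pruning mask as a function of a target dataset (the mean-pooled set encoding and the MLP are my placeholders; STAMP's meta-learned generator is more elaborate):

import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    def __init__(self, feat_dim, n_channels):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_channels))

    def forward(self, features):                 # features: (n_examples, feat_dim)
        set_embedding = features.mean(dim=0)     # permutation-invariant summary
        return torch.sigmoid(self.mlp(set_embedding))  # per-channel keep prob.

gen = MaskGenerator(feat_dim=512, n_channels=64)
target_feats = torch.randn(100, 512)             # embeddings of target data
mask = gen(target_feats)                         # dataset-conditioned mask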
arXiv Detail & Related papers (2020-06-22T10:57:43Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
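An abstract sketch of FLOPs-guided channel adjustment (the utilization scores and the update rule below are placeholders, not the paper's definitions): layers that use their FLOPs well gain channels, others shrink, under a fixed total budget.

channels = [16, 32, 64]                 # current channel counts per layer
flops_per_ch = [4e6, 2e6, 1e6]          # cost of one channel in each layer
utilization = [0.9, 0.4, 0.7]           # assumed per-layer utilization scores

budget = sum(c * f for c, f in zip(channels, flops_per_ch))
mean_u = sum(utilization) / len(utilization)

# Grow well-utilized layers, shrink poorly utilized ones, then renormalize
# so the total FLOPs stay on budget.
adjusted = [c * (1 + 0.2 * (u - mean_u)) for c, u in zip(channels, utilization)]
scale = budget / sum(c * f for c, f in zip(adjusted, flops_per_ch))
channels = [max(1, round(c * scale)) for c in adjusted]
print(channels)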
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
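A condensed sketch of the hypernetwork idea (the linear hypernetwork and the shapes are simplifications of DHP): a latent vector generates the conv weight, so sparsifying the latent vector prunes output channels differentiably.

import torch
import torch.nn as nn
import torch.nn.functional as F

c_in, c_out, k = 8, 16, 3
latent = torch.randn(c_out, requires_grad=True)   # one entry per output channel
hyper = nn.Linear(1, c_in * k * k, bias=False)    # maps latent entry -> filter

def generated_conv(x):
    w = hyper(latent.unsqueeze(1)).view(c_out, c_in, k, k)
    return F.conv2d(x, w, padding=1)

y = generated_conv(torch.randn(1, c_in, 8, 8))
# An L1 penalty on `latent` drives entries to zero; a zeroed entry generates
# an all-zero filter, i.e., a pruned output channel.
sparsity = latent.abs().sum()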
arXiv Detail & Related papers (2020-03-30T17:59:18Z) - Gradual Channel Pruning while Training using Feature Relevance Scores
for Convolutional Neural Networks [6.534515590778012]
Pruning is one of the predominant approaches used for deep network compression.
We present a simple yet effective methodology for gradual channel pruning while training, using a novel data-driven metric.
We demonstrate the effectiveness of the proposed methodology on architectures such as VGG and ResNet.
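A small sketch of such a pruning step (mean absolute activation stands in for the paper's feature relevance metric, which is an assumption of mine): every few epochs, the lowest-scoring channels are zeroed out while training continues.

import torch

def channel_scores(feature_maps):
    # feature_maps: (batch, channels, H, W) collected on training data
    return feature_maps.abs().mean(dim=(0, 2, 3))   # one score per channel

def prune_step(conv_weight, feature_maps, frac=0.05):
    scores = channel_scores(feature_maps)
    n_prune = max(1, int(frac * scores.numel()))
    idx = scores.argsort()[:n_prune]                # least relevant channels
    conv_weight[idx] = 0.0                          # mask their filters
    return idx

w = torch.randn(32, 16, 3, 3)
feats = torch.randn(8, 32, 14, 14)
pruned = prune_step(w, feats)   # called periodically during training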
arXiv Detail & Related papers (2020-02-23T17:56:18Z)