Network Automatic Pruning: Start NAP and Take a Nap
- URL: http://arxiv.org/abs/2101.06608v1
- Date: Sun, 17 Jan 2021 07:09:19 GMT
- Title: Network Automatic Pruning: Start NAP and Take a Nap
- Authors: Wenyuan Zeng, Yuwen Xiong, Raquel Urtasun
- Abstract summary: We propose NAP, a unified and automatic pruning framework for both fine-grained and structured pruning.
It identifies unimportant components of a network and automatically decides appropriate compression ratios for different layers.
Despite being simple to use, NAP outperforms previous pruning methods by large margins.
- Score: 94.14675930881366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network pruning can significantly reduce the computation and memory footprint
of large neural networks. To achieve a good trade-off between model size and
performance, popular pruning techniques usually rely on hand-crafted heuristics
and require manually setting the compression ratio for each layer. This process
is typically time-consuming and requires expert knowledge to achieve good
results. In this paper, we propose NAP, a unified and automatic pruning
framework for both fine-grained and structured pruning. It identifies
unimportant components of a network and automatically decides appropriate
compression ratios for different layers, based on a theoretically sound
criterion. Towards this goal, NAP uses an efficient approximation of the
Hessian to evaluate the importance of components, based on the
Kronecker-factored Approximate Curvature (K-FAC) method. Despite being simple to use,
NAP outperforms previous pruning methods by large margins. For fine-grained
pruning, NAP can compress AlexNet and VGG16 by 25x, and ResNet-50 by 6.7x
without loss in accuracy on ImageNet. For structured pruning (e.g. channel
pruning), it can reduce the FLOPs of VGG16 by 5.4x and ResNet-50 by 2.3x with only
1% accuracy drop. More importantly, this method is almost free from
hyper-parameter tuning and requires no expert knowledge. You can start NAP and
then take a nap!
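The abstract's key technical ingredient is scoring components with an efficient Hessian approximation based on Kronecker-factored Approximate Curvature. The paper's exact criterion is not reproduced above, so the following is only a minimal NumPy sketch of the general OBS-style idea for a single fully connected layer; the shapes, the damping value, and the global-quantile pruning rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical single fully connected layer W of shape (out_dim, in_dim).
# `acts` are the layer's inputs over a batch; `grads` are back-propagated
# gradients w.r.t. the layer's pre-activations.
rng = np.random.default_rng(0)
batch, in_dim, out_dim = 256, 64, 32
W = rng.normal(size=(out_dim, in_dim))
acts = rng.normal(size=(batch, in_dim))
grads = rng.normal(size=(batch, out_dim))

# Kronecker factors of the curvature for this layer: H ~= A (kron) G,
# with A the input covariance and G the gradient covariance.
A = acts.T @ acts / batch          # (in_dim, in_dim)
G = grads.T @ grads / batch        # (out_dim, out_dim)

# Damping keeps the factors invertible, as is standard for K-FAC.
eps = 1e-3
A_inv = np.linalg.inv(A + eps * np.eye(in_dim))
G_inv = np.linalg.inv(G + eps * np.eye(out_dim))

# OBS-style saliency: s_ij = w_ij^2 / (2 * [H^-1]_(ij,ij)).
# Under the Kronecker approximation, the diagonal of H^-1 factorizes into
# the diagonals of G_inv and A_inv.
h_inv_diag = np.outer(np.diag(G_inv), np.diag(A_inv))   # (out_dim, in_dim)
saliency = W ** 2 / (2.0 * h_inv_diag)

# Prune the weights with the smallest saliency, ranked globally.
prune_ratio = 0.5
threshold = np.quantile(saliency, prune_ratio)
mask = (saliency > threshold).astype(W.dtype)
W_pruned = W * mask
print(f"kept {mask.mean():.1%} of weights")
```

In a full pipeline, saliencies from all layers would be ranked jointly, which is how a criterion of this kind can assign different compression ratios to different layers without per-layer hyper-parameters.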
Related papers
- LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch [14.911305800463285]
We propose a novel framework named Layer Adaptive Progressive Pruning (LAPP).
LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network.
Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures.
arXiv Detail & Related papers (2023-09-25T14:08:45Z)
- Neural Network Pruning by Cooperative Coevolution [16.0753044050118]
We propose CCEP, a new filter pruning algorithm based on cooperative coevolution.
CCEP reduces the pruning space by a divide-and-conquer strategy.
Experiments show that CCEP achieves performance competitive with state-of-the-art pruning methods.
arXiv Detail & Related papers (2022-04-12T09:06:38Z)
- Structured Pruning is All You Need for Pruning CNNs at Initialization [38.88730369884401]
Pruning is a popular technique for reducing the model size and computational cost of convolutional neural networks (CNNs).
We propose PreCropping, a structured hardware-efficient model compression scheme.
Compared to weight pruning, the proposed scheme is regular and dense in both storage and computation without sacrificing accuracy.
arXiv Detail & Related papers (2022-03-04T19:54:31Z)
- MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models [78.45898846056303]
Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models.
We develop a novel MultiLevel structured Pruning framework, which uses three different levels of structured pruning: head pruning, row pruning, and block-wise sparse pruning.
arXiv Detail & Related papers (2021-05-30T22:00:44Z)
- Dynamic Probabilistic Pruning: A general framework for hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps); a brief comparison of these granularities is sketched after this list.
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP).
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
arXiv Detail & Related papers (2021-05-26T17:01:52Z)
- Single-path Bit Sharing for Automatic Loss-aware Model Compression [126.98903867768732]
Single-path Bit Sharing (SBS) is able to significantly reduce computational cost while achieving promising performance.
Our SBS-compressed MobileNetV2 achieves a 22.6x Bit-Operation (BOP) reduction with only a 0.1% drop in Top-1 accuracy.
arXiv Detail & Related papers (2021-01-13T08:28:21Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation [4.254099382808598]
Existing methods often iteratively prune networks to attain high compression ratio without incurring significant loss in performance.
We show that strong ensembles can be constructed from snapshots of iterative pruning, which achieve competitive performance and vary in network structure.
On standard image classification benchmarks such as CIFAR and Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured pruning by integrating simple l1-norm filter pruning into our pipeline.
arXiv Detail & Related papers (2020-06-20T03:59:46Z)
- Learned Threshold Pruning [15.394473766381518]
Our method learns per-layer thresholds via gradient descent, unlike conventional methods where they are set as input.
It takes 30 epochs for the method to prune ResNet50 on ImageNet by a factor of 9.1.
We also show that the method effectively prunes modern compact architectures such as EfficientNet, MobileNetV2 and MixNet; a soft-threshold sketch of the learned-threshold idea appears after this list.
arXiv Detail & Related papers (2020-02-28T21:32:39Z)
- A "Network Pruning Network" Approach to Deep Model Compression [62.68120664998911]
We present a filter pruning approach for deep model compression using a multitask network.
Our approach is based on learning a pruner network to prune a pre-trained target network.
The compressed model produced by our approach is generic and does not need any special hardware/software support.
arXiv Detail & Related papers (2020-01-15T20:38:23Z)
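As referenced in the Dynamic Probabilistic Pruning entry above, several of these works differ mainly in pruning granularity (weights, kernels, filters/feature maps). The sketch below is not DPP itself, only a hedged NumPy illustration of how magnitude-based masks change with granularity for one convolutional weight tensor; the tensor shape, the L1-norm scoring, and the 50% ratio are assumptions.

```python
import numpy as np

# Hypothetical conv weight of shape (out_channels, in_channels, kH, kW).
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8, 3, 3))
prune_ratio = 0.5

def mask_from_scores(scores, ratio):
    """Keep the entries of `scores` above the `ratio` quantile."""
    return scores > np.quantile(scores, ratio)

# Weight-level (fine-grained): each scalar weight is kept or dropped on its own.
weight_mask = mask_from_scores(np.abs(W), prune_ratio)              # (16, 8, 3, 3)

# Kernel-level: each (kH, kW) kernel is scored by its L1 norm.
kernel_scores = np.abs(W).sum(axis=(2, 3))                          # (16, 8)
kernel_mask = mask_from_scores(kernel_scores, prune_ratio)[..., None, None]

# Filter-level (structured): each output filter is scored as a whole, so
# pruning removes entire feature maps and stays hardware-friendly.
filter_scores = np.abs(W).sum(axis=(1, 2, 3))                       # (16,)
filter_mask = mask_from_scores(filter_scores, prune_ratio)[:, None, None, None]

for name, m in [("weight", weight_mask), ("kernel", kernel_mask), ("filter", filter_mask)]:
    kept = np.broadcast_to(m, W.shape).mean()
    print(f"{name:>6}-level mask keeps {kept:.1%} of weights")
```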
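The Learned Threshold Pruning and LAPP entries both describe treating each layer's pruning threshold as a trainable parameter. Their exact formulations are not given above, so this PyTorch sketch only shows one common way to make a magnitude threshold differentiable (a sigmoid gate around a learnable threshold); the gate form, temperature, penalty weight, and layer type are assumptions, not the published methods.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdLinear(nn.Module):
    """Linear layer whose weights are gated by a learnable magnitude threshold.

    A sigmoid of (|w| - threshold) acts as a soft keep/drop gate, so the
    threshold receives gradients and trains jointly with the weights.
    """

    def __init__(self, in_features, out_features, temperature=100.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.threshold = nn.Parameter(torch.tensor(1e-2))  # one threshold per layer
        self.temperature = temperature

    def soft_mask(self):
        w = self.linear.weight
        return torch.sigmoid(self.temperature * (w.abs() - self.threshold))

    def forward(self, x):
        masked_weight = self.linear.weight * self.soft_mask()
        return F.linear(x, masked_weight, self.linear.bias)

# Usage: an L1-style penalty on the soft mask pushes the threshold up (more
# sparsity), while the task loss pushes back.
layer = SoftThresholdLinear(64, 32)
x = torch.randn(8, 64)
out = layer(x)
loss = out.pow(2).mean() + 1e-3 * layer.soft_mask().mean()
loss.backward()
print(f"soft keep-rate: {layer.soft_mask().mean().item():.2f}")
```

At inference, the soft gate would be rounded (or the temperature annealed) to obtain a hard pruning mask.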
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.