Network Automatic Pruning: Start NAP and Take a Nap
- URL: http://arxiv.org/abs/2101.06608v1
- Date: Sun, 17 Jan 2021 07:09:19 GMT
- Title: Network Automatic Pruning: Start NAP and Take a Nap
- Authors: Wenyuan Zeng, Yuwen Xiong, Raquel Urtasun
- Abstract summary: We propose NAP, a unified and automatic pruning framework for both fine-grained and structured pruning.
It identifies unimportant components of a network and automatically decides appropriate compression ratios for different layers.
Despite being simple to use, NAP outperforms previous pruning methods by large margins.
- Score: 94.14675930881366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network pruning can significantly reduce the computation and memory footprint
of large neural networks. To achieve a good trade-off between model size and
performance, popular pruning techniques usually rely on hand-crafted heuristics
and require manually setting the compression ratio for each layer. This process
is typically time-consuming and requires expert knowledge to achieve good
results. In this paper, we propose NAP, a unified and automatic pruning
framework for both fine-grained and structured pruning. It identifies
unimportant components of a network and automatically decides appropriate
compression ratios for different layers, based on a theoretically sound
criterion. Towards this goal, NAP uses an efficient approximation of the
Hessian to evaluate the importance of components, based on the
Kronecker-factored Approximate Curvature (K-FAC) method. Despite being simple to use,
NAP outperforms previous pruning methods by large margins. For fine-grained
pruning, NAP can compress AlexNet and VGG16 by 25x, and ResNet-50 by 6.7x
without loss in accuracy on ImageNet. For structured pruning (e.g. channel
pruning), it can reduce the FLOPs of VGG16 by 5.4x and ResNet-50 by 2.3x with only
1% accuracy drop. More importantly, this method is almost free from
hyper-parameter tuning and requires no expert knowledge. You can start NAP and
then take a nap!
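The abstract's key technical ingredient is scoring components with an efficient Hessian approximation based on Kronecker-factored Approximate Curvature. The paper's exact criterion is not reproduced above, so the following is only a minimal NumPy sketch of the general OBS-style idea for a single fully connected layer; the shapes, the damping value, and the global-quantile pruning rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical single fully connected layer W of shape (out_dim, in_dim).
# `acts` are the layer's inputs over a batch; `grads` are back-propagated
# gradients w.r.t. the layer's pre-activations.
rng = np.random.default_rng(0)
batch, in_dim, out_dim = 256, 64, 32
W = rng.normal(size=(out_dim, in_dim))
acts = rng.normal(size=(batch, in_dim))
grads = rng.normal(size=(batch, out_dim))

# Kronecker factors of the curvature for this layer: H ~= A (kron) G,
# with A the input covariance and G the gradient covariance.
A = acts.T @ acts / batch          # (in_dim, in_dim)
G = grads.T @ grads / batch        # (out_dim, out_dim)

# Damping keeps the factors invertible, as is standard for K-FAC.
eps = 1e-3
A_inv = np.linalg.inv(A + eps * np.eye(in_dim))
G_inv = np.linalg.inv(G + eps * np.eye(out_dim))

# OBS-style saliency: s_ij = w_ij^2 / (2 * [H^-1]_(ij,ij)).
# Under the Kronecker approximation, the diagonal of H^-1 factorizes into
# the diagonals of G_inv and A_inv.
h_inv_diag = np.outer(np.diag(G_inv), np.diag(A_inv))   # (out_dim, in_dim)
saliency = W ** 2 / (2.0 * h_inv_diag)

# Prune the weights with the smallest saliency, ranked globally.
prune_ratio = 0.5
threshold = np.quantile(saliency, prune_ratio)
mask = (saliency > threshold).astype(W.dtype)
W_pruned = W * mask
print(f"kept {mask.mean():.1%} of weights")
```

In a full pipeline, saliencies from all layers would be ranked jointly, which is how a criterion of this kind can assign different compression ratios to different layers without per-layer hyper-parameters.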
Related papers
- LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch [14.911305800463285]
We propose a novel framework named Layer Adaptive Progressive Pruning (LAPP).
LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network.
Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures.
arXiv Detail & Related papers (2023-09-25T14:08:45Z)
- Neural Network Pruning by Cooperative Coevolution [16.0753044050118]
We propose CCEP, a new filter pruning algorithm based on cooperative coevolution.
CCEP reduces the pruning space by a divide-and-conquer strategy.
Experiments show that CCEP achieves performance competitive with state-of-the-art pruning methods.
arXiv Detail & Related papers (2022-04-12T09:06:38Z)
- Structured Pruning is All You Need for Pruning CNNs at Initialization [38.88730369884401]
Pruning is a popular technique for reducing the model size and computational cost of convolutional neural networks (CNNs).
We propose PreCropping, a structured hardware-efficient model compression scheme.
Compared to weight pruning, the proposed scheme is regular and dense in both storage and computation without sacrificing accuracy.
arXiv Detail & Related papers (2022-03-04T19:54:31Z)
- MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models [78.45898846056303]
Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models.
We develop a novel MultiLevel structured Pruning framework, which uses three different levels of structured pruning: head pruning, row pruning, and block-wise sparse pruning.
arXiv Detail & Related papers (2021-05-30T22:00:44Z)
- Dynamic Probabilistic Pruning: A general framework for hardware-constrained pruning at different granularities [80.06422693778141]
We propose a flexible new pruning mechanism that facilitates pruning at different granularities (weights, kernels, filters/feature maps); a brief comparison of these granularities is sketched after this list.
We refer to this algorithm as Dynamic Probabilistic Pruning (DPP).
We show that DPP achieves competitive compression rates and classification accuracy when pruning common deep learning models trained on different benchmark datasets for image classification.
arXiv Detail & Related papers (2021-05-26T17:01:52Z)
- Single-path Bit Sharing for Automatic Loss-aware Model Compression [126.98903867768732]
Single-path Bit Sharing (SBS) is able to significantly reduce computational cost while achieving promising performance.
Our SBS-compressed MobileNetV2 achieves a 22.6x Bit-Operation (BOP) reduction with only a 0.1% drop in Top-1 accuracy.
arXiv Detail & Related papers (2021-01-13T08:28:21Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- Paying more attention to snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation [4.254099382808598]
Existing methods often iteratively prune networks to attain high compression ratio without incurring significant loss in performance.
We show that strong ensembles can be constructed from snapshots of iterative pruning, which achieve competitive performance and vary in network structure.
On standard image classification benchmarks such as CIFAR and Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured pruning by integrating simple l1-norm filter pruning into our pipeline.
arXiv Detail & Related papers (2020-06-20T03:59:46Z)
- Learned Threshold Pruning [15.394473766381518]
Our method learns per-layer thresholds via gradient descent, unlike conventional methods where they are set as input.
It takes 30 epochs for the method to prune ResNet50 on ImageNet by a factor of 9.1.
We also show that the method effectively prunes modern compact architectures such as EfficientNet, MobileNetV2 and MixNet; a soft-threshold sketch of the learned-threshold idea appears after this list.
arXiv Detail & Related papers (2020-02-28T21:32:39Z)
- A "Network Pruning Network" Approach to Deep Model Compression [62.68120664998911]
We present a filter pruning approach for deep model compression using a multitask network.
Our approach is based on learning a pruner network to prune a pre-trained target network.
The compressed model produced by our approach is generic and does not need any special hardware/software support.
arXiv Detail & Related papers (2020-01-15T20:38:23Z)
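As referenced in the Dynamic Probabilistic Pruning entry above, several of these works differ mainly in pruning granularity (weights, kernels, filters/feature maps). The sketch below is not DPP itself, only a hedged NumPy illustration of how magnitude-based masks change with granularity for one convolutional weight tensor; the tensor shape, the L1-norm scoring, and the 50% ratio are assumptions.

```python
import numpy as np

# Hypothetical conv weight of shape (out_channels, in_channels, kH, kW).
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8, 3, 3))
prune_ratio = 0.5

def mask_from_scores(scores, ratio):
    """Keep the entries of `scores` above the `ratio` quantile."""
    return scores > np.quantile(scores, ratio)

# Weight-level (fine-grained): each scalar weight is kept or dropped on its own.
weight_mask = mask_from_scores(np.abs(W), prune_ratio)              # (16, 8, 3, 3)

# Kernel-level: each (kH, kW) kernel is scored by its L1 norm.
kernel_scores = np.abs(W).sum(axis=(2, 3))                          # (16, 8)
kernel_mask = mask_from_scores(kernel_scores, prune_ratio)[..., None, None]

# Filter-level (structured): each output filter is scored as a whole, so
# pruning removes entire feature maps and stays hardware-friendly.
filter_scores = np.abs(W).sum(axis=(1, 2, 3))                       # (16,)
filter_mask = mask_from_scores(filter_scores, prune_ratio)[:, None, None, None]

for name, m in [("weight", weight_mask), ("kernel", kernel_mask), ("filter", filter_mask)]:
    kept = np.broadcast_to(m, W.shape).mean()
    print(f"{name:>6}-level mask keeps {kept:.1%} of weights")
```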
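The Learned Threshold Pruning and LAPP entries both describe treating each layer's pruning threshold as a trainable parameter. Their exact formulations are not given above, so this PyTorch sketch only shows one common way to make a magnitude threshold differentiable (a sigmoid gate around a learnable threshold); the gate form, temperature, penalty weight, and layer type are assumptions, not the published methods.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftThresholdLinear(nn.Module):
    """Linear layer whose weights are gated by a learnable magnitude threshold.

    A sigmoid of (|w| - threshold) acts as a soft keep/drop gate, so the
    threshold receives gradients and trains jointly with the weights.
    """

    def __init__(self, in_features, out_features, temperature=100.0):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.threshold = nn.Parameter(torch.tensor(1e-2))  # one threshold per layer
        self.temperature = temperature

    def soft_mask(self):
        w = self.linear.weight
        return torch.sigmoid(self.temperature * (w.abs() - self.threshold))

    def forward(self, x):
        masked_weight = self.linear.weight * self.soft_mask()
        return F.linear(x, masked_weight, self.linear.bias)

# Usage: an L1-style penalty on the soft mask pushes the threshold up (more
# sparsity), while the task loss pushes back.
layer = SoftThresholdLinear(64, 32)
x = torch.randn(8, 64)
out = layer(x)
loss = out.pow(2).mean() + 1e-3 * layer.soft_mask().mean()
loss.backward()
print(f"soft keep-rate: {layer.soft_mask().mean().item():.2f}")
```

At inference, the soft gate would be rounded (or the temperature annealed) to obtain a hard pruning mask.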
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.