Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
- URL: http://arxiv.org/abs/2111.09635v1
- Date: Thu, 18 Nov 2021 11:29:35 GMT
- Title: Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
- Authors: Thibault Castells and Seul-Ki Yeom
- Abstract summary: Pruning filters is a common solution, but most existing pruning methods do not preserve the model accuracy efficiently.
We propose an automatic pruning method that learns which neurons to preserve in order to maintain the model accuracy while reducing the FLOPs to a predefined target.
We achieve a 52.00% FLOPs reduction on ResNet-50, with a Top-1 accuracy of 47.51% after pruning and a state-of-the-art (SOTA) accuracy of 76.63% after finetuning.
- Score: 2.538209532048867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of neural networks has improved significantly in the last few years, at the cost of an increasing number of floating point operations (FLOPs). However, more FLOPs can be an issue when computational resources are limited. Pruning filters is a common way to address this problem, but most existing pruning methods do not preserve the model accuracy efficiently and therefore require a large number of finetuning epochs.
In this paper, we propose an automatic pruning method that learns which neurons
to preserve in order to maintain the model accuracy while reducing the FLOPs to
a predefined target. To accomplish this task, we introduce a trainable bottleneck that requires only a single epoch, using 25.6% (CIFAR-10) or 7.49% (ILSVRC2012) of the dataset, to learn which filters to prune. Experiments on
various architectures and datasets show that the proposed method can not only
preserve the accuracy after pruning but also outperform existing methods after
finetuning. We achieve a 52.00% FLOPs reduction on ResNet-50, with a Top-1
accuracy of 47.51% after pruning and a state-of-the-art (SOTA) accuracy of
76.63% after finetuning on ILSVRC2012. Code is available at (link anonymized
for review).
Related papers
- RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration [0.0]
We propose RL-Pruner, which uses reinforcement learning to learn the optimal pruning distribution.
RL-Pruner can automatically extract dependencies between filters in the input model and perform pruning, without requiring model-specific pruning implementations.
arXiv Detail & Related papers (2024-11-10T13:35:10Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
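As a rough, data-free illustration of the LSH idea in the HASTE summary above, the sketch below hashes the channels of a feature map with random hyperplanes and averages channels that share a hash code. The function name, the per-channel hashing granularity, and the merge-by-mean rule are our simplifications, not the HASTE module itself.

```python
import torch

def lsh_merge_channels(x: torch.Tensor, num_hyperplanes: int = 8) -> torch.Tensor:
    """Group similar channels of a feature map via locality-sensitive hashing
    and average each group, shrinking the channel dimension on the fly.

    x: (batch, channels, H, W). Random-hyperplane sign bits over the flattened
    per-channel responses serve as hash codes; channels sharing a code are
    treated as redundant and merged.
    """
    b, c, h, w = x.shape
    flat = x.permute(1, 0, 2, 3).reshape(c, -1)           # one row per channel
    planes = torch.randn(flat.shape[1], num_hyperplanes, device=x.device)
    codes = (flat @ planes > 0).long()                    # (c, num_hyperplanes) sign bits
    keys = (codes * (2 ** torch.arange(num_hyperplanes, device=x.device))).sum(dim=1)

    merged = []
    for key in keys.unique():
        group = (keys == key).nonzero(as_tuple=True)[0]
        merged.append(x[:, group].mean(dim=1))            # average redundant channels
    return torch.stack(merged, dim=1)                     # (batch, merged_channels, H, W)
```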
- Gradient-Free Structured Pruning with Unlabeled Data [57.999191898036706]
We propose a gradient-free structured pruning framework that uses only unlabeled data.
Up to 40% of the original FLOP count can be reduced with less than a 4% accuracy loss across all tasks considered.
arXiv Detail & Related papers (2023-03-07T19:12:31Z)
- Pruning On-the-Fly: A Recoverable Pruning Method without Fine-tuning [12.90416661059601]
We propose a retraining-free pruning method based on hyperspherical learning and loss penalty terms.
The proposed loss penalty term pushes some of the model weights far from zero, while the remaining weights are pushed toward zero.
Our proposed method can instantly recover the accuracy of a pruned model by replacing the pruned values with their mean value.
arXiv Detail & Related papers (2022-12-24T04:33:03Z)
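The "replace pruned values with their mean" recovery step from the Pruning On-the-Fly summary above can be illustrated on a single weight tensor. The magnitude-based selection below is an assumption to keep the example short; the paper itself separates the two weight groups through hyperspherical learning and a loss penalty.

```python
import torch

def prune_with_mean_recovery(weight: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Replace small-magnitude weights with the mean of the pruned group
    instead of zeroing them, so the layer's output statistics shift less
    and accuracy is recovered without finetuning.

    Magnitude-based selection is used here only for brevity; the paper's
    criterion comes from hyperspherical learning with a loss penalty.
    """
    k = max(1, int(weight.numel() * keep_ratio))           # number of weights to keep
    threshold = weight.abs().flatten().topk(k).values.min()
    pruned = weight.abs() < threshold                       # mask of weights to drop
    recovered = weight.clone()
    if pruned.any():
        recovered[pruned] = weight[pruned].mean()           # mean-value recovery
    return recovered
```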
- Non-Parametric Adaptive Network Pruning [125.4414216272874]
We introduce non-parametric modeling to simplify the algorithm design.
Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars.
EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z)
- ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have become deeper and wider in recent years.
Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable.
We propose a novel automatic channel pruning method (ACP).
ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
- ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting [105.97936163854693]
We propose ResRep, which slims down a CNN by reducing the width (number of output channels) of convolutional layers.
Inspired by neurobiology research on the independence of remembering and forgetting, we propose to re-parameterize a CNN into remembering parts and forgetting parts.
We equivalently merge the remembering and forgetting parts into the original architecture with narrower layers.
arXiv Detail & Related papers (2020-07-07T07:56:45Z)
- EagleEye: Fast Sub-net Evaluation for Efficient Neural Network Pruning [82.54669314604097]
EagleEye is a simple yet efficient evaluation component based on adaptive batch normalization.
It unveils a strong correlation between different pruned structures and their final settled accuracy.
This module can also be plugged into existing pruning algorithms to improve them.
arXiv Detail & Related papers (2020-07-06T01:32:31Z)
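Adaptive batch normalization, the core of the EagleEye summary above, can be approximated in a few lines: reset each BN layer's running statistics in the pruned sub-net and re-estimate them on a handful of batches before measuring validation accuracy. The batch count and the reset-then-forward procedure below are assumptions about a reasonable implementation, not EagleEye's released code.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adaptive_bn_recalibration(subnet: nn.Module, loader, num_batches: int = 50,
                              device: str = "cpu") -> nn.Module:
    """Re-estimate BatchNorm running statistics of a pruned sub-network.

    The sub-net runs in train() mode so BN layers update their running
    mean/var, but no gradients are computed and no weights change.
    """
    # Reset the stale statistics inherited from the unpruned parent model.
    for m in subnet.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.reset_running_stats()

    subnet.train()
    for i, (images, _) in enumerate(loader):
        if i >= num_batches:
            break
        subnet(images.to(device))

    subnet.eval()
    return subnet
```

Candidate sub-nets can then be ranked by their post-recalibration accuracy, which, per the summary above, correlates strongly with their final accuracy after finetuning.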
- Filter Sketch for Network Pruning [184.41079868885265]
We propose a novel network pruning approach that preserves the information of pre-trained network weights (filters).
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights.
Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of FLOPs and prunes 59.9% of network parameters with negligible accuracy cost.
arXiv Detail & Related papers (2020-01-23T13:57:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.