Pruning Very Deep Neural Network Channels for Efficient Inference
- URL: http://arxiv.org/abs/2211.08339v1
- Date: Mon, 14 Nov 2022 06:48:33 GMT
- Title: Pruning Very Deep Neural Network Channels for Efficient Inference
- Authors: Yihui He
- Abstract summary: Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer.
The pruned VGG-16 achieves state-of-the-art results, with a 5x speed-up and only a 0.3% increase in error.
The method also accelerates modern networks such as ResNet and Xception, with only 1.4% and 1.0% accuracy loss, respectively, under a 2x speed-up.
- Score: 6.497816402045099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a new channel pruning method to accelerate very
deep convolutional neural networks. Given a trained CNN model, we propose an
iterative two-step algorithm to effectively prune each layer, by a LASSO
regression based channel selection and least square reconstruction. We further
generalize this algorithm to multi-layer and multi-branch cases. Our method
reduces the accumulated error and enhances the compatibility with various
architectures. Our pruned VGG-16 achieves the state-of-the-art results by 5x
speed-up along with only 0.3% increase of error. More importantly, our method
is able to accelerate modern networks like ResNet, Xception and suffers only
1.4%, 1.0% accuracy loss under 2x speed-up respectively, which is significant.
Our code has been made publicly available.
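The two-step procedure named in the abstract (LASSO-based channel selection, then least-squares reconstruction) can be sketched for a single layer as follows. This is a minimal illustration under stated assumptions, not the author's released code: it treats the layer as a 1x1 convolution so that each input channel is one column of `X`, it assumes the objective 1/(2N) * ||Y - sum_i beta_i X_i W_i^T||_F^2 + lam * ||beta||_1, and it solves the LASSO step with a plain proximal-gradient (ISTA) loop. The names `lasso_select`, `reconstruct`, `prune_layer`, and the parameters `lam`, `growth`, and `n_keep` are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_select(X, W, Y, lam, n_iter=500):
    """Step 1 (assumed form): with W fixed, fit per-channel scales beta by ISTA.

    X: (N, c) sampled inputs, W: (n, c) layer weights, Y: (N, n) original outputs.
    """
    N, c = X.shape
    # Column i of Z is the flattened contribution of channel i to the output.
    Z = np.stack([np.outer(X[:, i], W[:, i]).ravel() for i in range(c)], axis=1)
    y = Y.ravel()
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    step = N / np.linalg.norm(Z, 2) ** 2
    beta = np.ones(c)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ beta - y) / N
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


def reconstruct(X, Y, keep):
    """Step 2: with the kept channels fixed, refit the weights by least squares."""
    W_new_T, *_ = np.linalg.lstsq(X[:, keep], Y, rcond=None)
    return W_new_T.T  # shape (n, number of kept channels)


def prune_layer(X, W, Y, n_keep, lam=1e-3, growth=1.5, max_rounds=60):
    """Increase lam until at most n_keep channels survive, then reconstruct."""
    keep = np.arange(X.shape[1])
    for _ in range(max_rounds):
        beta = lasso_select(X, W, Y, lam)
        keep = np.flatnonzero(np.abs(beta) > 1e-8)
        if len(keep) <= n_keep:
            break
        lam *= growth
    if len(keep) == 0:
        raise ValueError("all channels were pruned; use a smaller growth factor")
    return keep, reconstruct(X, Y, keep)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))   # 200 samples, 8 input channels
    W = rng.normal(size=(4, 8))     # 4 output channels
    W[:, 4:] *= 0.05                # make the last four channels nearly redundant
    Y = X @ W.T                     # outputs of the unpruned layer
    keep, W_new = prune_layer(X, W, Y, n_keep=4)
    rel_err = np.linalg.norm(Y - X[:, keep] @ W_new.T) / np.linalg.norm(Y)
    print("kept channels:", keep, "relative error:", rel_err)
```

Raising `lam` gradually, rather than fixing it in advance, is one simple way to reach a target channel count; the schedule and the tolerance used here are illustrative choices only.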
Related papers
- CHEX: CHannel EXploration for CNN Model Compression [47.3520447163165]
We propose a novel Channel Exploration methodology, dubbed as CHEX, to rectify these problems.
CHEX repeatedly prunes and regrows the channels throughout the training process, which reduces the risk of pruning important channels prematurely.
Results demonstrate that CHEX can effectively reduce the FLOPs of diverse CNN architectures on a variety of computer vision tasks.
arXiv Detail & Related papers (2022-03-29T17:52:41Z)
- AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance [9.3421559369389]
We propose a pruning framework that adaptively determines the number of channels in each layer as well as the weight inheritance criteria for the sub-network.
AdaPruner makes it possible to obtain a pruned network quickly, accurately and efficiently.
On ImageNet, we reduce 32.8% FLOPs of MobileNetV2 with only 0.62% decrease for top-1 accuracy, which exceeds all previous state-of-the-art channel pruning methods.
arXiv Detail & Related papers (2021-09-14T01:52:05Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening [63.081808698068365]
This work presents a probabilistic channel pruning method to accelerate Convolutional Neural Networks (CNNs).
Previous pruning methods often zero out unimportant channels in training in a deterministic manner, which reduces CNN's learning capacity and results in suboptimal performance.
We develop a probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which can stochastically discard unimportant channels by modeling the probability of a channel being activated.
arXiv Detail & Related papers (2021-05-13T17:00:05Z)
- ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have become deeper and wider in recent years.
Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable.
We propose a novel automatic channel pruning method (ACP).
ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
- AutoPruning for Deep Neural Network with Dynamic Channel Masking [28.018077874687343]
We propose a learning-based auto-pruning algorithm for deep neural networks.
A two-objective problem that seeks both the weights and the best channels for each layer is first formulated.
An alternative optimization approach is then proposed to derive the optimal channel numbers and weights simultaneously.
arXiv Detail & Related papers (2020-10-22T20:12:46Z)
- UCP: Uniform Channel Pruning for Deep Convolutional Neural Networks Compression and Acceleration [24.42067007684169]
We propose a novel uniform channel pruning (UCP) method to prune deep CNN.
The unimportant channels, including convolutional kernels related to them, are pruned directly.
We verify our method on CIFAR-10, CIFAR-100 and ILSVRC-2012 for image classification.
arXiv Detail & Related papers (2020-10-03T01:51:06Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace conventional ReLU with Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to that of the corresponding FPN networks, but have only 1/4 of the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Discrimination-aware Network Pruning for Deep Model Compression [79.44318503847136]
Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones.
We propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
Experiments on both image classification and face recognition demonstrate the effectiveness of our methods.
arXiv Detail & Related papers (2020-01-04T07:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.