Carrying out CNN Channel Pruning in a White Box
- URL: http://arxiv.org/abs/2104.11883v1
- Date: Sat, 24 Apr 2021 04:59:03 GMT
- Title: Carrying out CNN Channel Pruning in a White Box
- Authors: Yuxin Zhang, Mingbao Lin, Chia-Wen Lin, Jie Chen, Feiyue Huang,
Yongjian Wu, Yonghong Tian, Rongrong Ji
- Abstract summary: We conduct channel pruning in a white box.
To model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel.
It is the first time that CNN interpretability theory is considered to guide channel pruning.
- Score: 121.97098626458886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Channel Pruning has been long adopted for compressing CNNs, which
significantly reduces the overall computation. Prior works implement channel
pruning in an unexplainable manner, which tends to reduce the final
classification errors while failing to consider the internal influence of each
channel. In this paper, we conduct channel pruning in a white box. Through deep
visualization of feature maps activated by different channels, we observe that
different channels have a varying contribution to different categories in image
classification. Inspired by this, we choose to preserve channels contributing
to most categories. Specifically, to model the contribution of each channel to
differentiating categories, we develop a class-wise mask for each channel,
implemented in a dynamic training manner w.r.t. the input image's category. On
the basis of the learned class-wise mask, we perform a global voting mechanism
to remove channels with less category discrimination. Lastly, a fine-tuning
process is conducted to recover the performance of the pruned model. To the
best of our knowledge, this is the first time that CNN interpretability theory
has been used to guide channel pruning. Extensive experiments demonstrate the
superiority of our White-Box over many state-of-the-art methods. For instance,
on CIFAR-10 it reduces FLOPs by 65.23% for ResNet-110 while even improving
accuracy by 0.62%. On ILSVRC-2012, White-Box achieves a 45.6% FLOPs reduction
with only a 0.83% loss in top-1 accuracy for ResNet-50. Code, training logs and
pruned models are available anonymously at https://github.com/zyxxmu/White-Box.
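To make the masking-and-voting pipeline concrete, here is a minimal PyTorch sketch of the idea the abstract describes. The names (`ClassWiseMask`, `vote_channels`), the sigmoid parameterization, and the voting threshold are illustrative assumptions rather than the paper's exact formulation; the repository above holds the real implementation.

```python
import torch
import torch.nn as nn

class ClassWiseMask(nn.Module):
    """Scales each channel by a learnable, per-class mask value.

    One entry per (class, channel); during training the mask row is
    selected by each input's ground-truth label, so every channel
    learns how strongly it contributes to each category.
    """

    def __init__(self, num_classes: int, num_channels: int):
        super().__init__()
        # Positive init so all channels start "on" (sigmoid(1) ~ 0.73).
        self.logits = nn.Parameter(torch.ones(num_classes, num_channels))

    def forward(self, x: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W); labels: (N,) -- dynamic w.r.t. the input's category.
        m = torch.sigmoid(self.logits[labels])        # (N, C), values in (0, 1)
        return x * m.unsqueeze(-1).unsqueeze(-1)

def vote_channels(logits: torch.Tensor, keep_ratio: float,
                  on_thresh: float = 0.5) -> torch.Tensor:
    """Global voting: keep the channels that matter to the most categories.

    Each class casts a vote for every channel whose mask value exceeds
    `on_thresh`; the lowest-voted channels are the least
    class-discriminative and are pruned.
    """
    votes = (torch.sigmoid(logits) > on_thresh).sum(dim=0)  # (C,) votes per channel
    num_keep = max(1, int(keep_ratio * logits.shape[1]))
    keep = votes.argsort(descending=True)[:num_keep]
    return keep.sort().values                               # surviving channel indices
```

In the paper, one such mask per pruned layer is trained jointly with the network before the global vote, and fine-tuning follows the pruning step; this sketch omits that training loop.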
Related papers
- Pruning Very Deep Neural Network Channels for Efficient Inference [6.497816402045099]
Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer.
On VGG-16, it achieves state-of-the-art results: a 5x speed-up with only a 0.3% increase in error.
The method also accelerates modern networks such as ResNet and Xception, suffering only 1.4% and 1.0% accuracy loss respectively under a 2x speed-up.
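For background, this method's two steps are (i) LASSO regression to select which input channels of a layer to keep and (ii) least-squares reconstruction of the layer's output from the survivors. Below is a simplified single-response sketch under the assumption that each column of `X` holds one channel's contribution to the output; the paper works on sampled feature-map patches and binary-searches the LASSO penalty until exactly the desired number of channels survives.

```python
import numpy as np
from sklearn.linear_model import Lasso

def prune_layer_channels(X: np.ndarray, num_keep: int, alpha: float = 1e-4):
    """Two-step channel pruning for one layer (simplified).

    X: (num_samples, C); column c is channel c's contribution to a
    sampled output response, so the original response is X.sum(axis=1).
    """
    y = X.sum(axis=1)
    # Step 1: sparse channel selection. In the paper, alpha is increased
    # until only `num_keep` coefficients remain non-zero.
    beta = Lasso(alpha=alpha, fit_intercept=False).fit(X, y).coef_
    keep = np.argsort(-np.abs(beta))[:num_keep]
    # Step 2: refit the surviving channels by least squares so the pruned
    # layer reconstructs the original output as closely as possible.
    w, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
    return keep, w
```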
arXiv Detail & Related papers (2022-11-14T06:48:33Z)
- CHEX: CHannel EXploration for CNN Model Compression [47.3520447163165]
We propose a novel channel exploration methodology, dubbed CHEX, to rectify these problems.
CHEX repeatedly prunes and regrows channels throughout the training process, which reduces the risk of prematurely pruning important channels.
Results demonstrate that CHEX can effectively reduce the FLOPs of diverse CNN architectures on a variety of computer vision tasks.
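The loop below is a toy sketch of the prune-and-regrow schedule only: it scores channels by BN scale magnitude and regrows at random, whereas CHEX uses its own, more principled criteria (a column subset selection formulation) for both stages. All names here are stand-ins.

```python
import torch

def prune_regrow_step(bn_scale: torch.Tensor, num_active: int,
                      regrow: int) -> torch.Tensor:
    """One prune-and-regrow update of a layer's channel mask.

    bn_scale: per-channel BN scale, a stand-in importance score.
    Keeps the (num_active - regrow) most important channels, then
    reactivates `regrow` channels from the pruned pool, so no pruning
    decision made during training is irreversible.
    """
    keep = bn_scale.abs().topk(num_active - regrow).indices
    active = torch.zeros_like(bn_scale, dtype=torch.bool)
    active[keep] = True
    pool = (~active).nonzero().flatten()              # currently pruned channels
    grow = pool[torch.randperm(pool.numel())[:regrow]]
    active[grow] = True                               # random regrowth (toy choice)
    return active
```

Calling this every few hundred training iterations yields the repeated prune/regrow dynamic the summary describes.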
arXiv Detail & Related papers (2022-03-29T17:52:41Z)
- Federated Unlearning via Class-Discriminative Pruning [16.657364988432317]
We propose a method for scrubbing the model clean of information about particular categories.
The method does not require retraining from scratch, nor global access to the data used for training.
Channel pruning is followed by a fine-tuning process to recover the performance of the pruned model.
arXiv Detail & Related papers (2021-10-22T14:01:42Z)
- AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance [9.3421559369389]
We propose a pruning framework that adaptively determines the number of channels in each layer as well as the weight-inheritance criteria for the sub-network.
AdaPruner obtains the pruned network quickly, accurately, and efficiently.
On ImageNet, we reduce the FLOPs of MobileNetV2 by 32.8% with only a 0.62% decrease in top-1 accuracy, exceeding all previous state-of-the-art channel pruning methods.
arXiv Detail & Related papers (2021-09-14T01:52:05Z)
- BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch Whitening [63.081808698068365]
This work presents a probabilistic channel pruning method to accelerate Convolutional Neural Networks (CNNs).
Previous pruning methods often zero out unimportant channels in training in a deterministic manner, which reduces the CNN's learning capacity and results in suboptimal performance.
We develop a probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which can stochastically discard unimportant channels by modeling the probability of a channel being activated.
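As a standalone illustration of "the probability of a channel being activated": for a BN channel with scale gamma and shift beta followed by ReLU, and assuming the normalized pre-activation is N(0, 1), that probability is Phi(beta / |gamma|). BWCP derives its gates from batch whitening rather than this plain-BN identity, so treat the sketch as flavor, not the method.

```python
import torch

def bn_relu_activation_prob(gamma: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
    """P(gamma * z + beta > 0) for z ~ N(0, 1), i.e. the chance that a
    BN+ReLU channel fires. Holds for either sign of gamma: Phi(beta / |gamma|).

    Channels whose probability stays near zero are natural pruning
    candidates; a probabilistic method can gate them softly instead of
    zeroing them deterministically during training.
    """
    std_normal = torch.distributions.Normal(0.0, 1.0)
    return std_normal.cdf(beta / gamma.abs().clamp_min(1e-6))
```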
arXiv Detail & Related papers (2021-05-13T17:00:05Z)
- Channel-wise Knowledge Distillation for Dense Prediction [73.99057249472735]
We propose to align features channel-wise between the student and teacher networks.
We consistently achieve superior performance on three benchmarks with various network structures.
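A sketch of the channel-wise alignment: each channel's H x W activations are softmax-normalized into a spatial distribution, and the student minimizes a temperature-scaled KL divergence to the teacher's distribution, channel by channel. The temperature and averaging choices below are assumptions in the spirit of the paper.

```python
import torch
import torch.nn.functional as F

def channel_wise_kd_loss(f_s: torch.Tensor, f_t: torch.Tensor,
                         tau: float = 4.0) -> torch.Tensor:
    """Channel-wise distillation loss for dense prediction.

    f_s, f_t: student / teacher feature maps of shape (N, C, H, W) with
    matching C (a 1x1 conv can align the student's channels if needed).
    """
    n, c, h, w = f_s.shape
    log_p_s = F.log_softmax(f_s.reshape(n, c, h * w) / tau, dim=-1)
    p_t = F.softmax(f_t.reshape(n, c, h * w) / tau, dim=-1)
    # KL(teacher || student) per channel; tau**2 restores gradient scale
    # as in standard temperature-scaled distillation.
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau * tau / c
```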
arXiv Detail & Related papers (2020-11-26T12:00:38Z)
- Channel Equilibrium Networks for Learning Deep Representation [63.76618960820138]
This work shows that the combination of normalization and rectified linear function leads to inhibited channels.
Unlike prior art that simply removed the inhibited channels, we propose to "wake them up" during training by designing a novel neural building block.
The Channel Equilibrium (CE) block enables channels at the same layer to contribute equally to the learned representation.
arXiv Detail & Related papers (2020-02-29T09:02:31Z)
- Discrimination-aware Network Pruning for Deep Model Compression [79.44318503847136]
Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones.
We propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power.
Experiments on both image classification and face recognition demonstrate the effectiveness of our methods.
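One way to read "channels that actually contribute to the discriminative power": attach an auxiliary classifier after the layer and score each input channel by the gradient its weight slice receives from that classifier's loss. DCP additionally mixes in a reconstruction term and selects channels greedily; both are omitted here, and every name below is illustrative.

```python
import torch
import torch.nn.functional as F

def discriminative_channel_scores(feat: torch.Tensor, conv_weight: torch.Tensor,
                                  aux_head: torch.nn.Module,
                                  labels: torch.Tensor) -> torch.Tensor:
    """Score a conv layer's input channels by discriminative power.

    `feat` must have been computed from `conv_weight` in the current
    autograd graph; `aux_head` is a small classifier on top of `feat`.
    Input channels whose weight slices receive large gradients from the
    local classification loss are kept; the rest are pruning candidates.
    """
    loss = F.cross_entropy(aux_head(feat), labels)
    (grad,) = torch.autograd.grad(loss, conv_weight)
    # conv_weight: (C_out, C_in, kH, kW) -> one score per input channel.
    return grad.pow(2).sum(dim=(0, 2, 3))
```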
arXiv Detail & Related papers (2020-01-04T07:07:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.