HRel: Filter Pruning based on High Relevance between Activation Maps and
Class Labels
- URL: http://arxiv.org/abs/2202.10716v1
- Date: Tue, 22 Feb 2022 08:12:22 GMT
- Title: HRel: Filter Pruning based on High Relevance between Activation Maps and
Class Labels
- Authors: CH Sarvani, Mrinmoy Ghorai, Shiv Ram Dubey, SH Shabbeer Basha
- Abstract summary: This paper proposes an Information Bottleneck theory based filter pruning method that uses a statistical measure called Mutual Information (MI).
Unlike the existing MI based pruning methods, the proposed method determines the significance of the filters purely based on their corresponding activation map's relationship with the class labels.
The proposed method achieves state-of-the-art pruning results for the LeNet-5, VGG-16, ResNet-56, ResNet-110 and ResNet-50 architectures.
- Score: 11.409989603679614
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper proposes an Information Bottleneck theory based filter pruning
method that uses a statistical measure called Mutual Information (MI). The MI
between filters and class labels, also called Relevance, is computed using the
filter's activation maps and the annotations. The filters having High Relevance
(HRel) are considered to be more important. Consequently, the least important
filters, which have lower Mutual Information with the class labels, are pruned.
Unlike the existing MI based pruning methods, the proposed method determines the
significance of the filters purely based on their corresponding activation
map's relationship with the class labels. Architectures such as LeNet-5,
VGG-16, ResNet-56, ResNet-110 and ResNet-50 are used to demonstrate the
efficacy of the proposed pruning method on the MNIST, CIFAR-10 and ImageNet
datasets. The proposed method achieves state-of-the-art pruning results for the
LeNet-5, VGG-16, ResNet-56, ResNet-110 and ResNet-50 architectures. In the
experiments, 97.98%, 84.85%, 76.89%, 76.95%, and 63.99% of the Floating Point
Operations (FLOPs) are pruned from LeNet-5, VGG-16, ResNet-56, ResNet-110, and
ResNet-50, respectively. The proposed HRel pruning method outperforms recent
state-of-the-art filter pruning methods. Even after drastically pruning the
filters of the convolutional layers of LeNet-5 (from 20 and 50 filters to 2 and
3, respectively), only a small accuracy drop of 0.52% is observed. Notably, for
VGG-16, 94.98% of the parameters are reduced with only a 0.36% drop in top-1
accuracy. ResNet-50 shows a 1.17% drop in top-5 accuracy after pruning 66.42%
of its FLOPs. In addition to pruning, the Information Plane dynamics of
Information Bottleneck theory are analyzed for various Convolutional Neural
Network architectures under the effect of pruning.
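As a rough illustration of the relevance computation described in the abstract, the Python sketch below (a hypothetical reconstruction, not the authors' released code; the helpers filter_relevance and low_relevance_filters are made-up names for this example) summarizes each filter's activation map by global average pooling and estimates its mutual information with the class labels using scikit-learn's mutual_info_classif, then selects the lowest-relevance filters as pruning candidates. The paper's exact MI estimator and pruning schedule may differ.

import numpy as np
import torch
from sklearn.feature_selection import mutual_info_classif


def filter_relevance(activations: torch.Tensor, labels: np.ndarray) -> np.ndarray:
    """Relevance of each filter in one convolutional layer.
    activations: (N, C, H, W) activation maps collected over N samples.
    labels: (N,) integer class labels.
    Returns an array of C relevance scores, one per filter."""
    # Summarize each filter's activation map by global average pooling -> (N, C)
    summary = activations.mean(dim=(2, 3)).cpu().numpy()
    # Mutual information between each filter's response and the class labels
    return mutual_info_classif(summary, labels, discrete_features=False)


def low_relevance_filters(relevance: np.ndarray, prune_ratio: float) -> np.ndarray:
    """Indices of the filters with the lowest relevance, i.e. the pruning candidates."""
    n_prune = int(round(prune_ratio * relevance.size))
    return np.argsort(relevance)[:n_prune]

In practice, the activations would be collected with a forward hook over a held-out batch, and the selected filters removed (or their channels rebuilt with fewer outputs) before fine-tuning the pruned network.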
Related papers
- Pruning by Active Attention Manipulation [49.61707925611295]
Filter pruning of a CNN is typically achieved by applying discrete masks on the CNN's filter weights or activation maps, post-training.
Here, we present a new filter-importance-scoring concept named pruning by active attention manipulation (PAAM).
PAAM learns analog filter scores from the filter weights by optimizing a cost function regularized by an additive term in the scores.
arXiv Detail & Related papers (2022-10-20T09:17:02Z) - End-to-End Sensitivity-Based Filter Pruning [49.61707925611295]
We present a sensitivity-based filter pruning algorithm (SbF-Pruner) to learn the importance scores of filters of each layer end-to-end.
Our method learns the scores from the filter weights, enabling it to account for the correlations between the filters of each layer.
arXiv Detail & Related papers (2022-04-15T10:21:05Z) - Pruning Networks with Cross-Layer Ranking & k-Reciprocal Nearest Filters [151.2423480789271]
A novel pruning method, termed CLR-RNF, is proposed for filter-level network pruning.
We conduct image classification on CIFAR-10 and ImageNet to demonstrate the superiority of our CLR-RNF over the state-of-the-arts.
arXiv Detail & Related papers (2022-02-15T04:53:24Z) - CHIP: CHannel Independence-based Pruning for Compact Neural Networks [13.868303041084431]
Filter pruning has been widely used for neural network compression because of its enabled practical acceleration.
We propose to perform efficient filter pruning using Channel Independence, a metric that measures the correlations among different feature maps.
arXiv Detail & Related papers (2021-10-26T19:35:56Z) - Training Compact CNNs for Image Classification using Dynamic-coded
Filter Fusion [139.71852076031962]
We present a novel filter pruning method, dubbed dynamic-coded filter fusion (DCFF).
We derive compact CNNs in a computation-economical and regularization-free manner for efficient image classification.
Our DCFF derives a compact VGGNet-16 with only 72.77M FLOPs and 1.06M parameters while reaching top-1 accuracy of 93.47%.
arXiv Detail & Related papers (2021-07-14T18:07:38Z) - Deep Model Compression based on the Training History [13.916984628784768]
We propose a novel History Based Filter Pruning (HBFP) method that utilizes network training history for filter pruning.
The proposed pruning method outperforms the state-of-the-art in terms of FLOPs reduction (floating-point operations) by 97.98%, 83.42%, 78.43%, and 74.95% for LeNet-5, VGG-16, ResNet-56, and ResNet-110 models, respectively.
arXiv Detail & Related papers (2021-01-30T06:04:21Z) - Data Agnostic Filter Gating for Efficient Deep Networks [72.4615632234314]
Current filter pruning methods mainly leverage feature maps to generate importance scores for filters and prune those with smaller scores.
In this paper, we propose a data-agnostic filter pruning method that uses an auxiliary network named Dagger module to induce pruning.
In addition, to help prune filters with certain FLOPs constraints, we leverage an explicit FLOPs-aware regularization to directly promote pruning filters toward target FLOPs.
arXiv Detail & Related papers (2020-10-28T15:26:40Z) - Filter Sketch for Network Pruning [184.41079868885265]
We propose a novel network pruning approach that preserves the information of pre-trained network weights (filters).
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights.
Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of FLOPs and prunes 59.9% of network parameters with negligible accuracy cost.
arXiv Detail & Related papers (2020-01-23T13:57:08Z)