Pruning by Active Attention Manipulation
- URL: http://arxiv.org/abs/2210.11114v1
- Date: Thu, 20 Oct 2022 09:17:02 GMT
- Title: Pruning by Active Attention Manipulation
- Authors: Zahra Babaiee, Lucas Liebenwein, Ramin Hasani, Daniela Rus, Radu Grosu
- Abstract summary: Filter pruning of a CNN is typically achieved by applying discrete masks on the CNN's filter weights or activation maps, post-training.
Here, we present a new filter-importance-scoring concept named pruning by active attention manipulation (PAAM).
PAAM learns analog filter scores from the filter weights by optimizing a cost function regularized by an additive term in the scores.
- Score: 49.61707925611295
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Filter pruning of a CNN is typically achieved by applying discrete masks on
the CNN's filter weights or activation maps, post-training. Here, we present a
new filter-importance-scoring concept named pruning by active attention
manipulation (PAAM), which sparsifies the CNN's set of filters through a
particular attention mechanism during training. PAAM learns analog filter
scores from the filter weights by optimizing a cost function regularized by an
additive term in the scores. As the filters are not independent, we use
attention to dynamically learn their correlations. Moreover, by training the
pruning scores of all layers simultaneously, PAAM can account for layer
inter-dependencies, which is essential to finding a performant sparse
sub-network. PAAM can also train and generate a pruned network from scratch in
a straightforward, one-stage training process without requiring a pre-trained
network. Finally, PAAM does not need layer-specific hyperparameters and
pre-defined layer budgets, since it can implicitly determine the appropriate
number of filters in each layer. Our experimental results on different network
architectures suggest that PAAM outperforms state-of-the-art (SOTA)
structured-pruning methods. On the CIFAR-10 dataset, without requiring a
pre-trained baseline network, we obtain 1.02% and 1.19% accuracy gains and
52.3% and 54% parameter reductions on ResNet56 and ResNet110, respectively.
Similarly, on the ImageNet
dataset, PAAM achieves 1.06% accuracy gain while pruning 51.1% of the
parameters on ResNet50. For CIFAR-10, this improves on the SOTA by margins of
9.5% and 6.6%, respectively, and on ImageNet by a margin of 11%.
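The abstract describes PAAM's mechanism at a high level: analog scores are learned from the filter weights through an attention block, the scores softly mask the filters during training, and an additive regularization term on the scores drives the network toward sparsity. The PyTorch sketch below illustrates that idea under stated assumptions; the names (FilterScorer, masked_conv, total_loss, sparsity_weight) and design choices (multi-head self-attention over flattened filter weights, sigmoid scores) are illustrative and not the authors' implementation.

```python
# Minimal sketch of attention-based filter scoring with an additive score
# regularizer, loosely following the abstract; all names are illustrative.
import torch
import torch.nn as nn


class FilterScorer(nn.Module):
    """Scores each output filter of a conv layer with a value in (0, 1)."""

    def __init__(self, embed_dim: int = 64, num_heads: int = 4):
        super().__init__()
        self.embed = nn.LazyLinear(embed_dim)  # flattened filter weights -> token
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.head = nn.Linear(embed_dim, 1)

    def forward(self, conv_weight: torch.Tensor) -> torch.Tensor:
        # conv_weight: (out_channels, in_channels, kH, kW)
        tokens = self.embed(conv_weight.flatten(1)).unsqueeze(0)  # (1, F, D)
        # Self-attention lets each filter's score depend on the other filters,
        # which is how the abstract motivates modelling filter correlations.
        attended, _ = self.attn(tokens, tokens, tokens)
        return torch.sigmoid(self.head(attended)).reshape(-1)     # (F,)


def masked_conv(conv: nn.Conv2d, scorer: FilterScorer, x: torch.Tensor):
    """Scale each output channel by its learned score: a soft, differentiable mask."""
    scores = scorer(conv.weight)
    return conv(x) * scores.view(1, -1, 1, 1), scores


def total_loss(task_loss: torch.Tensor, all_scores, sparsity_weight: float = 1e-3):
    """Task loss plus an additive regularizer on the scores that pushes them toward zero."""
    return task_loss + sparsity_weight * sum(s.sum() for s in all_scores)
```

In a full training loop, the task loss and the score regularizer would be minimized jointly across every layer's scorer, and filters whose scores end up near zero would be removed to obtain the pruned network.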
Related papers
- Structured Network Pruning by Measuring Filter-wise Interactions [6.037167142826297]
We propose a structured network pruning approach, SNPFI (Structured Network Pruning by measuring Filter-wise Interaction).
During pruning, SNPFI automatically assigns the proper sparsity based on filter utilization strength.
We empirically demonstrate the effectiveness of SNPFI with several commonly used CNN models.
arXiv Detail & Related papers (2023-07-03T05:26:05Z)
- End-to-End Sensitivity-Based Filter Pruning [49.61707925611295]
We present a sensitivity-based filter pruning algorithm (SbF-Pruner) that learns the importance scores of each layer's filters end-to-end.
Our method learns the scores from the filter weights, enabling it to account for the correlations between the filters of each layer.
arXiv Detail & Related papers (2022-04-15T10:21:05Z)
- Pruning Networks with Cross-Layer Ranking & k-Reciprocal Nearest Filters [151.2423480789271]
A novel pruning method, termed CLR-RNF, is proposed for filter-level network pruning.
We conduct image classification on CIFAR-10 and ImageNet to demonstrate the superiority of our CLR-RNF over the state of the art.
arXiv Detail & Related papers (2022-02-15T04:53:24Z)
- Batch Normalization Tells You Which Filter is Important [49.903610684578716]
We propose a simple yet effective filter pruning method that evaluates the importance of each filter based on the BN parameters of a pre-trained CNN (a minimal sketch of this idea follows the list).
Experimental results on CIFAR-10 and ImageNet demonstrate that the proposed method achieves outstanding performance.
arXiv Detail & Related papers (2021-12-02T12:04:59Z)
- Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion [139.71852076031962]
We present a novel filter pruning method, dubbed dynamic-coded filter fusion (DCFF), which derives compact CNNs in a computation-economical and regularization-free manner for efficient image classification.
Our DCFF derives a compact VGGNet-16 with only 72.77M FLOPs and 1.06M parameters while reaching a top-1 accuracy of 93.47%.
arXiv Detail & Related papers (2021-07-14T18:07:38Z)
- Data Agnostic Filter Gating for Efficient Deep Networks [72.4615632234314]
Current filter pruning methods mainly leverage feature maps to generate importance scores for filters and prune those with smaller scores.
In this paper, we propose a data-agnostic filter pruning method that uses an auxiliary network, named the Dagger module, to induce pruning.
In addition, to prune filters under given FLOPs constraints, we leverage an explicit FLOPs-aware regularization to directly promote pruning toward the target FLOPs.
arXiv Detail & Related papers (2020-10-28T15:26:40Z)
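The "Batch Normalization Tells You Which Filter is Important" entry above scores filters from the BN parameters of a pre-trained network. A common instantiation of that idea, shown in the sketch below, ranks each layer's filters by the magnitude of the BN scale parameter gamma; treating |gamma| alone as the importance score is an assumption for illustration, not necessarily the exact criterion of the cited paper.

```python
# Illustrative BN-based filter importance: rank filters by |gamma| of the
# BatchNorm layer that follows the convolution (a common simplification).
import torch
import torch.nn as nn


def bn_filter_importance(bn: nn.BatchNorm2d) -> torch.Tensor:
    """Per-filter importance read off the BN scale parameters of a pre-trained model."""
    return bn.weight.detach().abs()


def filters_to_keep(bn: nn.BatchNorm2d, keep_ratio: float = 0.5) -> torch.Tensor:
    """Indices of the filters with the largest BN-based importance scores."""
    scores = bn_filter_importance(bn)
    k = max(1, int(keep_ratio * scores.numel()))
    return torch.topk(scores, k).indices


# Illustrative usage on a tiny conv + BN block:
block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
keep = filters_to_keep(block[1], keep_ratio=0.5)
print(keep)  # indices of the 8 filters that would survive pruning
```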