Towards Optimal Filter Pruning with Balanced Performance and Pruning Speed
- URL: http://arxiv.org/abs/2010.06821v1
- Date: Wed, 14 Oct 2020 06:17:09 GMT
- Title: Towards Optimal Filter Pruning with Balanced Performance and Pruning Speed
- Authors: Dong Li, Sitong Chen, Xudong Liu, Yunda Sun and Li Zhang
- Abstract summary: We propose a filter pruning method that balances performance and pruning speed.
Our method prunes each layer at an approximately optimal layer-wise pruning rate under a preset loss variation.
The proposed pruning method is widely applicable to common architectures and does not involve any additional training except the final fine-tuning.
- Score: 17.115185960327665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Filter pruning has drawn increasing attention since resource-constrained platforms require more compact models for deployment. However, current pruning methods suffer either from the inferior performance of one-shot methods or from the expensive time cost of iterative training methods. In this paper, we propose a filter pruning method that balances performance and pruning speed. Based on a filter importance criterion, our method prunes each layer at an approximately optimal layer-wise pruning rate under a preset loss variation. The network is pruned layer by layer without the time-consuming prune-retrain iteration. If a pre-defined pruning rate for the entire network is given, we also introduce a method that finds the corresponding loss-variation threshold with fast convergence. Moreover, we propose layer group pruning and a channel selection mechanism for channel alignment in networks with shortcut connections. The proposed pruning method is widely applicable to common architectures and does not involve any additional training except the final fine-tuning. Comprehensive experiments show that our method outperforms many state-of-the-art approaches.
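Below is a minimal PyTorch sketch of the two ideas described in the abstract, not the authors' exact algorithm: (1) choosing a per-layer pruning rate by masking low-importance filters until a preset loss-variation budget is reached, and (2) searching for the budget that yields a given overall pruning rate. Filter importance is approximated here by the L1 norm of each filter, the monotonicity assumption and binary searches are illustrative simplifications, and `eval_loss` is a hypothetical user-supplied callable that returns the validation loss of the current (masked) model.

```python
import torch
import torch.nn as nn


def prunable_filters(conv: nn.Conv2d, eval_loss, base_loss, budget):
    """Largest number of filters in `conv` that can be zeroed while the
    validation-loss increase stays within `budget` (assumes the increase
    grows monotonically with the number of masked filters)."""
    w = conv.weight.detach()
    order = torch.argsort(w.abs().sum(dim=(1, 2, 3)))    # least important first (L1 norm)
    original = w.clone()
    lo, hi, best = 0, w.shape[0] - 1, 0
    while lo <= hi:                                       # binary search over k
        k = (lo + hi) // 2
        with torch.no_grad():
            conv.weight[order[:k]] = 0.0                  # mask the k least important filters
            increase = float(eval_loss()) - base_loss
            conv.weight.copy_(original)                   # restore weights for the next trial
        if increase <= budget:
            best, lo = k, k + 1
        else:
            hi = k - 1
    return best


def budget_for_target_rate(convs, eval_loss, target_rate, tol=0.01, max_budget=1.0):
    """Binary-search a loss-variation budget whose layer-wise pruning removes
    roughly `target_rate` of all filters across the conv layers in `convs`."""
    base_loss = float(eval_loss())
    total = sum(c.weight.shape[0] for c in convs)
    lo, hi = 0.0, max_budget
    budget = max_budget
    for _ in range(20):                                   # converges geometrically
        budget = 0.5 * (lo + hi)
        pruned = sum(prunable_filters(c, eval_loss, base_loss, budget) for c in convs)
        rate = pruned / total
        if abs(rate - target_rate) <= tol:
            break
        if rate < target_rate:
            lo = budget                                   # budget too strict, allow more loss
        else:
            hi = budget                                   # budget too loose, allow less loss
    return budget
```

Note that this sketch only masks filters rather than structurally removing them, and it omits the paper's layer group pruning and channel selection mechanism for networks with shortcut connections.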
Related papers
- Dynamic Structure Pruning for Compressing CNNs [13.73717878732162]
We introduce a novel structure pruning method, termed dynamic structure pruning, to identify optimal pruning granularities for intra-channel pruning.
The experimental results show that dynamic structure pruning achieves state-of-the-art pruning performance and better realistic acceleration on a GPU compared with channel pruning.
arXiv Detail & Related papers (2023-03-17T02:38:53Z)
- Revisiting Random Channel Pruning for Neural Network Compression [159.99002793644163]
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of neural networks.
In this paper, we try to determine the channel configuration of the pruned models by random search.
We show that this simple strategy works quite well compared with other channel pruning methods.
arXiv Detail & Related papers (2022-05-11T17:59:04Z)
- End-to-End Sensitivity-Based Filter Pruning [49.61707925611295]
We present a sensitivity-based filter pruning algorithm (SbF-Pruner) to learn the importance scores of filters of each layer end-to-end.
Our method learns the scores from the filter weights, enabling it to account for the correlations between the filters of each layer.
arXiv Detail & Related papers (2022-04-15T10:21:05Z)
- Data-Efficient Structured Pruning via Submodular Optimization [32.574190896543705]
We propose a data-efficient structured pruning method based on submodular optimization.
We show that this selection problem is a weakly submodular problem, thus it can be provably approximated using an efficient greedy algorithm.
Our method is one of the few in the literature that uses only a limited amount of training data and no labels.
arXiv Detail & Related papers (2022-03-09T18:40:29Z)
- Pruning with Compensation: Efficient Channel Pruning for Deep Convolutional Neural Networks [0.9712140341805068]
A highly efficient pruning method is proposed to significantly reduce the cost of pruning DCNNs.
Our method shows competitive pruning performance among the state-of-the-art retraining-based pruning methods.
arXiv Detail & Related papers (2021-08-31T10:17:36Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- To Filter Prune, or to Layer Prune, That Is The Question [13.450136532402226]
We show the limitations of filter pruning methods in terms of latency reduction.
We present a set of layer pruning methods based on different criteria that achieve higher latency reduction than filter pruning methods at similar accuracy.
LayerPrune also outperforms handcrafted architectures such as ShuffleNet, MobileNet, MNASNet and ResNet18 by 7.3%, 4.6%, 2.8% and 0.5%, respectively, at a similar latency budget on the ImageNet dataset.
arXiv Detail & Related papers (2020-07-11T02:51:40Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
- Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression [145.04742985050808]
We analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense.
By changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly.
Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
arXiv Detail & Related papers (2020-03-19T17:57:26Z)
- Lookahead: A Far-Sighted Alternative of Magnitude-based Pruning [83.99191569112682]
Magnitude-based pruning is one of the simplest methods for pruning neural networks.
We develop a simple pruning method, coined lookahead pruning, by extending the single layer optimization to a multi-layer optimization.
Our experimental results demonstrate that the proposed method consistently outperforms magnitude-based pruning on various networks.
arXiv Detail & Related papers (2020-02-12T05:38:42Z)
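As background for the lookahead entry above, here is a small illustrative sketch of plain magnitude-based (L1-norm) filter ranking, the single-layer baseline that lookahead pruning extends to a multi-layer optimization; the lookahead score itself is not reproduced here, and the function name is hypothetical.

```python
import torch
import torch.nn as nn

def magnitude_ranking(conv: nn.Conv2d) -> torch.Tensor:
    """Rank the output filters of a conv layer by the L1 norm of their weights,
    from least to most important; the front of the ranking is pruned first."""
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per output filter
    return torch.argsort(scores)

# Example: at a 25% pruning rate, the 4 lowest-scoring of 16 filters are removed.
conv = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3)
print(magnitude_ranking(conv)[:4])
```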