Group Fisher Pruning for Practical Network Compression
- URL: http://arxiv.org/abs/2108.00708v1
- Date: Mon, 2 Aug 2021 08:21:44 GMT
- Title: Group Fisher Pruning for Practical Network Compression
- Authors: Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue,
Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang
- Abstract summary: We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
- Score: 58.25776612812883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network compression has been widely studied since it is able to reduce the
memory and computation cost during inference. However, previous methods seldom
deal with complicated structures like residual connections, group/depth-wise
convolution and feature pyramid network, where channels of multiple layers are
coupled and need to be pruned simultaneously. In this paper, we present a
general channel pruning approach that can be applied to various complicated
structures. Particularly, we propose a layer grouping algorithm to find coupled
channels automatically. Then we derive a unified metric based on Fisher
information to evaluate the importance of a single channel and coupled
channels. Moreover, we find that inference speedup on GPUs is more correlated
with the reduction of memory rather than FLOPs, and thus we employ the memory
reduction of each channel to normalize the importance. Our method can be used
to prune any structures including those with coupled channels. We conduct
extensive experiments on various backbones, including the classic ResNet and
ResNeXt, mobile-friendly MobileNetV2, and the NAS-based RegNet, both on image
classification and object detection, which is under-explored. Experimental
results validate that our method can effectively prune sophisticated networks,
boosting inference speed without sacrificing accuracy.
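The abstract's core idea can be sketched in a few lines: score each channel (or group of coupled channels) by an empirical Fisher information estimate from gradients w.r.t. channel gates, then normalize by the memory freed when pruning, since the paper observes GPU speedup correlates with memory reduction rather than FLOPs. The function below is a minimal, hypothetical illustration, not the authors' implementation; the gate-gradient inputs, group structure, and memory figures are assumptions for the example.

```python
import numpy as np

def group_fisher_importance(gate_grads, groups, memory_savings):
    """Hypothetical sketch of a Group-Fisher-style importance score.

    gate_grads: (num_samples, num_channels) gradients of the loss w.r.t.
        per-channel gate variables (gates fixed at 1 during evaluation).
    groups: list of channel-index lists; coupled channels share one gate.
    memory_savings: (num_groups,) memory freed by pruning each group.
    """
    scores = []
    for idx, chans in enumerate(groups):
        # Coupled channels share a gate, so their gradients add up first.
        g = gate_grads[:, chans].sum(axis=1)
        # Empirical Fisher information: mean squared gradient over samples.
        fisher = np.mean(g ** 2)
        # Normalize by memory reduction, since GPU speedup tracks memory
        # more closely than FLOPs (per the abstract).
        scores.append(fisher / memory_savings[idx])
    return np.asarray(scores)

# Toy usage with random gradients (assumed data, for illustration only).
rng = np.random.default_rng(0)
grads = rng.normal(size=(16, 6))
groups = [[0], [1], [2, 3], [4, 5]]   # last two groups are coupled pairs
mem = np.array([1.0, 1.0, 2.0, 2.0])  # e.g. MB freed per pruned group
scores = group_fisher_importance(grads, groups, mem)
least_important = int(np.argmin(scores))  # candidate group to prune first
```

In the paper, groups are discovered automatically by a layer grouping algorithm (e.g. channels tied through residual additions or group convolutions); here they are hard-coded for brevity.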
Related papers
- Joint Channel Estimation and Feedback with Masked Token Transformers in
Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- Group channel pruning and spatial attention distilling for object detection [2.8675002818821542]
We introduce a three-stage model compression method: dynamic sparse training, group channel pruning, and spatial attention distilling.
Our method reduces the model's parameters by 64.7% and its computation by 34.9%.
arXiv Detail & Related papers (2023-06-02T13:26:23Z)
- Revisiting Random Channel Pruning for Neural Network Compression [159.99002793644163]
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of neural networks.
In this paper, we try to determine the channel configuration of the pruned models by random search.
We show that this simple strategy works quite well compared with other channel pruning methods.
arXiv Detail & Related papers (2022-05-11T17:59:04Z)
- Low Complexity Channel estimation with Neural Network Solutions [1.0499453838486013]
We deploy a general residual convolutional neural network to achieve channel estimation in a downlink scenario.
Compared with other deep learning methods for channel estimation, our results suggest an improved mean squared error.
arXiv Detail & Related papers (2022-01-24T19:55:10Z)
- CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size.
In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network.
We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Operation-Aware Soft Channel Pruning using Differentiable Masks [51.04085547997066]
We propose a data-driven algorithm, which compresses deep neural networks in a differentiable way by exploiting the characteristics of operations.
We perform extensive experiments and achieve outstanding performance in terms of the accuracy of output networks.
arXiv Detail & Related papers (2020-07-08T07:44:00Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.