Model Pruning Based on Quantified Similarity of Feature Maps
- URL: http://arxiv.org/abs/2105.06052v1
- Date: Thu, 13 May 2021 02:57:30 GMT
- Title: Model Pruning Based on Quantified Similarity of Feature Maps
- Authors: Zidu Wang, Xuexin Liu, Long Huang, Yunqing Chen, Yufei Zhang, Zhikang Lin, Rui Wang
- Abstract summary: We propose a novel theory to find redundant information in three-dimensional tensors.
We use this theory to prune convolutional neural networks to enhance inference speed.
- Score: 5.271060872578571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A high-accuracy CNN usually comes with a huge number of parameters, which
are typically stored in high-dimensional tensors. However, few methods can
identify the redundant information in the parameters stored in these
high-dimensional tensors, which leads to a lack of theoretical guidance for
the compression of CNNs. In this paper, we propose a novel theory to find
redundant information in three-dimensional tensors, namely Quantified
Similarity of Feature Maps (QSFM), and use this theory to prune convolutional
neural networks to enhance inference speed. Our method is a form of filter
pruning and can be implemented without any special libraries. We
perform our method not only on common convolution layers but also on special
convolution layers, such as depthwise separable convolution layers. The
experiments show that QSFM can effectively find the redundant information in a
neural network. Without any fine-tuning, QSFM compresses ResNet-56 on CIFAR-10
significantly (48.27% FLOPs and 57.90% parameter reduction) with only a 0.54%
loss in top-1 accuracy. With fine-tuning, QSFM also prunes ResNet-56, VGG-16
and MobileNetV2, again with excellent results.
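For a concrete picture of the idea, the sketch below scores the filters of one convolution layer by how similar their feature maps are to one another and marks the most redundant ones for removal. It is a minimal PyTorch illustration under simplifying assumptions (Euclidean distance as the similarity measure, a single calibration batch, greedy selection); the paper's QSFM metric and pruning procedure may differ in detail.

```python
import torch

def qsfm_style_redundant_filters(feature_maps: torch.Tensor, prune_ratio: float):
    """Pick filters whose feature maps are most similar to another filter's.

    feature_maps: activations of one conv layer on a calibration batch, (N, C, H, W).
    prune_ratio:  fraction of the C filters to mark as redundant.

    Returns channel indices to prune. This is only a sketch of the
    'quantified similarity of feature maps' idea; the paper's exact
    similarity measure may differ.
    """
    n, c, h, w = feature_maps.shape
    # Average over the batch and flatten each channel's feature map.
    maps = feature_maps.mean(dim=0).reshape(c, h * w)      # (C, H*W)
    # Pairwise Euclidean distance between channels: small distance = high similarity.
    dist = torch.cdist(maps, maps)                         # (C, C)
    dist.fill_diagonal_(float("inf"))
    # A channel whose nearest neighbour is very close carries redundant information.
    nearest = dist.min(dim=1).values                       # (C,)
    num_prune = int(prune_ratio * c)
    return nearest.argsort()[:num_prune].tolist()

# Hypothetical usage: collect a layer's activations with a forward hook,
# then drop the returned channels from that layer and the following one.
```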
Related papers
- An Effective Information Theoretic Framework for Channel Pruning [4.014774237233169]
We present a novel channel pruning approach via information theory and interpretability of neural networks.
Our method improves the accuracy by 0.21% when reducing 45.5% FLOPs and removing 40.3% parameters for ResNet-56 on CIFAR-10.
arXiv Detail & Related papers (2024-08-14T17:19:56Z)
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions reachable by our training procedure, including the gradient-based optimizer and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Towards Generalized Entropic Sparsification for Convolutional Neural Networks [0.0]
Convolutional neural networks (CNNs) are reported to be overparametrized.
Here, we introduce a layer-by-layer data-driven pruning method based on a mathematical formulation that aims at a computationally scalable entropic relaxation of the pruning problem.
The sparse subnetwork is found from the pre-trained (full) CNN using the network entropy minimization as a sparsity constraint.
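The abstract only gives the high-level recipe, so the snippet below is a loose illustration: it scores each channel by the entropy of its activation histogram on a calibration set, so that low-entropy channels become sparsification candidates. The paper's entropic relaxation is a different, more principled formulation; treat this purely as a rough proxy.

```python
import torch

def channel_activation_entropy(activations: torch.Tensor, bins: int = 64) -> torch.Tensor:
    """Rough per-channel entropy estimate from an activation histogram.

    activations: (N, C, H, W) tensor from one layer on a calibration set.
    Returns a (C,) tensor; channels with low entropy carry little information
    and are candidates for removal. Only an illustrative proxy, not the
    entropic relaxation used in the paper.
    """
    n, c, h, w = activations.shape
    flat = activations.permute(1, 0, 2, 3).reshape(c, -1)
    entropies = []
    for ch in flat:
        hist = torch.histc(ch, bins=bins)   # histogram over the channel's values
        p = hist / hist.sum()
        p = p[p > 0]                        # drop empty bins before taking logs
        entropies.append(-(p * p.log()).sum())
    return torch.stack(entropies)
```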
arXiv Detail & Related papers (2024-04-06T21:33:39Z)
- Filter Pruning For CNN With Enhanced Linear Representation Redundancy [3.853146967741941]
We present a data-driven loss function term calculated from the correlation matrix of different feature maps in the same layer, named CCM-loss.
CCM-loss provides us with another universal transcendental mathematical tool besides L*-norm regularization.
In our new strategy, we mainly focus on the consistency and integrality of the information flow in the network.
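A correlation-matrix loss term of this kind can be computed as sketched below; how the off-diagonal correlations are signed and weighted in the actual CCM-loss is the paper's design choice, so this only illustrates the mechanics.

```python
import torch

def feature_map_correlation_loss(fmap: torch.Tensor) -> torch.Tensor:
    """Loss term built from the correlation matrix of a layer's feature maps.

    fmap: (N, C, H, W) activations of one layer.
    Returns the mean absolute off-diagonal correlation. The sign and weighting
    of this term in the actual CCM-loss are the paper's design choices; this
    sketch only shows how the correlation matrix is obtained.
    """
    n, c, h, w = fmap.shape
    x = fmap.permute(1, 0, 2, 3).reshape(c, -1)      # one row per channel
    x = x - x.mean(dim=1, keepdim=True)
    x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
    corr = x @ x.t()                                 # (C, C) correlation matrix
    off_diag = corr - torch.diag(torch.diagonal(corr))
    return off_diag.abs().mean()
```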
arXiv Detail & Related papers (2023-10-10T06:27:30Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
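The LSH idea can be illustrated with random-hyperplane hashing: channels whose flattened feature maps fall into the same hash bucket are near-duplicates and can be merged or dropped. The sketch below is a generic LSH bucketing routine, not the HASTE module itself.

```python
import torch

def lsh_bucket_channels(fmap: torch.Tensor, num_hyperplanes: int = 8, seed: int = 0):
    """Group similar channels of a feature map via random-hyperplane LSH.

    fmap: (C, H, W) feature map of a single sample.
    Returns a dict mapping hash codes to lists of channel indices; channels
    sharing a bucket are near-duplicates that could be merged or averaged.
    Only a sketch of the LSH idea, not the HASTE module.
    """
    c, h, w = fmap.shape
    g = torch.Generator().manual_seed(seed)
    planes = torch.randn(h * w, num_hyperplanes, generator=g)
    # Sign of the projection onto each random hyperplane forms the hash code.
    signs = (fmap.reshape(c, -1) @ planes) > 0        # (C, num_hyperplanes) bools
    buckets = {}
    for idx, code in enumerate(signs):
        buckets.setdefault(tuple(code.tolist()), []).append(idx)
    return buckets
```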
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce a split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
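The channel-group splitting step can be pictured with the small module below, which splits the input along the channel dimension and runs a depthwise convolution per group. The real SDTA encoder additionally mixes groups hierarchically and applies transposed (channel-wise) attention, which this sketch omits.

```python
import torch
import torch.nn as nn

class ChannelGroupSplit(nn.Module):
    """Sketch of the 'split into channel groups' step of an SDTA-style encoder.

    Splits the input into `groups` channel groups and applies a depthwise 3x3
    convolution to each. The actual SDTA encoder also mixes the groups
    hierarchically and applies transposed attention, omitted here.
    """

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0
        self.group_size = channels // groups
        self.dwconvs = nn.ModuleList(
            nn.Conv2d(self.group_size, self.group_size, 3,
                      padding=1, groups=self.group_size)
            for _ in range(groups)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = torch.split(x, self.group_size, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.dwconvs, chunks)], dim=1)
```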
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression [26.501979992447605]
This paper investigates compression from the perspective of compactly representing and storing trained parameters.
We leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters.
We conduct experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks.
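A greedy, residual form of additive quantization is sketched below: each weight sub-vector is approximated by a sum of one codeword per codebook, fitted on successive residuals with k-means. The paper's encoder and its architecture-agnostic codebook sharing are more sophisticated; this only shows the basic mechanism.

```python
import numpy as np
from sklearn.cluster import KMeans

def residual_quantize(weights: np.ndarray, dim: int = 8, num_codebooks: int = 4,
                      codebook_size: int = 256, seed: int = 0):
    """Greedy additive (residual) quantization of a flat weight array.

    Assumes len(weights) is divisible by `dim` and that there are at least
    `codebook_size` sub-vectors. Each sub-vector is approximated by a sum of
    one codeword per codebook. Returns the codebooks and per-codebook indices.
    A simplified sketch of additive quantization, not the paper's encoder.
    """
    vecs = weights.reshape(-1, dim).astype(np.float32)
    residual = vecs.copy()
    codebooks, codes = [], []
    for _ in range(num_codebooks):
        km = KMeans(n_clusters=codebook_size, n_init=4, random_state=seed).fit(residual)
        codebooks.append(km.cluster_centers_)
        codes.append(km.labels_)
        residual = residual - km.cluster_centers_[km.labels_]   # quantize the leftover
    return codebooks, codes

def dequantize(codebooks, codes) -> np.ndarray:
    """Reconstruct the flat weight array as the sum of selected codewords."""
    approx = sum(cb[idx] for cb, idx in zip(codebooks, codes))
    return approx.reshape(-1)
```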
arXiv Detail & Related papers (2021-11-19T17:03:11Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
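The overall pipeline (prune small weights, quantize the survivors, then apply a generic source coder) can be illustrated with the naive baseline below; the purpose-built storage format in the paper compresses far better than this zlib sketch.

```python
import zlib
import numpy as np

def prune_quantize_compress(weights: np.ndarray, prune_frac: float = 0.8,
                            num_levels: int = 256) -> bytes:
    """Naive prune + quantize + source-code pipeline for a flat weight array.

    Magnitude-prunes the smallest `prune_frac` of weights, uniformly quantizes
    the rest to `num_levels` levels, and compresses the byte stream with zlib.
    Only an illustration of the general idea, not the paper's storage format.
    """
    w = weights.astype(np.float32)
    threshold = np.quantile(np.abs(w), prune_frac)
    w[np.abs(w) < threshold] = 0.0                           # weight pruning
    max_abs = np.abs(w).max()
    scale = max_abs / (num_levels // 2 - 1) if max_abs > 0 else 1.0
    q = np.round(w / scale).astype(np.int8)                  # uniform quantization
    return zlib.compress(q.tobytes(), 9)                     # generic source coding

# Usage sketch: compare len(prune_quantize_compress(w)) with w.nbytes.
```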
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
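A Fisher-style channel importance score can be accumulated with a forward hook as sketched below: per sample, a channel's activation-gradient product is summed over spatial positions and squared. The paper's unified metric also covers coupled channels (e.g. channels tied together by residual additions), which this sketch ignores.

```python
import torch
import torch.nn as nn

def fisher_channel_importance(model: nn.Module, layer: nn.Conv2d,
                              data_loader, loss_fn) -> torch.Tensor:
    """Accumulate a Fisher-style importance score per output channel of `layer`.

    Importance of channel c ~ sum over samples of (sum_spatial a_c * g_c)^2,
    where a_c is the channel's activation and g_c the gradient of the loss
    w.r.t. it. Coupled channels are not handled, unlike the paper's metric.
    """
    scores = torch.zeros(layer.out_channels)
    acts = {}

    def save_act(module, inp, out):
        out.retain_grad()          # keep this activation's gradient after backward
        acts["a"] = out

    handle = layer.register_forward_hook(save_act)
    for x, y in data_loader:
        model.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        a, g = acts["a"], acts["a"].grad
        scores += ((a * g).sum(dim=(2, 3)) ** 2).sum(dim=0).detach().cpu()
    handle.remove()
    return scores   # low score -> channel is a pruning candidate
```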
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- Layer Pruning via Fusible Residual Convolutional Block for Deep Neural Networks [15.64167076052513]
Compared with filter pruning, layer pruning yields lower inference time and
runtime memory usage when the same FLOPs and number of parameters are pruned.
We propose a simple layer pruning method using a residual convolutional block (ResConv).
Our pruning method achieves excellent compression and acceleration performance
over state-of-the-art methods on different datasets.
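The "fusible" part of a residual convolutional block can be illustrated with a generic re-parameterisation trick: when a block computes conv(x) + x with a 3x3, stride-1, padding-1 convolution, the identity branch can be folded into the kernel's centre taps so it vanishes at inference time. The sketch below shows that fold; the paper's ResConv block is constructed differently in detail.

```python
import torch
import torch.nn as nn

def fuse_identity_into_conv(conv: nn.Conv2d) -> nn.Conv2d:
    """Fold an identity shortcut (y = conv(x) + x) into the conv's own kernel.

    Valid when in_channels == out_channels, groups == 1, 3x3 kernel, stride 1,
    padding 1: adding 1 to the centre tap of each channel's own kernel makes
    conv'(x) == conv(x) + x, so the residual branch disappears at inference.
    A generic re-parameterisation sketch, not the paper's exact ResConv block.
    """
    assert conv.in_channels == conv.out_channels and conv.groups == 1
    assert conv.kernel_size == (3, 3) and conv.stride == (1, 1) and conv.padding == (1, 1)
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, 3, padding=1,
                      bias=conv.bias is not None)
    with torch.no_grad():
        weight = conv.weight.clone()
        for c in range(conv.out_channels):
            weight[c, c, 1, 1] += 1.0      # identity contribution at the centre tap
        fused.weight.copy_(weight)
        if conv.bias is not None:
            fused.bias.copy_(conv.bias)
    return fused
```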
arXiv Detail & Related papers (2020-11-29T12:51:16Z)
- SCOP: Scientific Control for Reliable Neural Network Pruning [127.20073865874636]
This paper proposes a reliable neural network pruning algorithm by setting up a scientific control.
Redundant filters can be discovered in the adversarial process of different features.
Our method can reduce the parameters of ResNet-101 by 57.8% and its FLOPs by 60.2% with only a 0.01% top-1 accuracy loss on ImageNet.
arXiv Detail & Related papers (2020-10-21T03:02:01Z)