Low-rank Tensor Decomposition for Compression of Convolutional Neural
Networks Using Funnel Regularization
- URL: http://arxiv.org/abs/2112.03690v1
- Date: Tue, 7 Dec 2021 13:41:51 GMT
- Title: Low-rank Tensor Decomposition for Compression of Convolutional Neural
Networks Using Funnel Regularization
- Authors: Bo-Shiuan Chu, Che-Rung Lee
- Abstract summary: We propose a model reduction method to compress pre-trained networks using low-rank tensor decomposition.
A new regularization method, called the funnel function, is proposed to suppress the unimportant factors during compression.
For ResNet18 with ImageNet2012, our reduced model achieves more than a two-times speed-up in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
- Score: 1.8579693774597708
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tensor decomposition is one of the fundamental techniques for model
compression of deep convolutional neural networks, owing to its ability to
reveal the latent relations among complex structures. However, most existing
methods compress the networks layer by layer, which cannot provide a
satisfactory solution for achieving global optimization. In this paper, we
propose a model reduction method that compresses pre-trained networks using
low-rank tensor decomposition of the convolution layers. Our method is based
on optimization techniques that select the proper ranks of the decomposed
network layers. A new regularization method, called the funnel function, is
proposed to suppress the unimportant factors during compression, so that the
proper ranks can be revealed more easily. The experimental results show that
our algorithm can reduce more model parameters than other tensor compression
methods. For ResNet18 with ImageNet2012, our reduced model achieves more than
a two-times speed-up in terms of GMACs with merely a 0.7% Top-1 accuracy drop,
which outperforms most existing methods in both metrics.
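As a rough illustration of the per-layer building block behind this line of work, the sketch below factorizes a single convolution layer into a low-rank pair of convolutions and picks the rank with a simple energy threshold. This is a minimal sketch under stated assumptions, not the paper's method: it uses a plain SVD on the unfolded kernel rather than the paper's tensor decomposition, and the energy threshold is only an illustrative stand-in for the funnel-regularized, globally optimized rank selection. The function name and the 0.95 threshold are assumptions.

```python
# Minimal, illustrative sketch (not the paper's method): factor a Conv2d into
# a k x k conv with r filters followed by a 1x1 conv, choosing r by an energy
# threshold on the singular values. Assumes groups == 1 and dilation == 1.
import torch
import torch.nn as nn

def factorize_conv(conv: nn.Conv2d, energy: float = 0.95) -> nn.Sequential:
    out_c, in_c, kh, kw = conv.weight.shape
    # Unfold the 4D kernel into a matrix of shape (out_c, in_c * kh * kw).
    W = conv.weight.data.reshape(out_c, -1)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # Smallest rank whose leading singular values retain `energy` of the total.
    cum = torch.cumsum(S ** 2, dim=0) / torch.sum(S ** 2)
    r = int((cum < energy).sum().item()) + 1
    # First conv: r spatial filters taken from the right singular vectors.
    first = nn.Conv2d(in_c, r, (kh, kw), stride=conv.stride,
                      padding=conv.padding, bias=False)
    first.weight.data = Vh[:r].reshape(r, in_c, kh, kw).clone()
    # Second conv: 1x1 projection back to out_c channels; absorbs U * S and bias.
    second = nn.Conv2d(r, out_c, kernel_size=1, bias=conv.bias is not None)
    second.weight.data = (U[:, :r] * S[:r]).reshape(out_c, r, 1, 1).clone()
    if conv.bias is not None:
        second.bias.data = conv.bias.data.clone()
    return nn.Sequential(first, second)
```

A hypothetical call such as factorize_conv(model.layer1[0].conv1) would replace one layer; the point of the paper is to choose all ranks jointly rather than thresholding each layer in isolation.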
Related papers
- Convolutional Neural Network Compression Based on Low-Rank Decomposition [3.3295360710329738]
This paper proposes a model compression method that integrates Variational Bayesian Matrix Factorization (VBMF).
VBMF is employed to estimate the rank of the weight tensor at each layer.
Experimental results show that, for both high and low compression ratios, the compressed model achieves strong performance.
arXiv Detail & Related papers (2024-08-29T06:40:34Z)
- Compression of Recurrent Neural Networks using Matrix Factorization [0.9208007322096533]
We propose a post-training rank-selection method called Rank-Tuning that selects a different rank for each matrix.
Our numerical experiments on signal processing tasks show that we can compress recurrent neural networks up to 14x with at most 1.4% relative performance reduction.
arXiv Detail & Related papers (2023-10-19T12:35:30Z)
- Approximating Continuous Convolutions for Deep Network Compression [11.566258236184964]
We present ApproxConv, a novel method for compressing the layers of a convolutional neural network.
We show that our method is able to compress existing deep network models by half whilst losing only 1.86% accuracy.
arXiv Detail & Related papers (2022-10-17T11:41:26Z)
- Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression [2.1485350418225244]
End-to-end deep trainable models are about to exceed the performance of the traditional handcrafted compression techniques on videos and images.
We propose a simple yet efficient instance-based parameterization method to reduce this amortization gap at a minor cost.
arXiv Detail & Related papers (2022-09-02T11:43:45Z)
- Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks.
It automatically analyzes each layer to identify the optimal per-layer compression ratio.
Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which combines channel pruning and tensor decomposition to compress CNN models (a minimal channel-pruning sketch appears after this list).
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems [77.88178159830905]
Sparsity-Inducing Distribution-based Compression (SIDCo) is a threshold-based sparsification scheme that enjoys similar threshold estimation quality to deep gradient compression (DGC).
Our evaluation shows SIDCo speeds up training by up to 41.7%, 7.6%, and 1.9% compared to the no-compression baseline, Top-k, and DGC compressors, respectively.
arXiv Detail & Related papers (2021-01-26T13:06:00Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
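As referenced in the Collaborative Compression entry above, the following is a minimal sketch of the channel-pruning half of that idea: dropping the output channels of a convolution whose filters have the smallest L1 norm. It is a hedged illustration only; the joint scheme with tensor decomposition and its compression-rate allocation are not reproduced, and the function name and keep_ratio value are assumptions.

```python
# Minimal, illustrative sketch of L1-norm channel pruning (not the full
# Collaborative Compression scheme). Assumes groups == 1.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5):
    """Keep the output channels whose filters have the largest L1 norm."""
    keep = max(1, int(round(keep_ratio * conv.out_channels)))
    # Score each output filter by the L1 norm of its weights.
    scores = conv.weight.data.abs().sum(dim=(1, 2, 3))
    kept = torch.topk(scores, keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[kept].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[kept].clone()
    # The kept indices are returned so the next layer's input channels
    # can be sliced to match.
    return pruned, kept
```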