Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression
- URL: http://arxiv.org/abs/2107.04386v2
- Date: Mon, 12 Jul 2021 03:06:42 GMT
- Title: Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression
- Authors: Shaowu Chen, Jiahao Zhou, Weize Sun, Lei Huang
- Abstract summary: Deep convolutional neural networks (CNNs) with a large number of parameters require huge computational resources.
Decomposition-based methods, therefore, have been utilized to compress CNNs in recent years.
We propose to compress CNNs and alleviate performance degradation via joint matrix decomposition.
- Score: 5.083621265568845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks (CNNs) with a large number of parameters
require huge computational resources, which has limited the application of CNNs on
resource-constrained appliances. Decomposition-based methods, therefore, have been
utilized to compress CNNs in recent years. However, since the compression factor and
performance are negatively correlated, state-of-the-art works either suffer from severe
performance degradation or are limited to low compression factors. To overcome these
problems, unlike previous works that compress layers separately, we propose to compress
CNNs and alleviate performance degradation via joint matrix decomposition. The idea is
inspired by the fact that there are many repeated modules in CNNs; by projecting weights
with the same structures into the same subspace, networks can be further compressed and
even accelerated. In particular, three joint matrix decomposition schemes are developed,
and the corresponding optimization approaches based on Singular Value Decomposition are
proposed. Extensive experiments are conducted on three challenging compact CNNs and
three benchmark data sets to demonstrate the superior performance of our proposed
algorithms. As a result, our methods can compress ResNet-34 by 22x with less accuracy
degradation than several state-of-the-art methods.
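To make the shared-subspace idea concrete, below is a minimal NumPy sketch of joint low-rank factorization via SVD. It is not the paper's three decomposition schemes or their optimization procedures; the function name, the layer shapes, and the rank k are illustrative assumptions, and the coupling shown is the simplest possible one (two same-shaped weight matrices from repeated modules reusing a single left basis).

```python
import numpy as np

def joint_svd_compress(W1, W2, k):
    """Factor two same-shaped weight matrices over one shared left subspace.

    Storing the shared basis Uk (m x k) plus two coefficient matrices
    (k x n each) replaces the original 2*m*n parameters with k*(m + 2*n).
    """
    stacked = np.hstack([W1, W2])                 # (m, 2n): couple the repeated layers
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    Uk = U[:, :k]                                 # shared basis for both layers
    return Uk, Uk.T @ W1, Uk.T @ W2               # basis + per-layer coefficients

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((256, 64)), rng.standard_normal((256, 64))
Uk, C1, C2 = joint_svd_compress(W1, W2, k=32)
print(np.linalg.norm(W1 - Uk @ C1) / np.linalg.norm(W1))  # relative reconstruction error
```

Sharing the basis is what distinguishes this from per-layer truncated SVD: the two layers pay for one m x k factor instead of two, which is where the additional compression (and the need for joint optimization) comes from.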
Related papers
- Reduced storage direct tensor ring decomposition for convolutional neural networks compression [0.0]
We propose a novel low-rank CNN compression method based on reduced storage direct tensor ring decomposition (RSDTR).
The proposed method offers higher circular mode permutation flexibility and is characterized by large parameter and FLOPs compression rates.
Experiments, performed on the CIFAR-10 and ImageNet datasets, clearly demonstrate the efficiency of RSDTR in comparison to other state-of-the-art CNN compression approaches.
arXiv Detail & Related papers (2024-05-17T14:16:40Z)
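As a rough illustration of why tensor ring formats such as the one referenced above save storage, the sketch below rebuilds a small 3-way tensor from three ring cores and compares parameter counts. It does not implement RSDTR itself (neither its reduced-storage layout nor the circular mode permutation); the ranks and tensor sizes are illustrative assumptions.

```python
import numpy as np

# Tensor ring format: a 3-way tensor is represented by three small cores
# G1 (r, i, r), G2 (r, j, r), G3 (r, k, r); each entry of the full tensor is
# the trace of a product of matching core slices.
rng = np.random.default_rng(0)
i, j, k, r = 16, 16, 16, 3
G1, G2, G3 = (rng.standard_normal(s) for s in [(r, i, r), (r, j, r), (r, k, r)])

T = np.einsum('aib,bjc,cka->ijk', G1, G2, G3)  # reconstruct the full (16, 16, 16) tensor
full_params = i * j * k                        # 4096 values stored explicitly
core_params = r * r * (i + j + k)              # 432 values stored in the ring cores
print(T.shape, full_params, core_params)
```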
- Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning [4.7027290803102675]
We propose an efficient training method for CNN compression via dynamic parameter rank pruning.
Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
arXiv Detail & Related papers (2024-01-15T23:52:35Z)
- Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression.
For ResNet18 with ImageNet2012, our reduced model can reach more than two times speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
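The sketch below shows, in plain NumPy, the two lossy steps the entry above builds on: magnitude pruning followed by uniform symmetric quantization. It is not the paper's source-coding storage format; the sparsity level, bit width, and function name are illustrative assumptions.

```python
import numpy as np

def prune_and_quantize(W, sparsity=0.9, bits=8):
    """Zero out the smallest-magnitude weights, then quantize the survivors."""
    thresh = np.quantile(np.abs(W), sparsity)           # magnitude threshold
    W_pruned = np.where(np.abs(W) < thresh, 0.0, W)     # sparse weight matrix
    scale = np.abs(W_pruned).max() / (2 ** (bits - 1) - 1)
    codes = np.round(W_pruned / scale).astype(np.int8)  # integer codes
    return codes, scale                                 # codes + one float scale per tensor

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
codes, scale = prune_and_quantize(W)
W_hat = codes.astype(np.float32) * scale                # dequantized approximation
print((codes == 0).mean(), np.abs(W - W_hat).max())
```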
- Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks.
It automatically analyzes each layer to identify the optimal per-layer compression ratio.
Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods can hardly achieve real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which combines channel pruning and tensor decomposition to compress CNN models.
We achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters on ResNet-50, with only a 0.56% Top-1 accuracy drop on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
- Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
Key to the success of vector quantization is deciding which parameter groups should be compressed together.
In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function.
We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
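The observation above is easy to check numerically: permuting the output channels of one layer and applying the same permutation to the input channels of the next leaves the network function unchanged. The sketch below verifies this for a toy two-layer ReLU network with illustrative sizes; the paper's actual contribution, searching for permutations that make the weights easier to vector-quantize, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((128, 64))   # layer 1: 64 -> 128
W2 = rng.standard_normal((10, 128))   # layer 2: 128 -> 10
x = rng.standard_normal(64)

perm = rng.permutation(128)
W1_p = W1[perm, :]                    # permute layer-1 output channels
W2_p = W2[:, perm]                    # apply the same permutation to layer-2 inputs

y = W2 @ np.maximum(W1 @ x, 0)        # original two-layer ReLU network
y_p = W2_p @ np.maximum(W1_p @ x, 0)  # permuted network
print(np.allclose(y, y_p))            # True: same function, different weight layout
```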
- Compression strategies and space-conscious representations for deep neural networks [0.3670422696827526]
Recent advances in deep learning have made available powerful convolutional neural networks (CNNs) with state-of-the-art performance in several real-world applications.
However, CNNs have millions of parameters and thus are not deployable on resource-limited platforms.
In this paper, we investigate the impact of lossy compression of CNNs by weight pruning and quantization.
arXiv Detail & Related papers (2020-07-15T19:41:19Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
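For readers unfamiliar with the channel shuffle mentioned in the last entry, the sketch below shows the standard fixed (ShuffleNet-style) reshape-transpose-reshape shuffle that a learnable shuffle generalizes. The paper's learnable mechanism and its structured sparsification are not reproduced; the group count and tensor shape are illustrative assumptions.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave channels across groups for an (N, C, H, W) tensor."""
    n, c, h, w = x.shape
    assert c % groups == 0
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

x = np.arange(8).reshape(1, 8, 1, 1)               # 8 channels in 2 groups of 4
print(channel_shuffle(x, groups=2)[0, :, 0, 0])    # [0 4 1 5 2 6 3 7]: groups interleaved
```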