Convolutional Neural Network Compression via Dynamic Parameter Rank
Pruning
- URL: http://arxiv.org/abs/2401.08014v1
- Date: Mon, 15 Jan 2024 23:52:35 GMT
- Title: Convolutional Neural Network Compression via Dynamic Parameter Rank
Pruning
- Authors: Manish Sharma, Jamison Heard, Eli Saber, Panos P. Markopoulos
- Abstract summary: We propose an efficient training method for CNN compression via dynamic parameter rank pruning.
Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
- Score: 4.7027290803102675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Convolutional Neural Networks (CNNs) excel at learning complex
latent-space representations, their over-parameterization can lead to
overfitting and reduced performance, particularly with limited data. This,
alongside their high computational and memory demands, limits the applicability
of CNNs for edge deployment. Low-rank matrix approximation has emerged as a
promising approach to reduce CNN parameters, but its application presents
challenges including rank selection and performance loss. To address these
issues, we propose an efficient training method for CNN compression via dynamic
parameter rank pruning. Our approach integrates efficient matrix factorization
and novel regularization techniques, forming a robust framework for dynamic
rank reduction and model compression. We use Singular Value Decomposition (SVD)
to model low-rank convolutional filters and dense weight matrices and we
achieve model compression by training the SVD factors with back-propagation in
an end-to-end way. We evaluate our method on an array of modern CNNs, including
ResNet-18, ResNet-20, and ResNet-32, and datasets like CIFAR-10, CIFAR-100, and
ImageNet (2012), showcasing its applicability in computer vision. Our
experiments show that the proposed method can yield substantial storage savings
while maintaining or even enhancing classification performance.
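As a rough illustration of the idea in the abstract, the sketch below keeps a layer's weight in SVD-factored form, trains the factors with back-propagation, and prunes small singular values to reduce the rank during training. It is a minimal PyTorch sketch under assumed design choices (the layer class, initialization, L1 penalty on singular values, and pruning threshold are illustrative, not the authors' exact implementation).

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer whose weight is kept in SVD-factored form W ~ U @ diag(s) @ V^T.

    Hypothetical sketch: the factors U, s, V are trained with back-propagation,
    and singular values below a threshold are pruned to shrink the rank.
    """

    def __init__(self, in_features, out_features, init_rank=None):
        super().__init__()
        rank = init_rank or min(in_features, out_features)
        # Initialize the factors from the SVD of a standard dense initialization.
        w = torch.empty(out_features, in_features)
        nn.init.kaiming_uniform_(w)
        U, s, Vh = torch.linalg.svd(w, full_matrices=False)
        self.U = nn.Parameter(U[:, :rank])      # (out, r)
        self.s = nn.Parameter(s[:rank])         # (r,)
        self.V = nn.Parameter(Vh[:rank, :].T)   # (in, r)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Project onto V, scale by the singular values, map back through U.
        return (x @ self.V) * self.s @ self.U.T + self.bias

    @torch.no_grad()
    def prune_rank(self, tol=1e-3):
        """Drop singular values whose magnitude falls below `tol` (assumed heuristic)."""
        keep = self.s.abs() > tol
        self.U = nn.Parameter(self.U[:, keep])
        self.s = nn.Parameter(self.s[keep])
        self.V = nn.Parameter(self.V[:, keep])

# A sparsity-promoting penalty on `s` (here plain L1) pushes ranks down as training proceeds.
layer = LowRankLinear(512, 256)
x = torch.randn(8, 512)
loss = layer(x).pow(2).mean() + 1e-4 * layer.s.abs().sum()
loss.backward()
```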
Related papers
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge
Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, an autoencoder-based CNN architecture (AECNN), for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
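A hedged sketch of what such a channel-attention feature-compression module could look like: a squeeze-and-excitation-style gate scores the channels, and only the top-k most informative feature maps are transmitted. The gate architecture and the top-k selection rule are assumptions for illustration, not the AECNN authors' exact design.

```python
import torch
import torch.nn as nn

class ChannelAttentionCompressor(nn.Module):
    """Illustrative feature compression: score channels with a squeeze-and-
    excitation-style gate and keep only the top-k channels for offloading.
    The gating network and top-k rule are assumptions, not the AECNN design.
    """

    def __init__(self, channels, keep_channels):
        super().__init__()
        self.keep = keep_channels
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # squeeze: global average per channel
            nn.Flatten(),
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),
            nn.Sigmoid(),                       # per-channel importance in [0, 1]
        )

    def forward(self, feats):
        scores = self.gate(feats)                          # (N, C)
        topk = scores.topk(self.keep, dim=1).indices       # most informative channels
        idx = topk.unsqueeze(-1).unsqueeze(-1).expand(-1, -1, *feats.shape[2:])
        compressed = feats.gather(1, idx)                  # (N, keep, H, W) to transmit
        return compressed, topk                            # decoder reconstructs from these

# Example: compress a 64-channel intermediate tensor down to 16 channels.
x = torch.randn(2, 64, 32, 32)
comp, kept = ChannelAttentionCompressor(64, 16)(x)
print(comp.shape)  # torch.Size([2, 16, 32, 32])
```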
arXiv Detail & Related papers (2024-01-19T15:19:47Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to their magnitude on-the-fly.
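A minimal sketch of the soft-shrinkage step described above: instead of hard-zeroing pruned weights, the least important weights are shrunk by a small amount proportional to their magnitude, so they can recover later if gradients push them back up. The pruning ratio, threshold rule, and shrink factor below are illustrative assumptions, not the exact ISS-P schedule.

```python
import torch

@torch.no_grad()
def iterative_soft_shrink(weight, prune_ratio=0.1, shrink=0.05):
    """One soft-shrinkage step (illustrative, not the exact ISS-P schedule):
    the `prune_ratio` fraction of weights with the smallest magnitudes is
    shrunk by `shrink * |w|` instead of being hard-zeroed."""
    flat = weight.abs().flatten()
    k = max(1, int(prune_ratio * flat.numel()))
    threshold = flat.kthvalue(k).values                   # magnitude cut-off
    unimportant = weight.abs() <= threshold
    weight[unimportant] -= shrink * weight[unimportant]   # proportional shrink
    return weight

# Typical use: call once per training iteration, after the optimizer step.
w = torch.randn(256, 256)
w = iterative_soft_shrink(w, prune_ratio=0.2, shrink=0.05)
```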
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Compact representations of convolutional neural networks via weight
pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
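For context, a minimal sketch of the prune-then-quantize pipeline such a storage format builds on. The sparsity level, bit-width, and affine quantizer below are assumptions, and the paper's actual source-coding-based storage format is not reproduced here.

```python
import torch

def prune_and_quantize(weight, sparsity=0.9, bits=8):
    """Illustrative magnitude pruning followed by uniform quantization of one
    weight tensor; the paper's contribution is how the result is stored
    (source coding), which this sketch does not cover."""
    # 1) Magnitude pruning: zero the smallest `sparsity` fraction of weights.
    k = int(sparsity * weight.numel())
    if k > 0:
        threshold = weight.abs().flatten().kthvalue(k).values
        weight = torch.where(weight.abs() > threshold, weight, torch.zeros_like(weight))
    # 2) Uniform affine quantization of the surviving weights to `bits` bits.
    qmax = 2 ** (bits - 1) - 1
    scale = weight.abs().max() / qmax
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale  # store q (plus the sparsity pattern) and the scale factor

w = torch.randn(512, 512)
q, scale = prune_and_quantize(w)
w_hat = q.float() * scale   # dequantized weights used at inference
```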
arXiv Detail & Related papers (2021-08-28T20:39:54Z) - Learning from Images: Proactive Caching with Parallel Convolutional
Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce computation time by 71.6% with only 0.8% additional performance cost.
arXiv Detail & Related papers (2021-08-15T21:32:47Z) - Joint Matrix Decomposition for Deep Convolutional Neural Networks
Compression [5.083621265568845]
Deep convolutional neural networks (CNNs) with a large number of parameters require huge computational resources.
Decomposition-based methods, therefore, have been utilized to compress CNNs in recent years.
We propose to compress CNNs and alleviate performance degradation via joint matrix decomposition.
arXiv Detail & Related papers (2021-07-09T12:32:10Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods can hardly deliver real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Tensor Reordering for CNN Compression [7.228285747845778]
We show how parameter redundancy in Convolutional Neural Network (CNN) filters can be effectively reduced by pruning in the spectral domain.
Our approach is applied to pretrained CNNs and we show that minor additional fine-tuning allows our method to recover the original model performance.
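A hedged sketch of spectral-domain filter pruning in this spirit: transform each filter to the frequency domain, zero the smallest coefficients, and transform back before fine-tuning. The 2-D FFT and per-tensor threshold below are assumptions, not the paper's exact reordering and pruning scheme.

```python
import torch

@torch.no_grad()
def spectral_prune(weight, keep_ratio=0.25):
    """Illustrative spectral-domain pruning of a conv kernel (C_out, C_in, K, K):
    FFT each filter, zero the smallest-magnitude coefficients, invert the FFT.
    Transform choice and threshold rule are assumptions, not the paper's scheme."""
    spec = torch.fft.fftn(weight, dim=(-2, -1))          # per-filter 2-D FFT
    mags = spec.abs().flatten()
    k = int((1.0 - keep_ratio) * mags.numel())
    if k > 0:
        threshold = mags.kthvalue(k).values
        spec[spec.abs() <= threshold] = 0                # drop weak spectral coefficients
    return torch.fft.ifftn(spec, dim=(-2, -1)).real      # back to the spatial domain

# Keep only the strongest 25% of spectral coefficients, then fine-tune as usual.
w = torch.randn(64, 64, 3, 3)
w_pruned = spectral_prune(w, keep_ratio=0.25)
```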
arXiv Detail & Related papers (2020-10-22T23:45:34Z) - Compression strategies and space-conscious representations for deep
neural networks [0.3670422696827526]
Recent advances in deep learning have made available powerful convolutional neural networks (CNNs) with state-of-the-art performance in several real-world applications.
However, CNNs have millions of parameters and are therefore not deployable on resource-limited platforms.
In this paper, we investigate the impact of lossy compression of CNNs by weight pruning and quantization.
arXiv Detail & Related papers (2020-07-15T19:41:19Z) - CNN Acceleration by Low-rank Approximation with Quantized Factors [9.654865591431593]
Although modern convolutional neural networks achieve great results on complex computer vision tasks, they still cannot be used effectively on mobile and embedded devices.
To solve this problem, a novel approach combining two known methods, low-rank tensor approximation in Tucker format and quantization of weights and feature maps (activations), is proposed.
The efficiency of our method is demonstrated for ResNet18 and ResNet34 on CIFAR-10, CIFAR-100, and ImageNet classification tasks.
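A hedged sketch of the low-rank half of that combination, Tucker-2 factorization of a pretrained convolution into a 1x1 projection, a small KxK core, and a 1x1 expansion. The rank choices and HOSVD-style initialization are assumptions, and the quantization of the factors described in the paper is not shown.

```python
import torch
import torch.nn as nn

def tucker2_conv(conv, rank_in, rank_out):
    """Replace a Conv2d with the standard Tucker-2 factorization:
    1x1 conv (input projection) -> small KxK core conv -> 1x1 conv (output
    expansion), initialized by truncated SVDs of the kernel's channel-mode
    unfoldings (HOSVD). Ranks are assumptions; factor quantization is omitted."""
    W = conv.weight.data                      # (C_out, C_in, K, K)
    c_out, c_in, k, _ = W.shape

    # Truncated SVD of the output-channel (mode-0) unfolding.
    U_out, _, _ = torch.linalg.svd(W.reshape(c_out, -1), full_matrices=False)
    U_out = U_out[:, :rank_out]               # (C_out, r_out)

    # Truncated SVD of the input-channel (mode-1) unfolding.
    U_in, _, _ = torch.linalg.svd(W.permute(1, 0, 2, 3).reshape(c_in, -1),
                                  full_matrices=False)
    U_in = U_in[:, :rank_in]                  # (C_in, r_in)

    # Core tensor: project the kernel onto both channel subspaces.
    core = torch.einsum('oikl,or,is->rskl', W, U_out, U_in)  # (r_out, r_in, K, K)

    first = nn.Conv2d(c_in, rank_in, 1, bias=False)
    first.weight.data = U_in.T.reshape(rank_in, c_in, 1, 1)
    middle = nn.Conv2d(rank_in, rank_out, k, stride=conv.stride,
                       padding=conv.padding, bias=False)
    middle.weight.data = core
    last = nn.Conv2d(rank_out, c_out, 1, bias=(conv.bias is not None))
    last.weight.data = U_out.reshape(c_out, rank_out, 1, 1)
    if conv.bias is not None:
        last.bias.data = conv.bias.data
    return nn.Sequential(first, middle, last)

# Example: factor a 256->256 3x3 convolution down to channel ranks (64, 64).
compressed = tucker2_conv(nn.Conv2d(256, 256, 3, padding=1), 64, 64)
```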
arXiv Detail & Related papers (2020-06-16T02:28:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.