Convolutional Neural Network Compression through Generalized Kronecker
Product Decomposition
- URL: http://arxiv.org/abs/2109.14710v1
- Date: Wed, 29 Sep 2021 20:45:08 GMT
- Title: Convolutional Neural Network Compression through Generalized Kronecker
Product Decomposition
- Authors: Marawan Gamal Abdel Hameed, Marzieh S. Tahaei, Ali Mosleh, Vahid
Partovi Nia
- Abstract summary: We compress layers by generalizing the Kronecker Product Decomposition to apply to multidimensional tensors, leading to the Generalized Kronecker Product Decomposition (GKPD).
Our approach yields a plug-and-play module that can be used as a drop-in replacement for any convolutional layer.
- Score: 2.4240083226965115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern Convolutional Neural Network (CNN) architectures, despite their
superiority in solving various problems, are generally too large to be deployed
on resource-constrained edge devices. In this paper, we reduce memory usage and
floating-point operations required by convolutional layers in CNNs. We compress
these layers by generalizing the Kronecker Product Decomposition to apply to
multidimensional tensors, leading to the Generalized Kronecker Product
Decomposition (GKPD). Our approach yields a plug-and-play module that can be
used as a drop-in replacement for any convolutional layer. Experimental results
for image classification on CIFAR-10 and ImageNet datasets using ResNet,
MobileNetv2 and SENet architectures substantiate the effectiveness of our
proposed approach. We find that GKPD outperforms state-of-the-art decomposition
methods including Tensor-Train and Tensor-Ring as well as other relevant
compression methods such as pruning and knowledge distillation.
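To make the construction concrete, below is a minimal PyTorch sketch of a Kronecker-factored convolution in the spirit of GKPD. The class name, shapes, rank, and initialization are illustrative assumptions rather than the authors' implementation, and the sketch materializes the full kernel for clarity instead of also exploiting the factorization to reduce FLOPs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KroneckerConv2d(nn.Module):
    """Convolution whose kernel is a sum of Kronecker products of two
    smaller 4D tensors (a hypothetical sketch, not the paper's code)."""

    def __init__(self, a_shape, b_shape, rank=1, padding=0):
        super().__init__()
        # a_shape = (o1, i1, h1, w1), b_shape = (o2, i2, h2, w2);
        # the reconstructed kernel has shape (o1*o2, i1*i2, h1*h2, w1*w2).
        self.a = nn.Parameter(0.1 * torch.randn(rank, *a_shape))
        self.b = nn.Parameter(0.1 * torch.randn(rank, *b_shape))
        self.padding = padding

    def forward(self, x):
        _, o1, i1, h1, w1 = self.a.shape
        _, o2, i2, h2, w2 = self.b.shape
        # W = sum_r A_r (kron) B_r, written as an outer product + reshape.
        w = torch.einsum('rachw,rbdkv->abcdhkwv', self.a, self.b)
        w = w.reshape(o1 * o2, i1 * i2, h1 * h2, w1 * w2)
        return F.conv2d(x, w, padding=self.padding)

# Drop-in replacement for a 64 -> 128 channel 3x3 convolution:
# kernel (8*16, 8*8, 3*1, 1*3) = (128, 64, 3, 3), with
# rank * (8*8*3 + 16*8*3) = 576 * rank parameters instead of 73,728.
layer = KroneckerConv2d((8, 8, 3, 1), (16, 8, 1, 3), rank=2, padding=1)
out = layer(torch.randn(1, 64, 32, 32))  # -> (1, 128, 32, 32)
```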
Related papers
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
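The neuron model is not detailed in this summary; as a loose, assumed illustration of the "multi-threshold" idea in the title, a membrane potential can emit a graded spike count by comparison against several thresholds, preserving more amplitude information than a single binary spike:

```python
import torch

def multi_threshold_spike(v: torch.Tensor, thresholds=(0.5, 1.0, 1.5)):
    # Graded output: count how many thresholds the membrane potential crosses.
    # (Illustrative sketch only; not the paper's exact neuron model.)
    return sum((v >= t).float() for t in thresholds)
```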
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- Mixed-TD: Efficient Neural Network Accelerator with Layer-Specific Tensor Decomposition [7.221206118679026]
We propose a framework for mapping CNNs onto FPGAs based on a novel tensor decomposition method called Mixed-TD.
The proposed method applies layer-specific Singular Value Decomposition (SVD) and Canonical Polyadic Decomposition (CPD) in a mixed manner, achieving 1.73x to 10.29x higher throughput per DSP than state-of-the-art CNN accelerators.
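As a sketch of the SVD ingredient (the helper below is illustrative; Mixed-TD additionally mixes in CPD and selects the decomposition per layer), a single convolution can be replaced by a low-rank pair:

```python
import torch
import torch.nn as nn

def svd_compress_conv(conv: nn.Conv2d, rank: int) -> nn.Sequential:
    """Approximate a Conv2d (groups=1, default dilation assumed) by a
    (rank, I, kh, kw) conv followed by a 1x1 (O, rank) conv."""
    O, I, kh, kw = conv.weight.shape
    w = conv.weight.detach().reshape(O, I * kh * kw)
    U, S, Vh = torch.linalg.svd(w, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank]   # truncate to top-`rank`

    first = nn.Conv2d(I, rank, (kh, kw), stride=conv.stride,
                      padding=conv.padding, bias=False)
    second = nn.Conv2d(rank, O, 1, bias=conv.bias is not None)
    first.weight.data = Vh.reshape(rank, I, kh, kw)
    second.weight.data = (U * S).reshape(O, rank, 1, 1)
    if conv.bias is not None:
        second.bias.data = conv.bias.detach()
    return nn.Sequential(first, second)
```

The composition computes (U S) (Vh x) per spatial location, which equals the truncated-SVD approximation of the original filter bank applied to the input patch.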
arXiv Detail & Related papers (2023-06-08T08:16:38Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise, and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
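For reference, the depthwise-plus-pointwise pattern that SSC is said to generalize is the standard block below (a plain depthwise-separable convolution, not SSC itself):

```python
import torch.nn as nn

# Parameter count drops from O*I*k*k (dense conv) to I*k*k + O*I.
def depthwise_separable(in_ch: int, out_ch: int, k: int = 3) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1),                               # pointwise
    )
```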
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Nonlinear Tensor Ring Network [39.89070144585793]
State-of-the-art deep neural networks (DNNs) have been widely applied to various real-world applications and achieve strong performance on cognitive problems.
By converting redundant models into compact ones, compression techniques appear to be a practical solution for reducing storage and memory consumption.
In this paper, we develop a nonlinear tensor ring network (NTRN) in which both fully-connected and convolutional layers are compressed.
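The underlying tensor-ring factorization can be sketched as below (shapes assumed for illustration); NTRN's addition is to insert nonlinear activations between such core contractions, which this plain linear reconstruction omits.

```python
import torch

def tensor_ring_reconstruct(cores):
    """Rebuild a full tensor from tensor-ring cores, where cores[k] has
    shape (r_k, n_k, r_{k+1}) and the bond dims wrap around: r_d == r_0."""
    t = cores[0]                                     # (r0, n0, r1)
    for g in cores[1:]:
        t = torch.tensordot(t, g, dims=([-1], [0]))  # absorb the next core
    # Close the ring by tracing the matching first and last bond dims.
    return torch.diagonal(t, dim1=0, dim2=-1).sum(-1)

cores = [torch.randn(2, 4, 2), torch.randn(2, 5, 2), torch.randn(2, 6, 2)]
full = tensor_ring_reconstruct(cores)  # (4, 5, 6): 60 params vs. 120 dense
```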
arXiv Detail & Related papers (2021-11-12T02:02:55Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Tensor Reordering for CNN Compression [7.228285747845778]
We show how parameter redundancy in Convolutional Neural Network (CNN) filters can be effectively reduced by pruning in the spectral domain.
Our approach is applied to pretrained CNNs and we show that minor additional fine-tuning allows our method to recover the original model performance.
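One simple way to prune in a spectral domain (an illustrative stand-in; the paper's method also reorders the weight tensor first, which this sketch omits) is to transform the flattened filters, zero the smallest-magnitude coefficients, and transform back:

```python
import torch

def spectral_prune(weight: torch.Tensor, keep: float = 0.5) -> torch.Tensor:
    """Zero the smallest-magnitude spectral coefficients of each filter.
    Assumes 0 < keep < 1 (fraction of coefficients retained)."""
    shape = weight.shape
    coeffs = torch.fft.fft(weight.reshape(shape[0], -1), dim=-1)
    n = coeffs.shape[-1]
    k = max(1, int(n * keep))                      # coefficients to retain
    thresh = coeffs.abs().kthvalue(n - k, dim=-1, keepdim=True).values
    coeffs = coeffs * (coeffs.abs() > thresh)      # keep only top-k per filter
    return torch.fft.ifft(coeffs, dim=-1).real.reshape(shape)

pruned = spectral_prune(torch.randn(64, 3, 3, 3), keep=0.25)
```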
arXiv Detail & Related papers (2020-10-22T23:45:34Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.