Drastically Reducing the Number of Trainable Parameters in Deep CNNs by
Inter-layer Kernel-sharing
- URL: http://arxiv.org/abs/2210.14151v1
- Date: Sun, 23 Oct 2022 18:14:30 GMT
- Title: Drastically Reducing the Number of Trainable Parameters in Deep CNNs by
Inter-layer Kernel-sharing
- Authors: Alireza Azadbakht, Saeed Reza Kheradpisheh, Ismail Khalfaoui-Hassani,
Timothée Masquelier
- Abstract summary: Deep convolutional neural networks (DCNNs) have become the state-of-the-art (SOTA) approach for many computer vision tasks.
Here, we suggest a simple way to reduce the number of trainable parameters and thus the memory footprint: sharing kernels between multiple convolutional layers.
- Score: 0.4129225533930965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep convolutional neural networks (DCNNs) have become the state-of-the-art
(SOTA) approach for many computer vision tasks: image classification, object
detection, semantic segmentation, etc. However, most SOTA networks are too
large for edge computing. Here, we suggest a simple way to reduce the number of
trainable parameters and thus the memory footprint: sharing kernels between
multiple convolutional layers. Kernel-sharing is only possible between
"isomorphic" layers, i.e. layers having the same kernel size and the same
numbers of input and output channels. This is typically the case inside each
stage of a DCNN. Our
experiments on CIFAR-10 and CIFAR-100, using the ConvMixer and SE-ResNet
architectures, show that the number of parameters of these models can be
drastically reduced with minimal cost in accuracy. The resulting networks
are appealing for certain edge computing applications that are subject to
severe memory constraints, and even more interesting if leveraging "frozen
weights" hardware accelerators. Kernel-sharing is also an efficient
regularization method, which can reduce overfitting. The code is publicly
available at https://github.com/AlirezaAzadbakht/kernel-sharing.
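To make the idea concrete, here is a minimal PyTorch sketch of a stage whose blocks all reuse a single convolution. The residual structure, GELU activation, and per-block BatchNorm are illustrative assumptions rather than the authors' exact configuration; see the linked repository for the official implementation.

```python
# Minimal sketch of inter-layer kernel-sharing (not the authors' code; block
# structure and hyperparameters are assumptions made for illustration).
import torch
import torch.nn as nn

class SharedKernelStage(nn.Module):
    """A stage of `depth` residual blocks that all reuse ONE conv kernel."""
    def __init__(self, channels: int, kernel_size: int, depth: int):
        super().__init__()
        # Single trainable convolution shared by every block in the stage.
        self.shared_conv = nn.Conv2d(channels, channels, kernel_size,
                                     padding=kernel_size // 2, bias=False)
        # Normalization is kept per-block so each block can still adapt.
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(depth))
        self.act = nn.GELU()

    def forward(self, x):
        for norm in self.norms:
            x = x + self.act(norm(self.shared_conv(x)))  # same weights in every block
        return x

stage = SharedKernelStage(channels=256, kernel_size=3, depth=8)
x = torch.randn(1, 256, 32, 32)
print(stage(x).shape)  # torch.Size([1, 256, 32, 32])
# The shared conv parameters are counted once, regardless of depth:
print(sum(p.numel() for p in stage.parameters() if p.requires_grad))
```

Because the blocks are "isomorphic" in the sense above, the convolution's parameter count is paid once per stage instead of once per layer, while per-block normalization keeps some layer-specific capacity.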
Related papers
- Convolutional Deep Kernel Machines [25.958907308877148]
Recent work modified the Neural Network Gaussian Process (NNGP) limit of Bayesian neural networks so that representation learning is retained.
Applying this modified limit to a deep Gaussian process gives a practical learning algorithm, which they dubbed the deep kernel machine (DKM).
arXiv Detail & Related papers (2023-09-18T14:36:17Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
- EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce a split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
- Hyper-Convolutions via Implicit Kernels for Medical Imaging [18.98078260974008]
We present the hyper-convolution, a novel building block that implicitly encodes the convolutional kernel using spatial coordinates (see the illustrative sketch after this list).
We demonstrate in our experiments that replacing regular convolutions with hyper-convolutions can improve performance with fewer parameters, and increase robustness against noise.
arXiv Detail & Related papers (2022-02-06T03:56:19Z)
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality.
A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through gradient descent.
We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z)
- Fast and High-Quality Image Denoising via Malleable Convolutions [72.18723834537494]
We present Malleable Convolution (MalleConv), as an efficient variant of dynamic convolution.
Unlike previous works, MalleConv generates a much smaller set of spatially-varying kernels from input.
We also build an efficient denoising network using MalleConv, coined as MalleNet.
arXiv Detail & Related papers (2022-01-02T18:35:20Z)
- FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes [34.90912459206022]
Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice.
We propose FlexConv, a novel convolutional operation with which high bandwidth convolutional kernels of learnable kernel size can be learned at a fixed parameter cost.
arXiv Detail & Related papers (2021-10-15T12:35:49Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace conventional ReLU with Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Cross-filter compression for CNN inference acceleration [4.324080238456531]
We propose a new cross-filter compression method that can provide ~32x memory savings and a 122x speedup in convolution operations.
Our method, based on Binary-Weight and XNOR-Net separately, is evaluated on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-05-18T19:06:14Z)
- Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T) [17.13246260883765]
Deep neural networks (DNNs) have shown remarkable success in a variety of machine learning applications.
In recent years, there is an increasing interest in deploying DNNs to resource-constrained devices with limited energy, memory, and computational budget.
We propose Entropy-Constrained Trained Ternarization (EC2T), a general framework to create sparse and ternary neural networks.
arXiv Detail & Related papers (2020-04-02T15:38:00Z)
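For the hyper-convolution entry above, here is a rough, self-contained PyTorch sketch of the general idea of generating kernel weights from spatial coordinates with a small hypernetwork. The MLP size, activation, and coordinate normalization are illustrative assumptions, not the cited paper's exact design.

```python
# Rough sketch of a "hyper-convolution": kernel weights are produced from the
# spatial coordinates of each kernel tap by a tiny MLP (a hypernetwork).
# Architecture details here are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperConv2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int, hidden: int = 32):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, kernel_size
        # Hypernetwork: (dy, dx) offset of a kernel tap -> one weight per (out, in) pair.
        self.hyper = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, out_ch * in_ch),
        )
        # Precompute normalized tap coordinates in [-1, 1], shape (k*k, 2).
        r = torch.linspace(-1.0, 1.0, kernel_size)
        yy, xx = torch.meshgrid(r, r, indexing="ij")
        self.register_buffer("coords", torch.stack([yy, xx], dim=-1).reshape(-1, 2))

    def forward(self, x):
        w = self.hyper(self.coords)                      # (k*k, out_ch*in_ch)
        w = w.view(self.k, self.k, self.out_ch, self.in_ch)
        w = w.permute(2, 3, 0, 1).contiguous()           # (out_ch, in_ch, k, k)
        return F.conv2d(x, w, padding=self.k // 2)

conv = HyperConv2d(in_ch=16, out_ch=32, kernel_size=7)
print(conv(torch.randn(1, 16, 64, 64)).shape)            # torch.Size([1, 32, 64, 64])
# The trainable parameter count depends on the MLP, not on kernel_size,
# so large kernels stay cheap.
```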
This list is automatically generated from the titles and abstracts of the papers on this site.