Rescaling CNN through Learnable Repetition of Network Parameters
- URL: http://arxiv.org/abs/2101.05650v1
- Date: Thu, 14 Jan 2021 15:03:25 GMT
- Title: Rescaling CNN through Learnable Repetition of Network Parameters
- Authors: Arnav Chavan, Udbhav Bamba, Rishabh Tiwari, Deepak Gupta
- Abstract summary: We present a novel rescaling strategy for CNNs based on learnable repetition of their parameters.
We show that small base networks, when rescaled, can provide performance comparable to deeper networks with as few as 6% of the optimization parameters of the deeper one.
- Score: 2.137666194897132
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deeper and wider CNNs are known to provide improved performance for deep
learning tasks. However, most such networks offer poor performance gain per added
parameter. In this paper, we investigate whether the gain observed in deeper models
is purely due to the addition of more optimization parameters or whether the
physical size of the network also plays a role. Further, we present a novel
rescaling strategy for CNNs based on learnable repetition of their parameters.
Based on this strategy, we rescale CNNs without changing their parameter count,
and show that learnable sharing of weights by itself can provide a significant
boost in the performance of any given model. We show that small base networks,
when rescaled, can provide performance comparable to deeper networks with as few
as 6% of the optimization parameters of the deeper one.
The relevance of weight sharing is further highlighted through the example of
group-equivariant CNNs. We show that the significant improvements obtained with
group-equivariant CNNs over regular CNNs on classification problems are only
partly due to the added equivariance property; part comes from the learnable
repetition of network weights. For the rot-MNIST dataset, we show that up to 40%
of the relative gain reported by state-of-the-art methods for rotation
equivariance could actually be due to just the learnt repetition of weights.
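To make the idea concrete, here is a minimal PyTorch sketch of depth rescaling via learnable repetition: one convolution's weights are reused several times, and each repetition owns only a learnable scalar and its own normalization, so depth grows while the convolutional parameter count stays fixed. The module name, the per-repetition scale, and the placement of BatchNorm are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RepeatedConvBlock(nn.Module):
    """Rescale depth by reusing one convolution's weights several times.

    Each repetition shares the same kernel but has its own learnable
    scalar, so extra depth costs almost no extra parameters.
    (Illustrative sketch only, not the authors' implementation.)
    """

    def __init__(self, channels: int, repeats: int = 3):
        super().__init__()
        # A single set of shared convolution parameters.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # One learnable scale per repetition -- the "learnable" part of
        # the repetition (a handful of scalars, not a new conv layer).
        self.scales = nn.Parameter(torch.ones(repeats))
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(repeats))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for scale, norm in zip(self.scales, self.norms):
            x = self.act(norm(scale * self.conv(x)))  # same weights, new scale
        return x
```

With repeats=3, the block adds three scalars and the BatchNorm parameters instead of two additional 3x3 convolutions, which is where the large parameter savings come from.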
Related papers
- Improving the Accuracy and Robustness of CNNs Using a Deep CCA Neural Data Regularizer [2.026424957803652]
As convolutional neural networks (CNNs) become more accurate at object recognition, their representations become more similar to those of the primate visual system.
Previous attempts to use neural data to regularize CNNs showed very modest gains in accuracy, owing in part to limitations of the regularization method.
We develop a new neural data regularizer for CNNs that uses Deep Canonical Correlation Analysis (DCCA) to optimize the resemblance of the CNN's image representations to those of the monkey visual cortex.
arXiv Detail & Related papers (2022-09-06T15:40:39Z) - Fusion of CNNs and statistical indicators to improve image
classification [65.51757376525798]
Convolutional networks have dominated the field of computer vision for the last ten years.
The main strategy for prolonging this trend relies on further scaling up networks in size.
We hypothesise that adding heterogeneous sources of information may be more cost-effective for a CNN than building a bigger network.
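As a rough illustration of that hypothesis, the sketch below concatenates cheap per-image statistics with CNN features before the classifier. The FusionHead name and the choice of mean/std as "statistical indicators" are my assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Fuse CNN features with hand-crafted per-image statistics (sketch)."""

    def __init__(self, cnn_dim: int, n_stats: int = 2, n_classes: int = 10):
        super().__init__()
        self.fc = nn.Linear(cnn_dim + n_stats, n_classes)

    def forward(self, cnn_feat: torch.Tensor, images: torch.Tensor) -> torch.Tensor:
        # Cheap statistical indicators: per-image pixel mean and std.
        stats = torch.stack([images.mean(dim=(1, 2, 3)),
                             images.std(dim=(1, 2, 3))], dim=1)
        return self.fc(torch.cat([cnn_feat, stats], dim=1))

head = FusionHead(cnn_dim=128)
logits = head(torch.randn(4, 128), torch.randn(4, 3, 32, 32))
```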
arXiv Detail & Related papers (2020-12-20T23:24:31Z) - Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training [24.586453683904487]
We argue that increasing the number of networks (an ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width.
An ensemble of small networks can outperform a single large one with few or no extra parameters or FLOPs.
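A minimal sketch of the trade-off (the toy architecture is assumed for illustration): because the parameter count grows roughly quadratically with width, two half-width networks together use about half the parameters of one full-width network, and their averaged logits form the ensemble prediction.

```python
import torch
import torch.nn as nn

def make_net(width: int) -> nn.Sequential:
    """Toy classifier; parameter count grows ~quadratically with width."""
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(width, 10),
    )

def count(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

wide = make_net(128)                          # one full-width network
ensemble = [make_net(64) for _ in range(2)]   # two half-width networks
print(count(wide), sum(count(m) for m in ensemble))  # ~152k vs ~79k

# Ensemble prediction: average the logits of the small networks.
x = torch.randn(4, 3, 32, 32)
logits = torch.stack([m(x) for m in ensemble]).mean(dim=0)
```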
arXiv Detail & Related papers (2020-11-30T10:03:34Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
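The quadratic growth follows directly from the standard parameter count of a k x k convolution, as this small check illustrates:

```python
# Parameters of a standard k x k convolution: k*k*c_in*c_out + c_out biases.
# With c_in = c_out = c, the count is quadratic in the channel count c.
def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    return k * k * c_in * c_out + c_out

for c in (64, 128, 256):
    print(c, conv_params(c, c))  # 36,928 -> 147,584 -> 590,080: ~4x per doubling
```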
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
With this regularization, CNNs maintain performance with a dramatic reduction in parameters and computation.
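A rough sketch of kernel sharing in this spirit (not the paper's exact ACDC formulation): each layer builds its kernels as linear combinations of a shared bank of k x k "atoms", so only small coefficient tensors are layer-specific.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomCoefficientConv(nn.Module):
    """Convolution whose kernels are linear combinations of shared "atoms".

    The (n_atoms, k, k) atom bank is shared across layers; each layer only
    owns a small coefficient tensor. Sketch of the general idea, not the
    paper's exact ACDC formulation.
    """

    def __init__(self, atoms: nn.Parameter, c_in: int, c_out: int):
        super().__init__()
        self.atoms = atoms  # shared parameter, reused by several layers
        self.coeff = nn.Parameter(0.1 * torch.randn(c_out, c_in, atoms.shape[0]))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, k, _ = self.atoms.shape
        # (c_out, c_in, n) @ (n, k*k) -> per-layer kernels built on the fly.
        weight = (self.coeff @ self.atoms.reshape(n, k * k)).reshape(
            *self.coeff.shape[:2], k, k)
        return F.conv2d(x, weight, padding=k // 2)

shared_atoms = nn.Parameter(0.1 * torch.randn(8, 3, 3))  # one bank for the net
layer1 = AtomCoefficientConv(shared_atoms, 16, 16)
layer2 = AtomCoefficientConv(shared_atoms, 16, 16)
out = layer2(layer1(torch.randn(1, 16, 8, 8)))  # both layers reuse the atoms
```

When optimizing, the shared atom bank should be passed to the optimizer only once, since it appears in every layer's parameter list.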
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement [53.47564132861866]
We find that a hybrid architecture, namely CNN-TT, is capable of maintaining good performance with a reduced model parameter size.
CNN-TT is composed of several convolutional layers at the bottom for feature extraction to improve speech quality.
arXiv Detail & Related papers (2020-07-25T22:21:05Z)
- What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective [102.37204254403038]
We attempt to understand what deep CNNs gain from global covariance pooling (GCP) from an optimization viewpoint.
We show that GCP can make the optimization landscape smoother and the gradients more predictive.
We conduct extensive experiments using various deep CNN models on diversified tasks, and the results provide strong support for our findings.
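For reference, GCP replaces global average pooling with the channel covariance of the final convolutional features; a minimal sketch is below. Published GCP variants additionally normalize the covariance (e.g., with a matrix square root), which is omitted here.

```python
import torch

def global_covariance_pool(feat: torch.Tensor) -> torch.Tensor:
    """Global covariance pooling (GCP) sketch: second-order statistics of
    the final conv features instead of a global average.

    feat: (batch, channels, H, W) -> flattened (channels x channels) covariance.
    """
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)
    x = x - x.mean(dim=2, keepdim=True)          # center each channel
    cov = x @ x.transpose(1, 2) / (h * w - 1)    # (b, c, c) channel covariance
    return cov.flatten(1)                        # feed this to the classifier

pooled = global_covariance_pool(torch.randn(2, 64, 7, 7))
print(pooled.shape)  # torch.Size([2, 4096])
```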
arXiv Detail & Related papers (2020-03-25T07:00:45Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing (low-pass) filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
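A minimal sketch of the scheme under stated assumptions (the kernel size and annealing schedule are illustrative): blur the feature maps with a Gaussian low-pass filter whose strength decays over training.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(sigma: float, size: int = 5) -> torch.Tensor:
    """Normalized size x size Gaussian low-pass filter."""
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return kernel / kernel.sum()

def smooth_features(feat: torch.Tensor, sigma: float) -> torch.Tensor:
    """Blur each channel; annealing sigma toward 0 over training lets
    progressively higher-frequency information through (the curriculum)."""
    if sigma <= 0:
        return feat
    c = feat.shape[1]
    k = gaussian_kernel(sigma).to(feat).expand(c, 1, -1, -1)
    return F.conv2d(feat, k, padding=k.shape[-1] // 2, groups=c)

feats = torch.randn(2, 16, 32, 32)  # feature maps from some conv layer
for sigma in (2.0, 1.0, 0.5, 0.0):  # an illustrative annealing schedule
    feats_smoothed = smooth_features(feats, sigma)
```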
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
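For context, "minimax optimal" over the β-Hölder class refers to the classical nonparametric rate, stated here as a known general fact rather than a claim from this abstract:

```latex
% Minimax estimation rate over the \beta-H\"older ball on [0,1]^d
% (up to logarithmic factors), with n training samples:
\inf_{\hat f}\; \sup_{f \in \mathcal{H}^{\beta}}\;
\mathbb{E}\,\|\hat f - f\|_{L^2}^2 \;\asymp\; n^{-\frac{2\beta}{2\beta + d}}
```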
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.