ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution
- URL: http://arxiv.org/abs/2009.02386v1
- Date: Fri, 4 Sep 2020 20:41:47 GMT
- Title: ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution
- Authors: Ze Wang, Xiuyuan Cheng, Guillermo Sapiro, Qiang Qiu
- Abstract summary: We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
- Score: 57.635467829558664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) are known to be significantly
over-parametrized, and difficult to interpret, train and adapt. In this paper,
we introduce a structural regularization across convolutional kernels in a CNN.
In our approach, each convolution kernel is first decomposed as 2D dictionary
atoms linearly combined by coefficients. The widely observed correlation and
redundancy in a CNN hint at a common low-rank structure among the decomposed
coefficients, which is here further supported by our empirical observations. We
then explicitly regularize CNN kernels by enforcing decomposed coefficients to
be shared across sub-structures, while leaving each sub-structure only its own
dictionary atoms, typically only a few hundred parameters, which leads to
dramatic model reductions. We explore models with sharing across different
sub-structures to cover a wide range of trade-offs between parameter reduction
and expressiveness. Our proposed regularized network structures open the door
to better interpreting, training and adapting deep models. We validate the
flexibility and compatibility of our method by image classification experiments
on multiple datasets and underlying network structures, and show that CNNs now
maintain performance with a dramatic reduction in parameters and computations,
e.g., only 5% of the parameters of a ResNet-18 are used to achieve comparable
performance. Further experiments on few-shot classification show that faster
and more robust task adaptation is obtained in comparison with models with
standard convolutions.
Related papers
- Isomorphic Pruning for Vision Models [56.286064975443026]
Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures.
We present Isomorphic Pruning, a simple approach that demonstrates effectiveness across a range of network architectures.
arXiv Detail & Related papers (2024-07-05T16:14:53Z)
- On the rates of convergence for learning with convolutional neural networks
We study the approximation and learning capacities of convolutional neural networks (CNNs) with one-sided zero-padding and multiple channels.
We derive convergence rates for estimators based on CNNs in many learning problems.
It is also shown that the obtained rates for classification are minimax optimal in some common settings.
arXiv Detail & Related papers (2024-03-25T06:42:02Z)
- Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network [0.36122488107441414]
Group-equivariant convolutional neural networks (G-CNNs) heavily rely on parameter sharing to increase a CNN's data efficiency and performance.
We propose a non-parameter-sharing approach for group-equivariant neural networks.
The proposed method adaptively aggregates a diverse range of filters by a weighted sum of Monte Carlo augmented decomposed filters.
arXiv Detail & Related papers (2023-05-17T10:18:02Z)
- Learning Partial Correlation based Deep Visual Representation for Image Classification [61.0532370259644]
We formulate sparse inverse covariance estimation (SICE) as a novel structured layer of a CNN.
Our work obtains a partial correlation based deep visual representation and mitigates the small sample problem.
Experiments show the efficacy and superior classification performance of our model.
arXiv Detail & Related papers (2023-04-23T10:09:01Z)
- The Power of Linear Combinations: Learning with Random Convolutions [2.0305676256390934]
Modern CNNs can achieve high test accuracies without ever updating randomly initialized (spatial) convolution filters.
These combinations of random filters can implicitly regularize the resulting operations.
Although we only observe relatively small gains from learning $3\times 3$ convolutions, the learning gains increase proportionally with kernel size.
arXiv Detail & Related papers (2023-01-26T19:17:10Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- Quantized convolutional neural networks through the lens of partial differential equations [6.88204255655161]
Quantization of Convolutional Neural Networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs.
In this work, we explore ways to improve quantized CNNs using a PDE-based perspective and analysis.
arXiv Detail & Related papers (2021-08-31T22:18:52Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)
- Learning Sparse Filters in Deep Convolutional Neural Networks with a l1/l2 Pseudo-Norm [5.3791844634527495]
Deep neural networks (DNNs) have proven to be efficient for numerous tasks, but come at a high memory and computation cost.
Recent research has shown that their structure can be more compact without compromising their performance.
We present a sparsity-inducing regularization term based on the l1/l2 pseudo-norm (the ratio of the l1 and l2 norms) defined on the filter coefficients; a minimal sketch of such a penalty appears just after this list.
arXiv Detail & Related papers (2020-07-20T11:56:12Z)
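As a companion to the last entry, here is a minimal sketch of one way an l1/l2 ratio penalty on convolution filters could be computed and added to the training loss. The function name and the per-output-filter grouping are assumptions for illustration; the exact definition in the paper may differ.
```python
import torch
import torch.nn as nn


def l1_over_l2_penalty(model: nn.Module, eps: float = 1e-8) -> torch.Tensor:
    """Sum of ||w||_1 / ||w||_2 over every conv output filter in the model;
    smaller values correspond to sparser filters (illustrative definition)."""
    penalty = torch.zeros(())
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight.flatten(start_dim=1)  # one row per output filter
            penalty = penalty + (w.abs().sum(dim=1) / (w.norm(dim=1) + eps)).sum()
    return penalty


# Usage: add the penalty, scaled by a hyper-parameter, to the task loss.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
loss = model(torch.randn(2, 3, 32, 32)).mean() + 1e-4 * l1_over_l2_penalty(model)
loss.backward()
```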