MGIC: Multigrid-in-Channels Neural Network Architectures
- URL: http://arxiv.org/abs/2011.09128v3
- Date: Sat, 7 Aug 2021 09:39:55 GMT
- Title: MGIC: Multigrid-in-Channels Neural Network Architectures
- Authors: Moshe Eliasof, Jonathan Ephrath, Lars Ruthotto, Eran Treister
- Abstract summary: We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
- Score: 8.459177309094688
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a multigrid-in-channels (MGIC) approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). Our approach thereby addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs. Lightweight CNNs can achieve accuracy comparable to standard CNNs with fewer parameters; however, the number of weights still scales quadratically with the CNN's width. To address this, our MGIC architectures replace each CNN block with an MGIC counterpart that uses a hierarchy of nested grouped convolutions of small group size. Hence, our proposed architectures scale linearly with respect to the network's width while retaining the full channel coupling of standard CNNs. Our extensive experiments on image classification, segmentation, and point cloud classification show that applying this strategy to architectures such as ResNet and MobileNetV3 reduces the number of parameters while obtaining similar or better accuracy.
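To make the scaling argument concrete, below is a minimal PyTorch sketch (our illustration, not the authors' implementation; the helper `conv_params` and the group size of 4 are hypothetical choices) comparing the weight count of a standard convolution with that of a grouped convolution of fixed, small group size:

```python
# Sketch: why standard convolutions scale quadratically in width while
# grouped convolutions with a fixed group size scale linearly.
# nn.Conv2d stores weights of shape (out_ch, in_ch // groups, k, k).
import torch.nn as nn

def conv_params(channels: int, k: int = 3, group_size: int | None = None) -> int:
    """Count the weights of a k-by-k conv with `channels` in/out channels.

    group_size=None gives a standard, fully coupled convolution with
    channels^2 * k^2 weights; a fixed small group_size gives
    channels * group_size * k^2 weights, i.e., linear in the width.
    """
    groups = 1 if group_size is None else channels // group_size
    conv = nn.Conv2d(channels, channels, k, groups=groups, bias=False)
    return conv.weight.numel()

for c in (64, 128, 256):
    print(c, conv_params(c), conv_params(c, group_size=4))
# 64  ->  36864 vs 2304
# 128 -> 147456 vs 4608
# 256 -> 589824 vs 9216
```

Doubling the width quadruples the standard convolution's weight count but only doubles the grouped one's. A single grouped convolution, however, leaves the groups decoupled; the MGIC block recovers full channel coupling by nesting such grouped convolutions in a multigrid hierarchy over the channel dimension.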
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to the increasing number of model parameters and the increasing availability of large amounts of training data, parallelization strategies are necessary to train complex CNNs efficiently.
arXiv Detail & Related papers (2024-08-26T17:35:01Z)
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- A heterogeneous group CNN for image super-resolution [127.2132400582117]
Convolutional neural networks (CNNs) have obtained remarkable performance via deep architectures.
We present a heterogeneous group SR CNN (HGSRCNN) that leverages structural information of different types to obtain a high-quality image.
arXiv Detail & Related papers (2022-09-26T04:14:59Z)
- AutoDiCE: Fully Automated Distributed CNN Inference at the Edge [0.9883261192383613]
We propose a novel framework, called AutoDiCE, for automated splitting of a CNN model into a set of sub-models.
Our experimental results show that AutoDiCE can deliver distributed CNN inference with reduced energy consumption and memory usage per edge device.
arXiv Detail & Related papers (2022-07-20T15:08:52Z)
- Exploiting Hybrid Models of Tensor-Train Networks for Spoken Command Recognition [9.262289183808035]
This work aims to design a low complexity spoken command recognition (SCR) system.
We exploit a deep hybrid tensor-train (TT) network architecture to build an end-to-end SCR pipeline.
Our proposed CNN+(TT-DNN) model attains a competitive accuracy of 96.31% with four times fewer model parameters than the CNN model.
arXiv Detail & Related papers (2022-01-11T05:57:38Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Rescaling CNN through Learnable Repetition of Network Parameters [2.137666194897132]
We present a novel rescaling strategy for CNNs based on learnable repetition of its parameters.
We show that small base networks, when rescaled, can provide performance comparable to deeper networks with as little as 6% of the deeper network's optimization parameters.
arXiv Detail & Related papers (2021-01-14T15:03:25Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that such CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement [53.47564132861866]
We find that a hybrid architecture, namely CNN-TT, is capable of maintaining good performance with a reduced model parameter size.
CNN-TT is composed of several convolutional layers at the bottom for feature extraction to improve speech quality.
arXiv Detail & Related papers (2020-07-25T22:21:05Z)
- Multigrid-in-Channels Architectures for Wide Convolutional Neural Networks [6.929025509877642]
We present a multigrid approach that combats the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our examples from supervised image classification show that applying this strategy to residual networks and MobileNetV2 considerably reduces the number of parameters without negatively affecting accuracy.
arXiv Detail & Related papers (2020-06-11T20:28:36Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax-optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)