Multigrid-in-Channels Architectures for Wide Convolutional Neural
Networks
- URL: http://arxiv.org/abs/2006.06799v2
- Date: Thu, 19 Nov 2020 18:30:01 GMT
- Title: Multigrid-in-Channels Architectures for Wide Convolutional Neural
Networks
- Authors: Jonathan Ephrath, Lars Ruthotto, Eran Treister
- Abstract summary: We present a multigrid approach that combats the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our examples from supervised image classification show that applying this strategy to residual networks and MobileNetV2 considerably reduces the number of parameters without negatively affecting accuracy.
- Score: 6.929025509877642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a multigrid approach that combats the quadratic growth of the
number of parameters with respect to the number of channels in standard
convolutional neural networks (CNNs). It has been shown that there is a
redundancy in standard CNNs, as networks with much sparser convolution
operators can yield similar performance to full networks. The sparsity patterns
that lead to such behavior, however, are typically random, hampering hardware
efficiency. In this work, we present a multigrid-in-channels approach for
building CNN architectures that achieves full coupling of the channels, and
whose number of parameters is linearly proportional to the width of the
network. To this end, we replace each convolution layer in a generic CNN with a
multilevel layer consisting of structured (i.e., grouped) convolutions. Our
examples from supervised image classification show that applying this strategy
to residual networks and MobileNetV2 considerably reduces the number of
parameters without negatively affecting accuracy. Therefore, we can widen
networks without dramatically increasing the number of parameters or
operations.
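To make the scaling concrete, the following is a minimal PyTorch sketch (an illustration of the general idea, not the authors' exact multilevel layer): with a fixed group size, a grouped convolution's parameter count grows linearly in the channel count, whereas a standard convolution grows quadratically. The paper's multigrid hierarchy is what restores full coupling across the groups, which a single grouped convolution alone does not provide.

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

c = 256  # network width (number of channels)

# Standard convolution: 3*3*c*c weights, quadratic in c.
full = nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False)

# Grouped convolution with fixed group size 8: 3*3*c*8 weights, linear in c.
# On its own it leaves the groups decoupled; a multilevel (multigrid)
# hierarchy of such grouped convolutions is what recouples all channels.
grouped = nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False, groups=c // 8)

print(n_params(full))     # 589824
print(n_params(grouped))  # 18432
```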
Related papers
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
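As a rough illustration (a hypothetical first-order sketch, not the paper's exact metric), a per-channel importance score can be estimated from the squared gradient of the loss with respect to a channel-wise gate:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
gate = nn.Parameter(torch.ones(16))          # multiplicative mask, one gate per channel

x = torch.randn(8, 3, 32, 32)
loss = (conv(x) * gate.view(1, -1, 1, 1)).pow(2).mean()  # stand-in loss
loss.backward()

importance = gate.grad.pow(2)                # squared gradient as a Fisher proxy
prune_first = importance.argsort()[:4]       # lowest-importance channels to remove
print(prune_first)
```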
arXiv Detail & Related papers (2021-08-02T08:21:44Z)
- Adversarial Examples in Multi-Layer Random ReLU Networks [39.797621513256026]
Adversarial examples arise in ReLU networks with independent Gaussian parameters.
Bottleneck layers in the network play a key role: the minimal width up to some point determines scales and sensitivities of mappings computed up to that point.
arXiv Detail & Related papers (2021-06-23T18:16:34Z)
- Container: Context Aggregation Network [83.12004501984043]
Recent findings show that a simple MLP-based solution without any traditional convolutional or Transformer components can produce effective visual representations.
We present Container (CONText AggregatIon NEtwoRk), a general-purpose building block for multi-head context aggregation.
In contrast to Transformer-based methods that do not scale well to downstream tasks relying on larger input image resolutions, our efficient network, Container-Light, can be employed in object detection and instance segmentation networks.
arXiv Detail & Related papers (2021-06-02T18:09:11Z)
- PocketNet: A Smaller Neural Network for 3D Medical Image Segmentation [0.0]
We derive a new CNN architecture called PocketNet that achieves comparable segmentation results to conventional CNNs while using less than 3% of the number of parameters.
arXiv Detail & Related papers (2021-04-21T20:10:30Z)
- An Alternative Practice of Tropical Convolution to Traditional Convolutional Neural Networks [0.5837881923712392]
We propose a new type of CNNs called Tropical Convolutional Neural Networks (TCNNs).
TCNNs are built on tropical convolutions, in which the multiplications and additions in conventional convolutional layers are replaced by additions and min/max operations, respectively.
We show that TCNNs can achieve higher expressive power than ordinary convolutional layers on the MNIST and CIFAR10 image data sets.
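A minimal sketch of a tropical convolution, assuming the max-plus variant; maxplus_conv2d below is illustrative, not the authors' implementation. Products become sums, and the summation over the window becomes a max:

```python
import torch
import torch.nn.functional as F

def maxplus_conv2d(x, w):
    """Tropical (max-plus) 2D convolution sketch.
    x: (N, C, H, W) input, w: (O, C, k, k) weights."""
    n, c, h, wd = x.shape
    o, _, k, _ = w.shape
    patches = F.unfold(x, k)              # (N, C*k*k, L) sliding windows
    patches = patches.unsqueeze(1)        # (N, 1, C*k*k, L)
    w_flat = w.reshape(o, -1, 1)          # (O, C*k*k, 1)
    out = (patches + w_flat).amax(dim=2)  # add weights, then max over window
    return out.reshape(n, o, h - k + 1, wd - k + 1)

x = torch.randn(2, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
print(maxplus_conv2d(x, w).shape)  # torch.Size([2, 4, 6, 6])
```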
arXiv Detail & Related papers (2021-03-03T00:13:30Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input through the same fixed path, DG-Net aggregates features dynamically in each node, which gives the network greater representational ability.
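As a hedged sketch of instance-aware aggregation (the DynamicNode module below is hypothetical, not DG-Net's code), a node can weight its incoming edge features with gates predicted from the input itself:

```python
import torch
import torch.nn as nn

class DynamicNode(nn.Module):
    """Aggregate incoming edge features with instance-dependent weights."""
    def __init__(self, channels, n_inputs):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels * n_inputs, n_inputs), nn.Softmax(dim=1))

    def forward(self, inputs):                   # list of (N, C, H, W) tensors
        stacked = torch.stack(inputs, dim=1)     # (N, E, C, H, W)
        g = self.gate(torch.cat(inputs, dim=1))  # (N, E) per-instance edge weights
        return (stacked * g.view(*g.shape, 1, 1, 1)).sum(dim=1)

node = DynamicNode(channels=16, n_inputs=3)
out = node([torch.randn(2, 16, 8, 8) for _ in range(3)])
print(out.shape)  # torch.Size([2, 16, 8, 8])
```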
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
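A minimal sketch of this weight-sharing idea, assuming an atom-dictionary decomposition (the AtomCoeffConv2d name and sizes are hypothetical): each kernel is assembled as a linear combination of a small set of shared atoms, so parameters scale with the number of atoms rather than with full per-kernel weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomCoeffConv2d(nn.Module):
    """Convolution whose kernels share a small dictionary of k x k atoms."""
    def __init__(self, in_ch, out_ch, n_atoms=6, k=3):
        super().__init__()
        self.atoms = nn.Parameter(torch.randn(n_atoms, k, k))        # shared across kernels
        self.coeff = nn.Parameter(torch.randn(out_ch, in_ch, n_atoms))
        self.k = k

    def forward(self, x):
        # Assemble each (out, in) kernel as a coefficient-weighted sum of atoms.
        w = torch.einsum('oim,mkl->oikl', self.coeff, self.atoms)
        return F.conv2d(x, w, padding=self.k // 2)

layer = AtomCoeffConv2d(16, 32)
print(layer(torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 32, 8, 8])
```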
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.