Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets
- URL: http://arxiv.org/abs/2010.14819v2
- Date: Thu, 24 Dec 2020 09:21:04 GMT
- Title: Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets
- Authors: Kai Han, Yunhe Wang, Qiulin Zhang, Wei Zhang, Chunjing Xu, Tong Zhang
- Abstract summary: The giant formula for simultaneously enlarging the resolution, depth and width provides a Rubik's cube for neural networks.
This paper aims to explore the twisting rules for obtaining deep neural networks with minimum model sizes and computational costs.
- Score: 65.28292822614418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To obtain excellent deep neural architectures, a series of techniques are
carefully designed in EfficientNets. The giant formula for simultaneously
enlarging the resolution, depth and width provides a Rubik's cube for neural
networks, so that we can find networks with high efficiency and excellent
performance by twisting the three dimensions. This paper aims to explore the
twisting rules for obtaining deep neural networks with minimum model sizes and
computational costs. Different from network enlarging, we observe that
resolution and depth are more important than width for tiny networks.
Therefore, the original method, i.e., the compound scaling in EfficientNet, is
no longer suitable. To this end, we summarize a tiny formula for downsizing
neural architectures through a series of smaller models derived from the
EfficientNet-B0 with the FLOPs constraint. Experimental results on the ImageNet
benchmark illustrate that our TinyNet performs much better than the smaller
version of EfficientNets using the inverted giant formula. For instance, our
TinyNet-E achieves a 59.9% Top-1 accuracy with only 24M FLOPs, which is about
1.9% higher than that of the previous best MobileNetV3 with similar
computational cost. Code will be available at
https://github.com/huawei-noah/ghostnet/tree/master/tinynet_pytorch, and
https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/tinynet.
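Since the FLOPs of a convolutional network scale roughly with depth x width^2 x resolution^2, the downsizing idea in the abstract can be illustrated with a few lines of arithmetic. The sketch below is only an illustration of that bookkeeping, assuming placeholder multipliers rather than the fitted coefficients reported in the paper.

```python
# Illustrative sketch (plain Python) of the scaling bookkeeping behind
# EfficientNet-style formulas: FLOPs ~ depth * width^2 * resolution^2,
# so a target FLOPs ratio g (< 1 for shrinking) constrains the three factors.
import math

def compound_scaling(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """EfficientNet's compound scaling: enlarge depth/width/resolution
    by alpha^phi, beta^phi, gamma^phi (with alpha * beta^2 * gamma^2 ~= 2)."""
    return alpha ** phi, beta ** phi, gamma ** phi

def tiny_width(g, depth_mult, res_mult):
    """Given a FLOPs ratio g and chosen depth/resolution multipliers
    (the paper fits these from a series of shrunken EfficientNet-B0 models;
    the values passed here are placeholders), solve for the width multiplier
    from FLOPs ~ d * w^2 * r^2."""
    return math.sqrt(g / (depth_mult * res_mult ** 2))

# Example: shrink EfficientNet-B0 to ~25% of its FLOPs while keeping
# resolution and depth relatively high, as the paper's observation suggests.
g = 0.25
d, r = 0.8, 0.7          # hypothetical multipliers, not the paper's fitted ones
w = tiny_width(g, d, r)
print(f"depth x{d}, resolution x{r}, width x{w:.3f}")
```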
Related papers
- MogaNet: Multi-order Gated Aggregation Network [64.16774341908365]
We propose a new family of modern ConvNets, dubbed MogaNet, for discriminative visual representation learning.
MogaNet encapsulates conceptually simple yet effective convolutions and gated aggregation into a compact module.
MogaNet exhibits great scalability, impressive parameter efficiency, and competitive performance compared to state-of-the-art ViTs and ConvNets on ImageNet.
arXiv Detail & Related papers (2022-11-07T04:31:17Z)
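As a rough illustration of the gated aggregation mentioned above, the sketch below modulates a depthwise context convolution with a learned gate. It is a generic toy module, not MogaNet's published block, and the layer choices (kernel size, sigmoid gate) are assumptions.

```python
# Minimal sketch of a gated-aggregation style block: a depthwise context
# convolution whose output is modulated by a learned gate branch.
import torch
import torch.nn as nn

class GatedAggregation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)
        self.context = nn.Conv2d(channels, channels, kernel_size=5,
                                 padding=2, groups=channels)  # depthwise context
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # gate branch modulates the aggregated spatial context
        return self.proj(torch.sigmoid(self.gate(x)) * self.context(x))

x = torch.randn(1, 32, 56, 56)
print(GatedAggregation(32)(x).shape)  # torch.Size([1, 32, 56, 56])
```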
- Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs [148.0476219278875]
We revisit large kernel design in modern convolutional neural networks (CNNs).
Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could be a more powerful paradigm.
We propose RepLKNet, a pure CNN architecture whose kernel size is as large as 31x31, in contrast to commonly used 3x3.
arXiv Detail & Related papers (2022-03-13T17:22:44Z)
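To make the large-kernel idea concrete, here is a minimal sketch of a 31x31 depthwise convolution with a parallel small-kernel branch. It shows only the training-time structure; RepLKNet additionally re-parameterizes the small branch into the large kernel at inference, which is omitted here.

```python
# Sketch of a large depthwise kernel (31x31) plus a parallel small depthwise
# branch, in the spirit of the summary above.  Toy structure only.
import torch
import torch.nn as nn

class LargeKernelDW(nn.Module):
    def __init__(self, channels, big=31, small=5):
        super().__init__()
        self.big = nn.Conv2d(channels, channels, big, padding=big // 2,
                             groups=channels)    # depthwise 31x31
        self.small = nn.Conv2d(channels, channels, small, padding=small // 2,
                               groups=channels)  # parallel depthwise 5x5

    def forward(self, x):
        return self.big(x) + self.small(x)

x = torch.randn(1, 16, 64, 64)
print(LargeKernelDW(16)(x).shape)  # torch.Size([1, 16, 64, 64])
```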
- Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks [1.0814638303152528]
Deep neural networks (DNNs) are so over-parametrized that recent research has found them to contain subnetworks that already achieve high accuracy.
This paper proposes blending these lines of research into a highly compressed yet accurate model: Hidden-Fold Networks (HFNs).
It achieves equivalent performance to ResNet50 on CIFAR100 while occupying 38.5x less memory, and similar performance to ResNet34 on ImageNet with a memory size 26.8x smaller.
arXiv Detail & Related papers (2021-11-24T08:24:31Z)
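The "accurate subnetwork inside a random network" ingredient can be sketched as a layer whose weights stay frozen at random initialization while a binary supermask over them is learned with straight-through gradients. This is an illustrative toy assuming a simple top-k masking rule; HFN additionally folds (recurs) residual blocks, which is not shown.

```python
# Toy supermask layer: frozen random weights, learnable mask scores.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, in_f, out_f, sparsity=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f), requires_grad=False)
        self.scores = nn.Parameter(torch.randn(out_f, in_f))  # learnable mask scores
        self.sparsity = sparsity

    def forward(self, x):
        k = int(self.scores.numel() * (1 - self.sparsity))     # weights to keep
        threshold = self.scores.flatten().kthvalue(self.scores.numel() - k + 1).values
        hard = (self.scores >= threshold).float()
        # straight-through: hard mask in the forward pass, gradient flows to scores
        mask = hard + self.scores - self.scores.detach()
        return x @ (self.weight * mask).t()

print(MaskedLinear(8, 4)(torch.randn(2, 8)).shape)  # torch.Size([2, 4])
```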
- Network Augmentation for Tiny Deep Learning [73.57192520534585]
We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.
We demonstrate the effectiveness of NetAug on image classification and object detection.
arXiv Detail & Related papers (2021-10-17T18:48:41Z)
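A hedged sketch of the training idea: the tiny model's weights are shared as a slice of a wider "augmented" model, and each step adds a down-weighted auxiliary loss from the augmented forward pass so the shared weights receive extra supervision. The layer sizes and the 0.1 weighting below are illustrative assumptions, not the paper's settings.

```python
# Toy weight-sharing setup: the tiny model reuses a slice of a wider model.
import torch
import torch.nn as nn
import torch.nn.functional as F

wide = nn.Linear(16, 10)               # augmented (wider-input) model
tiny_in = 8                            # the tiny model uses only the first 8 inputs

def step(x, y, alpha=0.1):
    # tiny forward: reuse a slice of the wide model's weights
    tiny_out = F.linear(x[:, :tiny_in], wide.weight[:, :tiny_in], wide.bias)
    aug_out = wide(x)                  # augmented forward uses all weights
    return F.cross_entropy(tiny_out, y) + alpha * F.cross_entropy(aug_out, y)

x, y = torch.randn(4, 16), torch.randint(0, 10, (4,))
step(x, y).backward()                  # gradients reach the shared weight slice
```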
- ThresholdNet: Pruning Tool for Densely Connected Convolutional Networks [2.267411144256508]
We introduce a new type of pruning tool, the threshold, which is inspired by the principle of the threshold voltage in memory.
This work employs this method to connect blocks of different depths in different ways to reduce memory usage.
Experiments show that HarDNet is twice as fast as DenseNet, and on this basis, ThresholdNet is 10% faster and has a 10% lower error rate than HarDNet.
arXiv Detail & Related papers (2021-08-28T08:48:31Z)
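The summary above is terse, so the sketch below shows only generic threshold pruning: weights whose magnitude falls below a chosen threshold are zeroed to save memory. It does not reproduce the paper's specific scheme for connecting blocks of different depths.

```python
# Generic magnitude-threshold pruning, for illustration only.
import torch
import torch.nn as nn

def threshold_prune(module, threshold=1e-2):
    """Zero all parameters with |w| < threshold, in place."""
    with torch.no_grad():
        for p in module.parameters():
            p.mul_((p.abs() >= threshold).float())

layer = nn.Conv2d(16, 16, 3, padding=1)
threshold_prune(layer, threshold=0.02)
sparsity = (layer.weight == 0).float().mean().item()
print(f"fraction of pruned weights: {sparsity:.2f}")
```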
- DRU-net: An Efficient Deep Convolutional Neural Network for Medical Image Segmentation [2.3574651879602215]
Residual network (ResNet) and densely connected network (DenseNet) have significantly improved the training efficiency and performance of deep convolutional neural networks (DCNNs).
We propose an efficient network architecture that combines the advantages of both networks.
arXiv Detail & Related papers (2020-04-28T12:16:24Z)
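A minimal sketch of combining the two connection styles mentioned above: a residual (additive) path as in ResNet together with a dense (concatenative) path as in DenseNet. This illustrates the idea only, not DRU-net's published architecture.

```python
# Toy block with both a residual add and a dense concatenation.
import torch
import torch.nn as nn

class ResDenseBlock(nn.Module):
    def __init__(self, channels, growth):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, growth, 3, padding=1),
        )
        self.match = nn.Conv2d(channels, growth, 1)  # align channels for the residual add

    def forward(self, x):
        out = self.conv(x) + self.match(x)            # residual connection
        return torch.cat([x, out], dim=1)             # dense connection

x = torch.randn(1, 32, 48, 48)
print(ResDenseBlock(32, 16)(x).shape)  # torch.Size([1, 48, 48, 48])
```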
- FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions [70.59851564292828]
Differentiable Neural Architecture Search (DNAS) has demonstrated great success in designing state-of-the-art, efficient neural networks.
We propose a memory and computationally efficient DNAS variant: DMaskingNAS.
This algorithm expands the search space by up to $10^{14}\times$ over conventional DNAS.
arXiv Detail & Related papers (2020-04-12T08:52:15Z)
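A hedged sketch of channel masking in the DMaskingNAS spirit: a convolution is instantiated with the maximum channel count, and candidate channel counts are represented by binary masks mixed with softmax weights over architecture parameters, so the search adds little memory. The candidate counts and layer shapes below are assumptions.

```python
# Toy channel-masking search primitive: one conv, several masked channel counts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedChannelConv(nn.Module):
    def __init__(self, in_ch, max_out, candidates=(8, 16, 24, 32)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, max_out, 3, padding=1)
        masks = torch.zeros(len(candidates), max_out)
        for i, c in enumerate(candidates):
            masks[i, :c] = 1.0                      # keep the first c channels
        self.register_buffer("masks", masks)
        self.alpha = nn.Parameter(torch.zeros(len(candidates)))  # architecture params

    def forward(self, x):
        mix = (F.softmax(self.alpha, dim=0)[:, None] * self.masks).sum(0)
        return self.conv(x) * mix.view(1, -1, 1, 1)

x = torch.randn(1, 16, 32, 32)
print(MaskedChannelConv(16, 32)(x).shape)  # torch.Size([1, 32, 32, 32])
```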
- MeliusNet: Can Binary Neural Networks Achieve MobileNet-level Accuracy? [12.050205584630922]
Binary Neural Networks (BNNs) are neural networks which use binary weights and activations instead of the typical 32-bit floating point values.
In this paper, we present an architectural approach: MeliusNet. It alternates a DenseBlock, which increases the feature capacity, with our proposed ImprovementBlock, which increases the feature quality.
arXiv Detail & Related papers (2020-01-16T16:56:10Z)
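For context, the binarization primitive that BNNs such as MeliusNet build on can be sketched as a sign() quantizer with a straight-through (clipped) gradient. This shows only that generic primitive, not MeliusNet's DenseBlock/ImprovementBlock pairing.

```python
# Generic weight/activation binarization with a straight-through estimator.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)          # values quantized to {-1, 0, +1}

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # straight-through estimator, clipped to |x| <= 1
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()
print(x.grad)
```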
- AdderNet: Do We Really Need Multiplications in Deep Learning? [159.174891462064]
We present adder networks (AdderNets) to trade massive multiplications in deep neural networks for much cheaper additions to reduce computation costs.
We develop a special back-propagation approach for AdderNets by investigating the full-precision gradient.
As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset.
arXiv Detail & Related papers (2019-12-31T06:56:47Z)
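The addition-only similarity can be sketched directly: each output is the negative L1 distance between an input patch and a filter, so only additions, subtractions and absolute values are needed. The unfold-based implementation below is a naive illustration and omits the paper's special full-precision gradient.

```python
# Naive "adder convolution": negative L1 distance between patches and filters.
import torch
import torch.nn.functional as F

def adder_conv2d(x, weight, stride=1, padding=0):
    n, c, h, w = x.shape
    out_ch, _, k, _ = weight.shape
    patches = F.unfold(x, k, stride=stride, padding=padding)   # (n, c*k*k, L)
    flt = weight.view(out_ch, -1)                              # (out_ch, c*k*k)
    # negative L1 distance between every patch and every filter
    dist = (patches.unsqueeze(1) - flt.view(1, out_ch, -1, 1)).abs().sum(dim=2)
    oh = (h + 2 * padding - k) // stride + 1
    ow = (w + 2 * padding - k) // stride + 1
    return -dist.view(n, out_ch, oh, ow)

x = torch.randn(1, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
print(adder_conv2d(x, w, padding=1).shape)  # torch.Size([1, 4, 8, 8])
```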
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.