Lets keep it simple, Using simple architectures to outperform deeper and
more complex architectures
- URL: http://arxiv.org/abs/1608.06037v8
- Date: Thu, 27 Apr 2023 16:20:03 GMT
- Title: Lets keep it simple, Using simple architectures to outperform deeper and
more complex architectures
- Authors: Seyyed Hossein Hasanpour, Mohammad Rouhani, Mohsen Fayyaz, Mohammad
Sabokrou
- Abstract summary: Convolutional Neural Networks (CNNs) include tens to hundreds of millions of parameters, which impose considerable computation and memory overhead.
We propose a simple architecture called SimpleNet, based on a set of design principles, with which we empirically show that a well-crafted yet simple and reasonably deep architecture can perform on par with deeper and more complex architectures.
Our simple 13-layer architecture outperforms most of the deeper and more complex architectures to date, such as VGGNet, ResNet, and GoogleNet, on several well-known benchmarks.
- Score: 12.76864681474486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Major winning Convolutional Neural Networks (CNNs), such as AlexNet, VGGNet,
ResNet, and GoogleNet, include tens to hundreds of millions of parameters, which
impose considerable computation and memory overhead. This limits their
practicality in terms of training, optimization, and memory efficiency. On the
contrary, the lightweight architectures proposed to address this issue mainly
suffer from low accuracy. These inefficiencies mostly stem from following an ad
hoc design procedure. We propose a simple architecture, called SimpleNet, based
on a set of design principles, with which we empirically show that a
well-crafted yet simple and reasonably deep architecture can perform on par
with deeper and more complex architectures. SimpleNet provides a good tradeoff
between computation/memory efficiency and accuracy. Our simple 13-layer
architecture outperforms most of the deeper and more complex architectures to
date, such as VGGNet, ResNet, and GoogleNet, on several well-known benchmarks
while having 2 to 25 times fewer parameters and operations. This makes it well
suited for embedded systems and other settings with computational and memory
limitations. We achieved state-of-the-art results on CIFAR-10, outperforming
several heavier architectures, near-state-of-the-art results on MNIST, and
competitive results on CIFAR-100 and SVHN. We also outperformed much larger and
deeper architectures, such as VGGNet and popular ResNet variants, on the
ImageNet dataset. Models are made available at:
https://github.com/Coderx7/SimpleNet
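
For concreteness, the sketch below shows what a SimpleNet-style plain CNN looks like in PyTorch: a homogeneous stack of 13 conv-BN-ReLU blocks with a few max-pooling stages, global pooling, and a single linear classifier, with no residual or inception-style branches. The channel widths, pooling positions, and the 1x1 layers at the end are illustrative assumptions rather than the exact published configuration; the real models are in the repository linked above.

```python
# Minimal sketch of a SimpleNet-style plain 13-layer CNN (illustrative only).
# Channel widths, pooling positions, and kernel sizes are assumptions; the
# exact published models live at https://github.com/Coderx7/SimpleNet
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout, k=3):
    """A single plain block: convolution + batch norm + ReLU."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=k, padding=k // 2, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class SimpleNetSketch(nn.Module):
    """A homogeneous 13-layer CNN: conv-BN-ReLU blocks, occasional max-pooling,
    global pooling, and one linear classifier (no skip connections)."""
    def __init__(self, num_classes=10):
        super().__init__()
        widths = [64, 128, 128, 128, 128, 128, 128,
                  256, 256, 256, 512, 2048, 256]   # 13 layers (assumed widths)
        pool_after = {4, 7, 10}                    # assumed pooling positions
        layers, cin = [], 3
        for i, cout in enumerate(widths, start=1):
            k = 1 if i >= 12 else 3                # last two layers use 1x1 convs (assumption)
            layers.append(conv_bn_relu(cin, cout, k))
            if i in pool_after:
                layers.append(nn.MaxPool2d(2, 2))
            cin = cout
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveMaxPool2d(1)
        self.classifier = nn.Linear(cin, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)

# Example: a CIFAR-10-sized input.
model = SimpleNetSketch(num_classes=10)
print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```

The point of the sketch is the design style the paper argues for: keep the topology plain and uniform, and spend the depth and width budget deliberately rather than adding branches or skip connections.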
Related papers
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Simple and Efficient Architectures for Semantic Segmentation [50.1563637917129]
We show that a simple encoder-decoder architecture with a ResNet-like backbone and a small multi-scale head performs on par with or better than complex semantic segmentation architectures such as HRNet, FANet and DDRNet.
We present a family of such simple architectures for desktop as well as mobile targets, which match or exceed the performance of complex models on the Cityscapes dataset.
arXiv Detail & Related papers (2022-06-16T15:08:34Z)
- ThreshNet: An Efficient DenseNet using Threshold Mechanism to Reduce Connections [1.2542322096299672]
We propose a new network architecture that uses a threshold mechanism to further optimize the connection scheme.
Compared to DenseNet, ThreshNet achieves up to a 60% reduction in inference time, up to 35% faster training, and a 20% reduction in error rate.
arXiv Detail & Related papers (2022-01-09T13:52:16Z)
- Rethinking Architecture Selection in Differentiable NAS [74.61723678821049]
Differentiable Neural Architecture Search (DARTS) is one of the most popular NAS methods owing to its search efficiency and simplicity.
We propose an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
We find that several failure modes of DARTS can be greatly alleviated with the proposed selection method.
arXiv Detail & Related papers (2021-08-10T00:53:39Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method requires fewer samples to find top-performing architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance in the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)
- Neural Architecture Design for GPU-Efficient Networks [27.07089149328155]
We propose a general principle for designing GPU-efficient networks based on extensive empirical studies.
Based on the proposed framework, we design a family of GPU-Efficient Networks, or GENets in short.
While achieving $\geq 81.3\%$ top-1 accuracy on ImageNet, GENet is up to $6.4$ times faster than EfficientNet on GPU.
arXiv Detail & Related papers (2020-06-24T22:42:18Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
- Lightweight Residual Densely Connected Convolutional Neural Network [18.310331378001397]
Lightweight residual densely connected blocks are proposed to guarantee the deep supervision, efficient gradient flow, and feature reuse abilities of the convolutional neural network (see the sketch below this list).
The proposed method decreases the cost of the training and inference processes without requiring any special hardware or software.
arXiv Detail & Related papers (2020-01-02T17:15:32Z)
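
The feature-reuse idea referenced in the last entry can be illustrated with a minimal densely connected block, in which each layer receives the concatenation of all previous feature maps. This is a generic DenseNet-style sketch under assumed layer counts and widths, not the specific lightweight residual block proposed in that paper.

```python
# Illustrative DenseNet-style block showing feature reuse and short gradient
# paths; growth rate and layer count are assumptions, not the paper's values.
import torch
import torch.nn as nn

class DenseBlockSketch(nn.Module):
    """Each layer receives the concatenation of all previous feature maps."""
    def __init__(self, in_channels, growth_rate=16, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1, bias=False),
            ))
            channels += growth_rate
        self.out_channels = channels

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

# Example: 32 input channels grow to 32 + 4 * 16 = 96 output channels.
block = DenseBlockSketch(in_channels=32)
print(block(torch.randn(1, 32, 16, 16)).shape)  # torch.Size([1, 96, 16, 16])
```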
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.