ENAS4D: Efficient Multi-stage CNN Architecture Search for Dynamic
Inference
- URL: http://arxiv.org/abs/2009.09182v1
- Date: Sat, 19 Sep 2020 08:08:26 GMT
- Title: ENAS4D: Efficient Multi-stage CNN Architecture Search for Dynamic
Inference
- Authors: Zhihang Yuan, Xin Liu, Bingzhe Wu, Guangyu Sun
- Abstract summary: We introduce a general framework, ENAS4D, which can efficiently search for optimal multi-stage CNN architectures.
Experiments on the ImageNet classification task demonstrate that the multi-stage CNNs searched by ENAS4D consistently outperform the state-of-the-art method for dynamic inference.
- Score: 12.15628508314535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic inference is a feasible way to reduce the computational
cost of convolutional neural networks (CNNs) by dynamically adjusting the
computation for each input sample. One way to achieve dynamic inference is to
use a multi-stage neural network, which contains a sub-network with a
prediction layer at each stage. The inference of an input sample can exit from
an early stage if the prediction of that stage is confident enough. However,
designing a multi-stage CNN architecture is a non-trivial task. In this paper,
we introduce a general framework, ENAS4D, which can efficiently search for the
optimal multi-stage CNN architecture for dynamic inference in a well-designed
search space. First, we propose a method to construct the search space with
multi-stage convolution. The search space includes different numbers of layers,
different kernel sizes and different numbers of channels for each stage, as
well as the resolution of the input samples. Then, we train a once-for-all
network that supports sampling diverse multi-stage CNN architectures. A
specialized multi-stage network can be obtained from the once-for-all network
without additional training. Finally, we devise a method to efficiently search
for the optimal multi-stage network that trades accuracy off against
computational cost, taking advantage of the once-for-all network. The
experiments on the ImageNet classification task demonstrate that the
multi-stage CNNs searched by ENAS4D consistently outperform the
state-of-the-art method for dynamic inference. In particular, the searched
network achieves 74.4% ImageNet top-1 accuracy under 185M average MACs.
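To make the early-exit mechanism concrete, below is a minimal PyTorch sketch (not the authors' code) of confidence-based dynamic inference in a multi-stage CNN with a prediction head at each stage. The stage depths, channel widths, and exit thresholds are illustrative placeholders, not the architecture or thresholds found by ENAS4D.

```python
# Minimal sketch of a multi-stage CNN with confidence-based early exit.
# Stage widths and thresholds are assumed values for illustration only.
import torch
import torch.nn as nn


class MultiStageCNN(nn.Module):
    """Stacked stages, each followed by its own prediction (exit) head."""

    def __init__(self, stage_channels=(16, 32, 64), num_classes=1000):
        super().__init__()
        self.stages = nn.ModuleList()
        self.exits = nn.ModuleList()
        in_ch = 3
        for out_ch in stage_channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            ))
            # Each exit head: global pooling + linear classifier.
            self.exits.append(nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(out_ch, num_classes),
            ))
            in_ch = out_ch

    def forward(self, x, thresholds=(0.8, 0.6, 0.0)):
        # Exit at the first stage whose top-1 softmax confidence clears its
        # threshold; the last threshold is 0 so every sample eventually exits.
        # For simplicity the whole batch exits together; true per-sample
        # exiting would mask out samples that have already left.
        for stage, exit_head, tau in zip(self.stages, self.exits, thresholds):
            x = stage(x)
            logits = exit_head(x)
            confidence = torch.softmax(logits, dim=1).max(dim=1).values
            if bool((confidence >= tau).all()):
                return logits
        return logits


if __name__ == "__main__":
    model = MultiStageCNN().eval()
    with torch.no_grad():
        out = model(torch.randn(1, 3, 224, 224))
    print(out.shape)  # torch.Size([1, 1000])
```

Easy samples that clear an early threshold only pay for the first stage or two, which is what drives the average MACs down.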
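The search step can be pictured as sampling per-stage depth, kernel size, and width choices plus an input resolution, and keeping the best candidate whose average MACs fit a budget. The option lists, stage count, and `evaluate_fn` below are assumptions for illustration; in ENAS4D candidates would be scored via sub-networks drawn from the once-for-all network rather than this toy proxy.

```python
# Sketch of sampling multi-stage configurations and a simple random search
# under an average-MACs budget. Option sets are assumed, not the paper's.
import random

DEPTHS = (2, 3, 4)             # layers per stage (assumed options)
KERNELS = (3, 5, 7)            # kernel sizes (assumed options)
WIDTHS = (16, 32, 64)          # channel counts (assumed options)
RESOLUTIONS = (160, 192, 224)  # input resolutions (assumed options)
NUM_STAGES = 3


def sample_config():
    """Draw one multi-stage architecture from the illustrative search space."""
    return {
        "resolution": random.choice(RESOLUTIONS),
        "stages": [
            {
                "depth": random.choice(DEPTHS),
                "kernel": random.choice(KERNELS),
                "width": random.choice(WIDTHS),
            }
            for _ in range(NUM_STAGES)
        ],
    }


def random_search(evaluate_fn, macs_budget, num_trials=100):
    """evaluate_fn(config) -> (accuracy, average_macs); stands in for scoring
    a sub-network sampled from the once-for-all network without retraining."""
    best = None
    for _ in range(num_trials):
        cfg = sample_config()
        acc, macs = evaluate_fn(cfg)
        if macs <= macs_budget and (best is None or acc > best[0]):
            best = (acc, macs, cfg)
    return best


if __name__ == "__main__":
    def toy_eval(cfg):
        # Toy proxy: more compute -> higher (fake) accuracy and more MACs.
        macs = cfg["resolution"] * sum(
            s["depth"] * s["width"] for s in cfg["stages"]) / 1e3
        return 50.0 + macs / 10.0, macs

    print(random_search(toy_eval, macs_budget=185.0))
```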
Related papers
- Training Convolutional Neural Networks with the Forward-Forward
algorithm [1.74440662023704]
The Forward-Forward (FF) algorithm has so far only been used in fully connected networks.
We show how the FF paradigm can be extended to CNNs.
Our FF-trained CNN, featuring a novel spatially-extended labeling technique, achieves a classification accuracy of 99.16% on the MNIST hand-written digits dataset.
arXiv Detail & Related papers (2023-12-22T18:56:35Z) - Diffused Redundancy in Pre-trained Representations [98.55546694886819]
We take a closer look at how features are encoded in pre-trained representations.
We find that learned representations in a given layer exhibit a degree of diffuse redundancy.
Our findings shed light on the nature of representations learned by pre-trained deep neural networks.
arXiv Detail & Related papers (2023-05-31T21:00:50Z) - OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural
Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching for efficient architectures for devices with different resource constraints.
We aim to go one step further in the search for efficiency by explicitly conceiving the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z) - Towards a General Purpose CNN for Long Range Dependencies in
$\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$\mathrm{D}$) and visual data (2$\mathrm{D}$).
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian
Processes [12.798516310559375]
We show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings.
We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures.
arXiv Detail & Related papers (2020-07-15T15:16:18Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Fast Neural Network Adaptation via Parameter Remapping and Architecture
Search [35.61441231491448]
Deep neural networks achieve remarkable performance in many computer vision tasks.
Most state-of-the-art (SOTA) semantic segmentation and object detection approaches reuse neural network architectures designed for image classification as the backbone.
One major challenge though, is that ImageNet pre-training of the search space representation incurs huge computational cost.
In this paper, we propose a Fast Neural Network Adaptation (FNA) method, which can adapt both the architecture and parameters of a seed network.
arXiv Detail & Related papers (2020-01-08T13:45:15Z) - Inferring Convolutional Neural Networks' accuracies from their
architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.