NASTransfer: Analyzing Architecture Transferability in Large Scale
Neural Architecture Search
- URL: http://arxiv.org/abs/2006.13314v2
- Date: Fri, 12 Feb 2021 02:55:35 GMT
- Title: NASTransfer: Analyzing Architecture Transferability in Large Scale
Neural Architecture Search
- Authors: Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan
Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio
Feris, Bishwaranjan Bhattacharjee
- Abstract summary: Neural Architecture Search (NAS) is an open and challenging problem in machine learning.
The typical way of conducting large scale NAS is to search for an architectural building block on a small dataset and then transfer the block to a larger dataset.
We analyze the architecture transferability of different NAS methods by performing a series of experiments on large scale benchmarks such as ImageNet1K and ImageNet22K.
- Score: 18.77097100500467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Architecture Search (NAS) is an open and challenging problem in
machine learning. While NAS offers great promise, the prohibitive computational
demand of most existing NAS methods makes it difficult to search
architectures directly on large-scale tasks. The typical way of conducting
large scale NAS is to search for an architectural building block on a small
dataset (either using a proxy set from the large dataset or a completely
different small scale dataset) and then transfer the block to a larger dataset.
Despite a number of recent results that show the promise of transfer from proxy
datasets, a comprehensive evaluation of different NAS methods studying the
impact of different source datasets has not yet been addressed. In this work,
we propose to analyze the architecture transferability of different NAS methods
by performing a series of experiments on large scale benchmarks such as
ImageNet1K and ImageNet22K. We find that: (i) The size and domain of the proxy
set do not seem to influence architecture performance on the target dataset.
On average, architectures searched using completely different small datasets
(e.g., CIFAR10) perform similarly to architectures searched directly on proxies
of the target dataset. However, the design of the proxy set has a considerable
impact on the ranking of different NAS methods. (ii) While different
NAS methods show similar performance on a source dataset (e.g., CIFAR10), they
significantly differ on the transfer performance to a large dataset (e.g.,
ImageNet1K). (iii) Even on large datasets, the random sampling baseline is very
competitive, but the choice of an appropriate combination of proxy set and
search strategy can provide significant improvement over it. We believe that
our extensive empirical analysis will prove useful for future design of NAS
algorithms.
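To make the protocol concrete, below is a minimal sketch of the proxy-search-then-transfer workflow analyzed here; the function names and arguments (`search_method`, `train_and_eval`, the dataset identifiers) are illustrative placeholders, not the authors' implementation.
```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Cell:
    """A searched building block, e.g., an encoded DAG of candidate operations."""
    encoding: List[int]


def transfer_experiment(
    search_method: Callable[[str, int], Cell],  # e.g., a DARTS/ENAS/NAO wrapper
    source_dataset: str,                        # e.g., "CIFAR10" or a proxy of "ImageNet1K"
    target_dataset: str,                        # e.g., "ImageNet1K" or "ImageNet22K"
    train_and_eval: Callable[[Cell, str], float],
    search_epochs: int = 50,
) -> float:
    # 1) Search for a cell on the small source/proxy dataset (cheap).
    cell = search_method(source_dataset, search_epochs)
    # 2) Stack the cell into a full network and train it from scratch on the
    #    large target dataset (expensive); report target accuracy.
    return train_and_eval(cell, target_dataset)
```
The paper's comparisons essentially vary `search_method` and `source_dataset` while holding the target training pipeline fixed, which is how the impact of the proxy set on both absolute accuracy and method rankings is isolated.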
Related papers
- Fair Differentiable Neural Network Architecture Search for Long-Tailed Data with Self-Supervised Learning [0.0]
This paper explores how to improve the search and training performance of NAS on long-tailed datasets.
We first discuss related work on NAS and deep learning methods for long-tailed datasets.
Then, we focus on an existing method, SSF-NAS, which integrates self-supervised learning with fair differentiable NAS.
Finally, we conduct a series of experiments on the CIFAR10-LT dataset for performance evaluation.
arXiv Detail & Related papers (2024-06-19T12:39:02Z)
- UnrealNAS: Can We Search Neural Architectures with Unreal Data? [84.78460976605425]
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
Previous work has analyzed the necessity of having ground-truth labels in NAS and inspired broad interest.
We take a further step to question whether real data is necessary for NAS to be effective.
arXiv Detail & Related papers (2022-05-04T16:30:26Z)
- BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates the neural architecture search as a distribution learning problem through relaxing the architecture weights into Gaussian distributions.
We demonstrate how the differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability.
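As a rough illustration (not the BaLeNAS code), relaxing architecture weights into Gaussian distributions can be sketched as sampling each edge's operation-mixing weights via the reparameterization trick, so the search updates distribution parameters rather than point estimates:
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-edge architecture parameters: a Gaussian (mean, log-std)
# over 5 candidate operations, instead of DARTS-style deterministic logits.
mu = np.zeros(5)
log_sigma = np.full(5, -2.0)

def sample_mixing_weights() -> np.ndarray:
    # Reparameterization: alpha = mu + sigma * eps with eps ~ N(0, I),
    # followed by a softmax to obtain operation mixing weights.
    eps = rng.standard_normal(5)
    alpha = mu + np.exp(log_sigma) * eps
    w = np.exp(alpha - alpha.max())
    return w / w.sum()

print(sample_mixing_weights())  # one stochastic draw of the mixing weights
```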
arXiv Detail & Related papers (2021-11-25T18:13:42Z)
- Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets [42.993720854755736]
We propose an efficient Neural Architecture Search (NAS) framework that is trained once on a database consisting of datasets and pretrained networks.
We show that our model, meta-learned on subsets of ImageNet-1K and architectures from the NAS-Bench-201 search space, successfully generalizes to multiple unseen datasets.
arXiv Detail & Related papers (2021-07-02T06:33:59Z)
- Accelerating Neural Architecture Search via Proxy Data [17.86463546971522]
We propose a novel proxy data selection method tailored for neural architecture search (NAS).
Executing DARTS with the proposed selection requires only 40 minutes on CIFAR-10 and 7.5 hours on ImageNet with a single GPU.
When the architecture searched on ImageNet using the proposed selection is inversely transferred to CIFAR-10, a state-of-the-art test error of 2.4% is yielded.
arXiv Detail & Related papers (2021-06-09T03:08:53Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method requires fewer samples to find top-performing architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance in the NASNet search space.
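A hedged sketch of such a progressive, predictor-guided loop (the `evaluate` and `fit_predictor` callables below are placeholders, not the paper's implementation): each round fits a cheap predictor on the architectures measured so far and spends the next evaluation budget on the candidates it ranks highest.
```python
import random


def predictor_guided_search(space, evaluate, fit_predictor, rounds=5, per_round=20):
    """Iteratively concentrate evaluations on the predicted high-performance region."""
    evaluated = {}  # architecture -> measured accuracy
    for arch in random.sample(space, per_round):  # seed with random architectures
        evaluated[arch] = evaluate(arch)
    for _ in range(rounds):
        predictor = fit_predictor(evaluated)      # weak model, e.g., an MLP or GBDT
        pool = [a for a in space if a not in evaluated]
        pool.sort(key=predictor, reverse=True)    # rank unseen candidates by predicted accuracy
        for arch in pool[:per_round]:             # evaluate only the most promising ones
            evaluated[arch] = evaluate(arch)
    return max(evaluated, key=evaluated.get)
```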
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
- DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS).
We present DA-NAS that can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods while the accuracy is currently state-of-the-art, at 76.2% under a small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z)
- NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search [55.12928953187342]
We propose an extension to NAS-Bench-101: NAS-Bench-201 with a different search space, results on multiple datasets, and more diagnostic information.
NAS-Bench-201 has a fixed search space and provides a unified benchmark for almost any up-to-date NAS algorithms.
We provide additional diagnostic information, such as fine-grained loss and accuracy, which can inspire new designs of NAS algorithms.
arXiv Detail & Related papers (2020-01-02T05:28:26Z)
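To illustrate how such a tabular benchmark is consumed, here is a toy lookup sketch; the dictionary, architecture identifiers, and numbers are invented stand-ins, and the real NAS-Bench-201 ships its own API and data files.
```python
# Toy stand-in for a tabular NAS benchmark: every architecture in a fixed search
# space maps to precomputed training/evaluation statistics, so a NAS algorithm
# "evaluates" a candidate by table lookup instead of training it from scratch.
benchmark = {
    "arch-0001": {"cifar10": {"test_acc": 93.8, "train_loss": 0.02},
                  "cifar100": {"test_acc": 70.9, "train_loss": 0.31}},
    "arch-0002": {"cifar10": {"test_acc": 91.2, "train_loss": 0.05},
                  "cifar100": {"test_acc": 66.4, "train_loss": 0.47}},
    # ... one entry per architecture in the search space
}

def query(arch: str, dataset: str) -> float:
    """Return the precomputed test accuracy for an architecture on a dataset."""
    return benchmark[arch][dataset]["test_acc"]

best = max(benchmark, key=lambda a: query(a, "cifar10"))
print(best, query(best, "cifar10"))
```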