Zero-Cost Proxies Meet Differentiable Architecture Search
- URL: http://arxiv.org/abs/2106.06799v1
- Date: Sat, 12 Jun 2021 15:33:36 GMT
- Title: Zero-Cost Proxies Meet Differentiable Architecture Search
- Authors: Lichuan Xiang, Łukasz Dudziak, Mohamed S. Abdelfattah, Thomas Chau,
Nicholas D. Lane, Hongkai Wen
- Abstract summary: Differentiable neural architecture search (NAS) has attracted significant attention in recent years.
Despite its success, DARTS lacks robustness in certain cases.
We propose a novel operation selection paradigm in the context of differentiable NAS.
- Score: 20.957570100784988
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentiable neural architecture search (NAS) has attracted significant
attention in recent years due to its ability to quickly discover promising
architectures of deep neural networks even in very large search spaces. Despite
its success, DARTS lacks robustness in certain cases, e.g. it may degenerate to
trivial architectures with an excessive number of parameter-free operations such as
skip connections or random noise, leading to inferior performance. In particular,
operation selection based on the magnitude of architectural parameters was
recently shown to be fundamentally flawed, underscoring the need to rethink this
aspect. On the other hand, zero-cost proxies have recently been studied in the
context of sample-based NAS, showing promising results -- drastically speeding up
the search process in some cases, but also failing on some of the large search
spaces typical for differentiable NAS. In this work we propose a novel
operation selection paradigm in the context of differentiable NAS which
utilises zero-cost proxies. Our perturbation-based zero-cost operation
selection (Zero-Cost-PT) improves search time and, in many cases, accuracy
compared to the best available differentiable architecture search methods,
regardless of search space size. Specifically, we are able to find architectures
comparable to DARTS-PT on the DARTS CNN search space while being over 40x
faster (a total search time of 25 minutes on a single GPU).
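Since the abstract only describes the idea at a high level, the following is a minimal illustrative sketch (not the authors' implementation) of how zero-cost operation selection could be wired up: each candidate operation on an edge is scored in isolation with a synflow-style zero-cost proxy, and the highest-scoring operation is kept. The `Supernet` interface (`edges`, `candidate_ops`, `keep_only`, `discretize`) is hypothetical.

```python
import torch

def synflow_score(model, input_shape):
    """Generic synflow-style zero-cost proxy: sum of |w * dR/dw| for an
    all-ones input run through the network with absolute-valued weights.
    This is a common proxy from the zero-cost NAS literature, not necessarily
    the exact scoring function used in the paper."""
    model.zero_grad()
    signs = [p.data.sign() for p in model.parameters()]
    for p in model.parameters():
        p.data.abs_()                             # linearise the network w.r.t. weights
    out = model(torch.ones(1, *input_shape))      # data-independent input
    out.sum().backward()
    score = sum((p.data * p.grad).abs().sum().item()
                for p in model.parameters() if p.grad is not None)
    for p, s in zip(model.parameters(), signs):
        p.data.mul_(s)                            # restore original weight signs
    return score

def zero_cost_operation_selection(supernet, input_shape):
    """Discretise a DARTS-style supernet edge by edge, keeping on each edge the
    candidate operation with the highest proxy score (hypothetical Supernet API)."""
    for edge in supernet.edges():
        best_op, best_score = None, float("-inf")
        for op in supernet.candidate_ops(edge):
            with supernet.keep_only(edge, op):    # temporarily retain only `op`
                s = synflow_score(supernet, input_shape)
            if s > best_score:
                best_op, best_score = op, s
        supernet.discretize(edge, best_op)
    return supernet
```

In the actual Zero-Cost-PT method the selection is perturbation-based (measuring how changing an operation affects the proxy) and the choice of proxy matters; the sketch above only conveys the overall control flow.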
Related papers
- Generalizable Lightweight Proxy for Robust NAS against Diverse
Perturbations [59.683234126055694]
Recent neural architecture search (NAS) frameworks have been successful in finding optimal architectures for given conditions.
We propose a novel lightweight robust zero-cost proxy that considers the consistency across features, parameters, and gradients of both clean and perturbed images.
Our approach facilitates an efficient and rapid search for neural architectures capable of learning generalizable features that exhibit robustness across diverse perturbations.
arXiv Detail & Related papers (2023-06-08T08:34:26Z)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture
Search [85.84110365657455]
We propose a simple but efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS search process (an illustrative decay-penalty sketch appears after this list).
Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and makes the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
- Approximate Neural Architecture Search via Operation Distribution
Learning [4.358626952482686]
We show that given an architectural cell, its performance largely depends on the ratio of used operations.
This intuition is agnostic to any specific search strategy and can be applied to a diverse set of NAS algorithms.
arXiv Detail & Related papers (2021-11-08T17:38:29Z)
- Making Differentiable Architecture Search less local [9.869449181400466]
Differentiable neural architecture search (DARTS) is a promising NAS approach that dramatically increases search efficiency.
It has been shown to suffer from performance collapse, where the search often leads to detrimental architectures.
We develop a more global optimisation scheme that is able to better explore the space without changing the DARTS problem formulation.
arXiv Detail & Related papers (2021-04-21T10:36:43Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse
Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
- GOLD-NAS: Gradual, One-Level, Differentiable [100.12492801459105]
We propose a novel algorithm named Gradual One-Level Differentiable Neural Architecture Search (GOLD-NAS).
It introduces a variable resource constraint to one-level optimization so that the weak operators are gradually pruned out from the super-network.
arXiv Detail & Related papers (2020-07-07T10:37:49Z)
- DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS).
We present DA-NAS that can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods while the accuracy is currently state-of-the-art, at 76.2% under a small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z)
- NAS evaluation is frustratingly hard [1.7188280334580197]
Neural Architecture Search (NAS) is an exciting new field which promises to be as much of a game-changer as Convolutional Neural Networks were in 2012.
Comparison between different methods is still very much an open issue.
Our first contribution is a benchmark of $8$ NAS methods on $5$ datasets.
arXiv Detail & Related papers (2019-12-28T21:24:12Z)
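The $\beta$-DARTS entry above mentions a Beta-Decay regulariser for the DARTS search (referenced there as the sketch after this list). As a purely illustrative example, and not the exact loss from that paper, a decay-style penalty on the softmax-normalised architecture parameters could be added to the search objective like this:

```python
import torch
import torch.nn.functional as F

def decay_penalty(arch_params, weight=1e-3):
    """Illustrative decay-style regulariser on DARTS architecture parameters.

    `arch_params` is assumed to be an iterable of tensors of raw logits
    (alpha), one per edge. Penalising the squared magnitude of the
    softmax-normalised values (beta) pushes them towards a uniform
    distribution and discourages any single operation (e.g. skip connection)
    from dominating early in the search. The exact formulation used in
    beta-DARTS may differ.
    """
    penalty = torch.zeros(())
    for alpha in arch_params:
        beta = F.softmax(alpha, dim=-1)
        penalty = penalty + (beta ** 2).sum()
    return weight * penalty

# Hypothetical usage inside a DARTS-style search step:
#   loss = task_loss + decay_penalty(model.arch_parameters())
#   loss.backward()
```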
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.