ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and
Gradient Accumulation
- URL: http://arxiv.org/abs/2011.11233v2
- Date: Thu, 3 Aug 2023 01:44:49 GMT
- Title: ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and
Gradient Accumulation
- Authors: Xiaoxing Wang and Xiangxiang Chu and Yuda Fan and Zhexi Zhang and Bo
Zhang and Xiaokang Yang and Junchi Yan
- Abstract summary: Differentiable architecture search (DARTS) is largely hindered by its substantial memory cost since the entire supernet resides in the memory.
The single-path DARTS comes in, which only chooses a single-path submodel at each step.
Besides being memory-friendly, it also comes with low computational cost.
We propose a new algorithm called RObustifying Memory-Efficient NAS (ROME) to give a cure.
- Score: 106.04777600352743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Albeit being a prevalent architecture searching approach, differentiable
architecture search (DARTS) is largely hindered by its substantial memory cost
since the entire supernet resides in the memory. This is where the single-path
DARTS comes in, which only chooses a single-path submodel at each step. Besides
being memory-friendly, it also comes with low computational cost. Nonetheless,
we discover a critical issue of single-path DARTS that has not been previously
noticed. Namely, it also suffers from severe performance collapse since too
many parameter-free operations like skip connections are derived, just like
DARTS does. In this paper, we propose a new algorithm called RObustifying
Memory-Efficient NAS (ROME) to give a cure. First, we disentangle the topology
search from the operation search to make searching and evaluation consistent.
We then adopt Gumbel-Top2 reparameterization and gradient accumulation to
robustify the unwieldy bi-level optimization. We verify ROME extensively across
15 benchmarks to demonstrate its effectiveness and robustness.
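To make the two ingredients named in the abstract concrete, the following is a minimal PyTorch sketch of Gumbel-Top2 sampling over one edge's candidate operations combined with gradient accumulation over several sampled sub-models before one architecture update. This is an illustration under stated assumptions, not the authors' implementation; the function names, the number of samples K, and the loss averaging are hypothetical.

```python
import torch


def gumbel_top2(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Relaxed 2-hot sample over the candidate operations of one edge."""
    gumbels = -torch.empty_like(logits).exponential_().log()  # Gumbel(0, 1) noise
    perturbed = (logits + gumbels) / tau
    soft = perturbed.softmax(dim=-1)
    index = perturbed.topk(2, dim=-1).indices                 # two sampled operations
    hard = torch.zeros_like(soft).scatter_(-1, index, 1.0)
    # straight-through estimator: hard 2-hot mask forward, soft gradients backward
    return hard - soft.detach() + soft


def accumulate_arch_grads(alpha, ops, x, target, criterion, K=4):
    """Accumulate architecture gradients over K sampled two-operation sub-models."""
    # alpha: (num_ops,) leaf tensor with requires_grad=True; ops: candidate modules.
    alpha.grad = None
    for _ in range(K):
        mask = gumbel_top2(alpha)
        # naive sketch: all ops run; a memory-efficient version would run only the two selected
        out = sum(m * op(x) for m, op in zip(mask, ops))
        loss = criterion(out, target) / K                      # average over the K samples
        loss.backward()                                        # gradients add up in alpha.grad
    return alpha.grad
```

Only the operation-level sampling is sketched here; the topology search that ROME disentangles from it is not reproduced.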
Related papers
- TopoNAS: Boosting Search Efficiency of Gradient-based NAS via Topological Simplification [11.08910129925713]
TopoNAS is a model-agnostic approach for gradient-based one-shot NAS.
It significantly reduces searching time and memory usage by topological simplification of searchable paths.
arXiv Detail & Related papers (2024-08-02T15:01:29Z)
- Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching criterion for splitting significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z)
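A hedged sketch of the gradient-matching idea above: candidate operations whose gradients with respect to the shared weights agree (high cosine similarity) can keep sharing weights, while disagreeing ones are split into separate sub-supernets. The exact criterion and splitting procedure in the paper are not reproduced here, and the single-tensor parameter is an illustrative simplification.

```python
import torch
import torch.nn.functional as F


def gradient_matching_score(shared_params, loss_with_op_a, loss_with_op_b):
    """Cosine similarity between gradients induced by two candidate operations."""
    # shared_params: one shared weight tensor; each loss uses a different candidate op.
    g_a = torch.autograd.grad(loss_with_op_a, shared_params, retain_graph=True)[0]
    g_b = torch.autograd.grad(loss_with_op_b, shared_params, retain_graph=True)[0]
    return F.cosine_similarity(g_a.flatten(), g_b.flatten(), dim=0)
```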
We propose a simple-but-efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS searching process.
Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
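One natural reading of a Beta-Decay-style regularizer is a smooth-max (logsumexp) penalty on each edge's architecture logits, added to the search loss. The sketch below uses that form; the exact term and its weighting schedule in $\beta$-DARTS are not reproduced here.

```python
import torch


def beta_decay_penalty(arch_logits: torch.Tensor) -> torch.Tensor:
    """arch_logits: (num_edges, num_ops) architecture parameters."""
    # Penalize large architecture logits edge-wise (a smooth-max penalty),
    # which discourages any single operation from dominating too early.
    return torch.logsumexp(arch_logits, dim=-1).mean()


# Hypothetical use inside a search step:
# loss = task_loss + reg_weight * beta_decay_penalty(alpha)
```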
- ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the gradient approximation that DARTS relies on.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS in settings where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z)
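The zero-order search above can be illustrated with a textbook random-direction finite-difference estimator for the architecture gradient. This is a generic sketch, not the specific sampling scheme of ZARTS; `zo_gradient`, `mu`, and `n_dirs` are illustrative names and defaults.

```python
import torch


@torch.no_grad()
def zo_gradient(alpha: torch.Tensor, loss_fn, mu: float = 1e-2, n_dirs: int = 8):
    """Estimate d loss / d alpha from loss evaluations only (no backprop)."""
    grad = torch.zeros_like(alpha)
    for _ in range(n_dirs):
        u = torch.randn_like(alpha)                        # random probing direction
        delta = loss_fn(alpha + mu * u) - loss_fn(alpha - mu * u)
        grad += (delta / (2.0 * mu)) * u                   # central finite difference
    return grad / n_dirs


# Hypothetical use: alpha -= lr * zo_gradient(alpha, validation_loss)
```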
- Single-DARTS: Towards Stable Architecture Search [7.894638544388165]
We propose Single-DARTS, which merely uses single-level optimization, updating network weights and architecture parameters simultaneously on the same data batch.
Experimental results show that Single-DARTS achieves state-of-the-art performance on mainstream search spaces.
arXiv Detail & Related papers (2021-08-18T13:00:39Z)
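The single-level scheme described above amounts to one joint update per mini-batch, instead of DARTS's alternating bi-level updates. A minimal sketch follows; the optimizer names and the `model(x, alpha)` interface are assumptions for illustration.

```python
def single_level_step(model, alpha, w_opt, a_opt, batch, criterion):
    """One joint update of network weights and architecture parameters."""
    x, y = batch
    loss = criterion(model(x, alpha), y)   # one forward pass on one batch
    w_opt.zero_grad()
    a_opt.zero_grad()
    loss.backward()                        # gradients for both weights and alpha
    w_opt.step()                           # update both from the same loss
    a_opt.step()
    return loss.item()
```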
- Mutually-aware Sub-Graphs Differentiable Architecture Search (MSG-DAS) [29.217547815683748]
MSG-DAS is built on a differentiable Gumbel-TopK sampler that produces multiple mutually exclusive single-path sub-graphs.
We demonstrate the effectiveness of our method on ImageNet and CIFAR10, where the searched models show performance comparable to the most recent approaches.
arXiv Detail & Related papers (2021-07-09T09:31:31Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
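The implicit-gradient idea above can be sketched with the standard implicit-function-theorem hypergradient, approximating the inverse Hessian-vector product by a truncated Neumann series. iDARTS's stochastic approximation differs in detail and is not reproduced here; `w` and `alpha` are treated as single tensors for brevity, and both losses are assumed to be computed from a supernet that uses `w` and `alpha`.

```python
import torch


def implicit_hypergrad(val_loss, train_loss, w, alpha, lr=0.01, K=3):
    """Approximate d val_loss / d alpha through the implicitly defined weights w."""
    v = torch.autograd.grad(val_loss, w, retain_graph=True)[0]
    p = v.clone()
    gw = torch.autograd.grad(train_loss, w, create_graph=True)[0]
    for _ in range(K):
        # Neumann series: H^{-1} v ~= lr * sum_k (I - lr * H)^k v, via Hessian-vector products
        Hv = torch.autograd.grad(gw, w, grad_outputs=v, retain_graph=True)[0]
        v = v - lr * Hv
        p = p + v
    # mixed second derivative of the training loss, applied to lr * p
    mixed = torch.autograd.grad(gw, alpha, grad_outputs=p, retain_graph=True)[0]
    direct = torch.autograd.grad(val_loss, alpha, retain_graph=True)[0]
    return direct - lr * mixed
```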
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performance on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
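ISTA-NAS builds on classic iterative shrinkage-thresholding. The sketch below shows the standard ISTA iteration for the LASSO objective min ||A x - b||^2 + lam * ||x||_1 that the sparse-coding view refers to; how the paper maps architectures onto A, x, and b is not reproduced here.

```python
import torch


def ista(A: torch.Tensor, b: torch.Tensor, lam: float, steps: int = 100):
    """Iterative shrinkage-thresholding for the LASSO problem."""
    x = torch.zeros(A.shape[1])
    step = 1.0 / torch.linalg.matrix_norm(A, ord=2) ** 2   # 1 / L with L = ||A||_2^2
    for _ in range(steps):
        grad = A.T @ (A @ x - b)                           # gradient of the quadratic term
        z = x - step * grad
        x = torch.sign(z) * torch.clamp(z.abs() - step * lam, min=0.0)  # soft threshold
    return x
```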
- NAS evaluation is frustratingly hard [1.7188280334580197]
Neural Architecture Search (NAS) is an exciting new field which promises to be as much of a game-changer as Convolutional Neural Networks were in 2012.
Comparison between different methods is still very much an open issue.
Our first contribution is a benchmark of 8 NAS methods on 5 datasets.
arXiv Detail & Related papers (2019-12-28T21:24:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.