Prioritized Architecture Sampling with Monto-Carlo Tree Search
- URL: http://arxiv.org/abs/2103.11922v1
- Date: Mon, 22 Mar 2021 15:09:29 GMT
- Title: Prioritized Architecture Sampling with Monto-Carlo Tree Search
- Authors: Xiu Su, Tao Huang, Yanxi Li, Shan You, Fei Wang, Chen Qian, Changshui
Zhang, Chang Xu
- Abstract summary: One-shot neural architecture search (NAS) methods significantly reduce the search cost by considering the whole search space as one network.
In this paper, we introduce a sampling strategy based on Monte Carlo tree search (MCTS) with the search space modeled as a Monte Carlo tree (MCT).
For a fair comparison, we construct an open-source NAS benchmark of a macro search space evaluated on CIFAR-10, namely NAS-Bench-Macro.
- Score: 54.72096546595955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One-shot neural architecture search (NAS) methods significantly reduce the
search cost by considering the whole search space as one network, which only
needs to be trained once. However, current methods select each operation
independently without considering previous layers. In addition, the historical
information obtained at huge computational cost is usually used only once and
then discarded. In this paper, we introduce a sampling strategy based on Monte
Carlo tree search (MCTS) with the search space modeled as a Monte Carlo tree
(MCT), which captures the dependency among layers. Furthermore, intermediate
results are stored in the MCT for future decisions and a better
exploration-exploitation balance. Concretely, the MCT is updated using the
training loss as a reward reflecting the architecture's performance; to
accurately evaluate the numerous nodes, we propose node communication and
hierarchical node selection methods in the training and search stages,
respectively, which make better use of the operation rewards and hierarchical
information. Moreover, for a fair
comparison of different NAS methods, we construct an open-source NAS benchmark
of a macro search space evaluated on CIFAR-10, namely NAS-Bench-Macro.
Extensive experiments on NAS-Bench-Macro and ImageNet demonstrate that our
method significantly improves search efficiency and performance. For example,
by only searching $20$ architectures, our obtained architecture achieves
$78.0\%$ top-1 accuracy with 442M FLOPs on ImageNet. Code (Benchmark) is
available at: \url{https://github.com/xiusu/NAS-Bench-Macro}.
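The abstract describes the mechanism only at a high level, so below is a minimal sketch of layer-wise architecture sampling with a Monte Carlo tree: each tree depth corresponds to a layer, UCT selection picks the next operation conditioned on the choices already made, and the (negated) training loss of the sampled subnet is backed up along the visited path. The constants, the `Node` class, and the placeholder loss are illustrative assumptions, not the paper's implementation, which additionally uses node communication and hierarchical node selection.

```python
import math
import random

NUM_LAYERS = 8          # depth of the Monte Carlo tree (one level per layer)
NUM_OPS = 4             # candidate operations per layer
C_UCT = 1.4             # exploration constant (illustrative value)

class Node:
    """One node per partial architecture prefix; children are operation choices."""
    def __init__(self):
        self.children = {}      # op index -> Node
        self.visits = 0
        self.value = 0.0        # running mean of rewards

    def uct_select(self):
        # Pick the operation maximizing the UCT score; unvisited ops go first.
        def score(op):
            child = self.children.get(op)
            if child is None or child.visits == 0:
                return float("inf")
            explore = C_UCT * math.sqrt(math.log(self.visits + 1) / child.visits)
            return child.value + explore
        return max(range(NUM_OPS), key=score)

def sample_architecture(root):
    """Walk the tree from the root, choosing one operation per layer."""
    path, node, arch = [root], root, []
    for _ in range(NUM_LAYERS):
        op = node.uct_select()
        node = node.children.setdefault(op, Node())
        arch.append(op)
        path.append(node)
    return arch, path

def update(path, reward):
    """Back up the reward (e.g. negated training loss) along the sampled path."""
    for node in path:
        node.visits += 1
        node.value += (reward - node.value) / node.visits

# Illustrative loop: the supernet training step is a placeholder.
root = Node()
for step in range(100):
    arch, path = sample_architecture(root)
    loss = random.random()      # stand-in for the training loss of the sampled subnet
    update(path, reward=-loss)  # lower loss -> higher reward
```

In the paper itself, rewards are additionally shared among nodes that apply the same operation (node communication), and the search stage selects architectures through hierarchical node selection rather than a single greedy walk.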
Related papers
- LayerNAS: Neural Architecture Search in Polynomial Complexity [18.36070437021082]
We propose LayerNAS to address the challenge of multi-objective NAS.
LayerNAS groups model candidates based on one objective, such as model size or latency, and searches for the optimal model based on another objective.
Our experiments show that LayerNAS is able to consistently discover superior models across a variety of search spaces.
arXiv Detail & Related papers (2023-04-23T02:08:00Z)
- When NAS Meets Trees: An Efficient Algorithm for Neural Architecture Search [117.89827740405694]
A key challenge in neural architecture search (NAS) is how to explore the huge search space wisely.
We propose a new NAS method called TNAS (NAS with trees), which improves search efficiency by exploring only a small number of architectures.
TNAS finds the globally optimal architecture on CIFAR-10 in NAS-Bench-201, reaching a test accuracy of 94.37% in four GPU hours.
arXiv Detail & Related papers (2022-04-11T07:34:21Z)
- BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates neural architecture search as a distribution learning problem by relaxing the architecture weights into Gaussian distributions.
We demonstrate how the differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability.
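As a rough illustration of that relaxation (not BaLeNAS's actual update, which follows the Bayesian learning rule), each architecture weight can be modeled as a Gaussian and sampled with the reparameterization trick; the tensor shapes and names below are assumptions:

```python
import torch

# Hypothetical DARTS-style cell: 4 edges, 5 candidate operations per edge.
mu = torch.zeros(4, 5, requires_grad=True)              # means of the architecture weights
log_sigma = torch.full((4, 5), -2.0, requires_grad=True)

def sample_arch_weights():
    """Reparameterized draw: alpha = mu + sigma * eps, softmaxed per edge."""
    eps = torch.randn_like(mu)
    alpha = mu + log_sigma.exp() * eps
    return torch.softmax(alpha, dim=-1)                  # mixing weights over operations

weights = sample_arch_weights()
# In a differentiable NAS supernet these weights would mix the candidate-operation
# outputs; gradients reach mu and log_sigma, and the sigma term drives exploration.
```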
arXiv Detail & Related papers (2021-11-25T18:13:42Z)
- Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets [42.993720854755736]
We propose an efficient Neural Architecture Search (NAS) framework that is trained once on a database consisting of datasets and pretrained networks.
We show that our model, meta-learned on subsets of ImageNet-1K and architectures from the NAS-Bench-201 search space, successfully generalizes to multiple unseen datasets.
arXiv Detail & Related papers (2021-07-02T06:33:59Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- Neural Architecture Search via Combinatorial Multi-Armed Bandit [43.29214413461234]
We formulate NAS as a Combinatorial Multi-Armed Bandit (CMAB) problem (CMAB-NAS).
This allows the decomposition of a large search space into smaller blocks where tree-search methods can be applied more effectively and efficiently.
We leverage a tree-based method called Nested Monte-Carlo Search to tackle the CMAB-NAS problem.
On CIFAR-10, our approach discovers a cell structure that achieves a low error rate that is comparable to the state-of-the-art, using only 0.58 GPU days.
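As a hedged sketch of the kind of tree search this decomposition enables, the snippet below runs a level-1 nested Monte-Carlo search over per-position operation choices; the cell encoding, the candidate set, and the `evaluate` placeholder are assumptions rather than the paper's exact procedure:

```python
import random

NUM_NODES = 5           # decision positions in the cell (assumed)
CANDIDATES = ["skip", "sep_conv_3x3", "sep_conv_5x5", "max_pool_3x3"]

def evaluate(cell):
    """Placeholder reward; in practice this would be a (proxy) validation accuracy."""
    return random.random()

def rollout(prefix):
    """Complete a partial cell with uniformly random choices."""
    return prefix + [random.choice(CANDIDATES) for _ in range(NUM_NODES - len(prefix))]

def nested_mc_search():
    """Level-1 nested Monte-Carlo search: fix each position greedily
    based on the best random completion of every candidate."""
    prefix = []
    for _ in range(NUM_NODES):
        best_op, best_reward = None, -1.0
        for op in CANDIDATES:
            reward = evaluate(rollout(prefix + [op]))
            if reward > best_reward:
                best_op, best_reward = op, reward
        prefix.append(best_op)
    return prefix

print(nested_mc_search())
```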
arXiv Detail & Related papers (2021-01-01T23:29:33Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
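The summary only names the formulation, so here is a generic ISTA (iterative shrinkage-thresholding) solver for the underlying sparse coding problem $\min_z \tfrac{1}{2}\|x - Dz\|_2^2 + \lambda\|z\|_1$; it is a textbook routine under assumed symbols, not ISTA-NAS's NAS-specific procedure:

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(x, D, lam=0.1, num_iters=100):
    """Solve min_z 0.5*||x - D z||^2 + lam*||z||_1 with plain ISTA."""
    z = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2      # 1 / Lipschitz constant of the gradient
    for _ in range(num_iters):
        grad = D.T @ (D @ z - x)
        z = soft_threshold(z - step * grad, lam * step)
    return z

# Toy usage: a random dictionary and signal; the returned code is sparse.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
x = rng.standard_normal(20)
print(np.count_nonzero(ista(x, D)))
```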
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
- Fine-Grained Stochastic Architecture Search [6.277767522867666]
Fine-Grained Architecture Search (FiGS) is a differentiable search method that searches over a much larger set of candidate architectures.
FiGS simultaneously selects and modifies operators in the search space by applying a structured sparse regularization penalty.
We show results across 3 existing search spaces, matching or outperforming the original search algorithms.
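One common instantiation of such a structured sparsity penalty is a group-lasso term over each candidate operator's parameters, sketched below in PyTorch; this illustrates the general mechanism and is not claimed to be FiGS's exact regularizer:

```python
import torch
import torch.nn as nn

def group_lasso_penalty(candidate_ops: nn.ModuleList, weight: float = 1e-4):
    """Sum of per-operator l2 norms; pushes entire operators toward zero."""
    penalty = 0.0
    for op in candidate_ops:
        flat = [p.reshape(-1) for p in op.parameters()]
        if not flat:            # parameter-free ops (e.g. identity) contribute nothing
            continue
        penalty = penalty + torch.cat(flat).norm(p=2)
    return weight * penalty

# Usage: add group_lasso_penalty(ops) to the task loss during supernet training;
# operators whose group norm collapses to (near) zero can be dropped from the space.
```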
arXiv Detail & Related papers (2020-06-17T01:04:14Z) - NAS evaluation is frustratingly hard [1.7188280334580197]
Neural Architecture Search (NAS) is an exciting new field which promises to be as much of a game-changer as Convolutional Neural Networks were in 2012.
Comparison between different methods is still very much an open issue.
Our first contribution is a benchmark of $8$ NAS methods on $5$ datasets.
arXiv Detail & Related papers (2019-12-28T21:24:12Z) - DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution
Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints.
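A stripped-down version of that loop might look like the sketch below: factorized per-layer categorical distributions are sampled, tilted toward operations with higher observed reward, and pruned every few epochs. The update rule and the random reward placeholder are assumptions, not DDPNAS's actual estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_LAYERS, NUM_OPS = 6, 5
probs = np.full((NUM_LAYERS, NUM_OPS), 1.0 / NUM_OPS)   # factorized categorical distribution
alive = np.ones((NUM_LAYERS, NUM_OPS), dtype=bool)      # operations not yet pruned

def sample_architecture():
    return [rng.choice(NUM_OPS, p=probs[layer]) for layer in range(NUM_LAYERS)]

for epoch in range(30):
    scores = np.zeros((NUM_LAYERS, NUM_OPS))
    counts = np.zeros((NUM_LAYERS, NUM_OPS))
    for _ in range(16):
        arch = sample_architecture()
        reward = rng.random()                            # stand-in for validation accuracy
        for layer, op in enumerate(arch):
            scores[layer, op] += reward
            counts[layer, op] += 1
    mean_reward = scores / np.maximum(counts, 1.0)
    probs = probs * np.exp(0.5 * mean_reward)            # tilt toward better operations
    if (epoch + 1) % 10 == 0:                            # dynamic pruning every few epochs
        for layer in range(NUM_LAYERS):
            if alive[layer].sum() > 1:
                weakest = int(np.argmin(np.where(alive[layer], probs[layer], np.inf)))
                alive[layer, weakest] = False
    probs = np.where(alive, probs, 0.0)
    probs = probs / probs.sum(axis=1, keepdims=True)
```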
arXiv Detail & Related papers (2019-05-28T06:35:52Z)