GreedyNASv2: Greedier Search with a Greedy Path Filter
- URL: http://arxiv.org/abs/2111.12609v1
- Date: Wed, 24 Nov 2021 16:32:29 GMT
- Title: GreedyNASv2: Greedier Search with a Greedy Path Filter
- Authors: Tao Huang, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang
Wang, Chang Xu
- Abstract summary: In one-shot NAS methods, the search space is usually considerably huge (e.g., $13^{21}$).
In this paper, we leverage an explicit path filter to capture the characteristics of paths and directly filter those weak ones.
For example, our obtained GreedyNASv2-L achieves $81.1\%$ Top-1 accuracy on the ImageNet dataset, significantly outperforming the strong ResNet-50 baseline.
- Score: 70.64311838369707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training a good supernet in one-shot NAS methods is difficult since the
search space is usually considerably huge (e.g., $13^{21}$). To enhance the
supernet's evaluation ability, one greedy strategy is to sample good paths and
let the supernet lean towards the good ones, easing its evaluation burden as a
result. In practice, however, the search can still be quite inefficient, since
the identification of good paths is not accurate enough and the sampled paths
still scatter around the whole search space. In this paper, we leverage an
explicit path filter to capture the characteristics of paths and directly filter
out the weak ones, so that the search can be implemented on the shrunk space
more greedily and efficiently. Concretely, based on the fact that good paths are
far fewer than weak ones in the space, we argue that the label of "weak paths"
will be more confident and reliable than that of "good paths" in multi-path
sampling. We therefore cast the training of the path filter in the positive and
unlabeled (PU) learning paradigm, and also encourage a path embedding as a
better path/operation representation to enhance the identification capacity of
the learned filter. By dint of this embedding, we can further shrink the search
space by aggregating operations with similar embeddings, making the search more
efficient and accurate. Extensive experiments validate the effectiveness of the
proposed method GreedyNASv2. For example, our obtained GreedyNASv2-L achieves
$81.1\%$ Top-1 accuracy on the ImageNet dataset, significantly outperforming the
strong ResNet-50 baseline.
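As a rough illustration of the abstract's core idea, the sketch below trains a small path filter with a non-negative PU (nnPU) risk, treating sampled weak paths as the labeled class and all other sampled paths as unlabeled, and then uses it to reject weak candidates. It is a minimal sketch under assumptions, not the authors' implementation: the per-layer operation-embedding table, the mean-pooled path embedding, the MLP head, the sigmoid-loss nnPU estimator, the class prior of 0.9, and the 0.5 rejection threshold are all illustrative choices.

```python
# A minimal sketch of a greedy path filter trained with non-negative PU
# (nnPU) learning; all design choices below are assumptions, not the
# authors' released implementation.
import torch
import torch.nn as nn


class PathFilter(nn.Module):
    """Scores a path; a higher score means "more likely to be weak"."""

    def __init__(self, num_layers: int, num_ops: int, dim: int = 64):
        super().__init__()
        self.num_ops = num_ops
        # One embedding row per (layer, operation) pair, so the same
        # operation can have different embeddings at different depths.
        self.op_emb = nn.Embedding(num_layers * num_ops, dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def embed(self, paths: torch.Tensor) -> torch.Tensor:
        # paths: (batch, num_layers) operation indices in [0, num_ops).
        offsets = torch.arange(paths.size(1), device=paths.device) * self.num_ops
        return self.op_emb(paths + offsets).mean(dim=1)  # (batch, dim)

    def forward(self, paths: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(paths)).squeeze(-1)  # (batch,) logits


def nnpu_loss(weak_logits, unlabeled_logits, prior=0.9):
    """Non-negative PU risk (Kiryo et al., 2017) with the sigmoid loss.
    "Weak path" is the labeled class; `prior` is the assumed fraction of
    weak paths in the whole space (large, since good paths are rare)."""
    risk_weak = torch.sigmoid(-weak_logits).mean()          # weak scored as weak
    risk_weak_as_good = torch.sigmoid(weak_logits).mean()   # weak scored as good
    risk_unl_as_good = torch.sigmoid(unlabeled_logits).mean()
    negative_risk = risk_unl_as_good - prior * risk_weak_as_good
    return prior * risk_weak + torch.clamp(negative_risk, min=0.0)


if __name__ == "__main__":
    num_layers, num_ops = 21, 13  # a 13^21 search space, as in the abstract
    filt = PathFilter(num_layers, num_ops)
    opt = torch.optim.Adam(filt.parameters(), lr=1e-3)

    # Placeholder data: in a real pipeline, the "weak" paths would be the
    # low-scoring ones identified during multi-path sampling.
    weak = torch.randint(0, num_ops, (32, num_layers))
    unlabeled = torch.randint(0, num_ops, (128, num_layers))

    opt.zero_grad()
    loss = nnpu_loss(filt(weak), filt(unlabeled))
    loss.backward()
    opt.step()

    # Reject candidates the filter judges weak before they are used to
    # train or evaluate the supernet (the 0.5 threshold is illustrative).
    with torch.no_grad():
        candidates = torch.randint(0, num_ops, (256, num_layers))
        keep = torch.sigmoid(filt(candidates)) < 0.5
    print(f"kept {int(keep.sum())} / {candidates.size(0)} candidate paths")
```

GreedyNASv2 additionally uses the learned operation embeddings to merge operations with similar embeddings and shrink the search space further; that step is omitted from this sketch.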
Related papers
- Finding Transformer Circuits with Edge Pruning [71.12127707678961]
We propose Edge Pruning as an effective and scalable solution to automated circuit discovery.
Our method finds circuits in GPT-2 that use less than half the number of edges compared to circuits found by previous methods.
Thanks to its efficiency, we scale Edge Pruning to CodeLlama-13B, a model over 100x the scale that prior methods operate on.
arXiv Detail & Related papers (2024-06-24T16:40:54Z) - Random Search as a Baseline for Sparse Neural Network Architecture Search [0.0]
Sparse neural networks have shown similar or better performance than their dense counterparts while having higher parameter efficiency.
This has motivated a number of works to learn or search for high performing sparse networks.
We propose Random Search as a baseline algorithm for finding good sparse configurations and study its performance.
We observe that for this sparse architecture search task, sparse networks found by Random Search neither perform better nor converge more efficiently than their random counterparts.
arXiv Detail & Related papers (2024-03-13T05:32:13Z) - Reducing Redundant Work in Jump Point Search [30.45272181563766]
JPS (Jump Point Search) is a state-of-the-art optimal algorithm for online grid-based pathfinding.
We show that JPS can exhibit pathological behaviours which are not well studied.
We propose a purely online approach, called Constrained JPS (CJPS), to tackle them efficiently.
arXiv Detail & Related papers (2023-06-28T05:21:59Z) - DetOFA: Efficient Training of Once-for-All Networks for Object Detection
Using Path Filter [4.487368901635045]
We propose an efficient supernet-based neural architecture search (NAS) method that uses search space pruning.
Our proposed method reduces the computational cost of the optimal network architecture by 30% and 63%.
arXiv Detail & Related papers (2023-03-23T09:23:11Z) - PA&DA: Jointly Sampling PAth and DAta for Consistent NAS [8.737995937682271]
One-shot NAS methods train a supernet and then inherit the pre-trained weights to evaluate sub-models.
Large gradient variance occurs during supernet training, which degrades the supernet ranking consistency.
We propose to explicitly minimize the gradient variance of the supernet training by jointly optimizing the sampling distributions of PAth and DAta.
arXiv Detail & Related papers (2023-02-28T17:14:24Z) - OPANAS: One-Shot Path Aggregation Network Architecture Search for Object
Detection [82.04372532783931]
Recently, neural architecture search (NAS) has been exploited to design feature pyramid networks (FPNs).
We propose a novel One-Shot Path Aggregation Network Architecture Search (OPANAS) algorithm, which significantly improves both searching efficiency and detection accuracy.
arXiv Detail & Related papers (2021-03-08T01:48:53Z) - ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and
Gradient Accumulation [106.04777600352743]
Differentiable architecture search (DARTS) is largely hindered by its substantial memory cost since the entire supernet resides in memory.
Single-path DARTS addresses this by choosing only a single-path submodel at each step; it is memory-friendly and also comes with low computational cost.
We propose a new algorithm, RObustifying Memory-Efficient NAS (ROME), as a cure.
arXiv Detail & Related papers (2020-11-23T06:34:07Z) - ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse
Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z) - GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet [63.96959854429752]
GreedyNAS is easy to follow, and experimental results on the ImageNet dataset indicate that it can achieve better Top-1 accuracy under the same search space and FLOPs or latency level.
By searching on a larger space, our GreedyNAS can also obtain new state-of-the-art architectures.
arXiv Detail & Related papers (2020-03-25T06:54:10Z)