Efficient Architecture Search via Bi-level Data Pruning
- URL: http://arxiv.org/abs/2312.14200v1
- Date: Thu, 21 Dec 2023 02:48:44 GMT
- Title: Efficient Architecture Search via Bi-level Data Pruning
- Authors: Chongjun Tu, Peng Ye, Weihao Lin, Hancheng Ye, Chong Yu, Tao Chen,
Baopu Li, Wanli Ouyang
- Abstract summary: This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization.
We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric.
Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
- Score: 70.29970746807882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Improving the efficiency of Neural Architecture Search (NAS) is a challenging
but significant task that has received much attention. Previous works mainly
adopted the Differentiable Architecture Search (DARTS) and improved its search
strategies or modules to enhance search efficiency. Recently, some methods have
started considering data reduction for speedup, but they are not tightly
coupled with the architecture search process, resulting in sub-optimal
performance. To this end, this work pioneers an exploration into the critical
role of dataset characteristics for DARTS bi-level optimization, and then
proposes a novel Bi-level Data Pruning (BDP) paradigm that targets the weights
and architecture levels of DARTS to enhance efficiency from a data perspective.
Specifically, we introduce a new progressive data pruning strategy that
utilizes supernet prediction dynamics as the metric, to gradually prune
unsuitable samples for DARTS during the search. An effective automatic class
balance constraint is also integrated into BDP, to suppress potential class
imbalances resulting from data-efficient algorithms. Comprehensive evaluations
on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like
search space validate that BDP reduces search costs by over 50% while achieving
superior performance when applied to baseline DARTS. Besides, we demonstrate
that BDP can harmoniously integrate with advanced DARTS variants, like PC-DARTS
and $\beta$-DARTS, offering an approximately 2 times speedup with minimal
performance compromises.
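The abstract above only outlines the BDP recipe at a high level. Below is a minimal, hypothetical Python sketch of how progressive data pruning driven by supernet prediction dynamics and an automatic class-balance constraint could be wired together; the class name, the flip-count metric, and all hyper-parameters are assumptions for illustration, not the authors' implementation. The pruner would be called every few epochs during the bi-level search.
```python
# Hypothetical sketch of BDP-style progressive data pruning (names and the
# exact "prediction dynamics" metric are assumptions, not the paper's code).
import numpy as np

class ProgressivePruner:
    """Tracks supernet predictions per sample and prunes low-value samples."""

    def __init__(self, labels, num_classes, keep_ratio_per_round=0.8, min_per_class=32):
        self.labels = np.asarray(labels)
        self.num_classes = num_classes
        self.keep_ratio = keep_ratio_per_round
        self.min_per_class = min_per_class
        self.active = np.arange(len(labels))             # indices still used for search
        self.history = [[] for _ in range(len(labels))]  # per-sample prediction history

    def record(self, indices, predictions):
        """Store the supernet's predicted classes for a batch of samples."""
        for idx, pred in zip(indices, predictions):
            self.history[idx].append(int(pred))

    def _dynamics_score(self, idx):
        """Assumed metric: fraction of recorded steps where the prediction flips.
        Samples whose predictions never change are treated as less informative."""
        preds = self.history[idx]
        if len(preds) < 2:
            return 1.0  # not enough evidence yet; keep the sample
        flips = sum(p != q for p, q in zip(preds[:-1], preds[1:]))
        return flips / (len(preds) - 1)

    def prune(self):
        """Keep the most dynamic samples, subject to a per-class floor."""
        scores = np.array([self._dynamics_score(i) for i in self.active])
        order = self.active[np.argsort(-scores)]          # most dynamic first
        budget = max(1, int(len(self.active) * self.keep_ratio))
        kept, per_class = [], {c: 0 for c in range(self.num_classes)}
        for idx in order:                                 # greedy pass under the budget
            if len(kept) < budget:
                kept.append(idx)
                per_class[self.labels[idx]] += 1
        # class-balance constraint: top up under-represented classes
        for c in range(self.num_classes):
            if per_class[c] < self.min_per_class:
                extras = [i for i in order if self.labels[i] == c and i not in kept]
                kept.extend(extras[: self.min_per_class - per_class[c]])
        self.active = np.array(sorted(kept))
        return self.active
```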
Related papers
- OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength [70.76342136866413]
Differentiable architecture search (DARTS) has emerged as a promising technique for effective neural architecture search.
DARTS suffers from the well-known degeneration issue which can lead to deteriorating architectures.
We propose a novel criterion based on operation strength that estimates the importance of an operation by its effect on the final loss.
arXiv Detail & Related papers (2024-09-22T13:16:07Z)
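The OStr-DARTS summary above describes scoring each candidate operation by its effect on the final loss, without giving the exact criterion. The snippet below is a generic, hedged illustration of that idea via masking-based ablation; the supernet(x, alphas) interface is hypothetical and the masking rule is an assumption, not the paper's formula.
```python
# Generic illustration of loss-based operation importance (not necessarily the
# exact OStr-DARTS criterion, which the summary above does not spell out).
import torch

@torch.no_grad()
def operation_strength(supernet, alphas, loss_fn, val_batch):
    """Score each candidate op by how much the loss grows when it is removed."""
    x, y = val_batch
    base = loss_fn(supernet(x, alphas), y)
    strengths = torch.zeros_like(alphas)          # one score per (edge, op)
    for edge in range(alphas.shape[0]):
        for op in range(alphas.shape[1]):
            masked = alphas.clone()
            masked[edge, op] = float("-inf")      # assumes a softmax over ops inside the supernet
            strengths[edge, op] = loss_fn(supernet(x, masked), y) - base
    return strengths                              # larger = more important operation
```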
- Heterogeneous Learning Rate Scheduling for Neural Architecture Search on Long-Tailed Datasets [0.0]
We propose a novel adaptive learning rate scheduling strategy tailored for the architecture parameters of DARTS.
Our approach dynamically adjusts the learning rate of the architecture parameters based on the training epoch, preventing the disruption of well-trained representations.
arXiv Detail & Related papers (2024-06-11T07:32:25Z)
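For the heterogeneous learning-rate scheduling idea above, a minimal sketch of an epoch-dependent schedule applied only to the DARTS architecture parameters is shown below; the warm-up length and ramp rule are assumptions for illustration, since the summary does not specify the exact schedule.
```python
# Minimal sketch: keep the architecture LR low early on so weight training is
# not disrupted, then ramp it up once the supernet representations stabilize.
import torch

arch_params = [torch.zeros(14, 8, requires_grad=True)]   # placeholder alpha tensor
arch_optimizer = torch.optim.Adam(arch_params, lr=3e-4)

warmup_epochs, total_epochs = 15, 50
scheduler = torch.optim.lr_scheduler.LambdaLR(
    arch_optimizer,
    lr_lambda=lambda epoch: 0.1 if epoch < warmup_epochs
    else min(1.0, (epoch - warmup_epochs + 1) / 10),
)

for epoch in range(total_epochs):
    # ... bi-level search: update supernet weights, then update arch_params ...
    arch_optimizer.step()        # placeholder for the architecture update
    scheduler.step()             # adjust the architecture LR per epoch
```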
- IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance [41.23462863659102]
DARTS is known for its efficiency and simplicity.
However, performance collapse in DARTS results in deteriorating architectures filled with parameter-free operations.
We propose IS-DARTS to comprehensively improve DARTS and resolve the aforementioned problems.
arXiv Detail & Related papers (2023-12-19T22:45:57Z)
- Constructing Tree-based Index for Efficient and Effective Dense Retrieval [26.706985694158384]
JTR stands for Joint optimization of TRee-based index and query encoding.
We design a new unified contrastive learning loss to train the tree-based index and the query encoder in an end-to-end manner.
Experimental results show that JTR achieves better retrieval performance while retaining high system efficiency.
arXiv Detail & Related papers (2023-04-24T09:25:39Z)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple but efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS searching process.
Experimental results on NAS-Bench-201 show that our proposed method can help stabilize the searching process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
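For the $\beta$-DARTS entry above, the sketch below shows where a decay-style regularizer on the architecture parameters plugs into the search objective; the logsumexp (smoothmax) penalty and the weight lam are assumptions standing in for the exact Beta-Decay term, not a reproduction of it.
```python
# Hedged sketch of regularizing the DARTS architecture parameters during search.
import torch

def decay_style_regularizer(alphas):
    """Penalize over-confident edges; alphas has shape (num_edges, num_ops).
    The logsumexp form is an assumed stand-in for the paper's Beta-Decay term."""
    return torch.logsumexp(alphas, dim=-1).mean()

def search_loss(val_loss, alphas, lam=0.5):
    # total objective for the architecture step: validation loss + decay term
    return val_loss + lam * decay_style_regularizer(alphas)
```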
- DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture [81.82173855071312]
We propose an end-to-end solution that integrates the AutoML components and returns a ready-to-use model at the end of the search.
DHA achieves state-of-the-art (SOTA) results on various datasets, notably 77.4% accuracy on ImageNet with a cell-based search space.
arXiv Detail & Related papers (2021-09-13T08:12:50Z)
- RARTS: An Efficient First-Order Relaxed Architecture Search Method [5.491655566898372]
Differentiable architecture search (DARTS) is an effective method for data-driven neural network design based on solving a bilevel optimization problem.
We formulate a single-level alternative and a relaxed architecture search (RARTS) method that utilizes the whole dataset in architecture learning via both data and network splitting.
For the task of searching topological architectures, i.e., the edges and the operations, RARTS achieves higher accuracy and a 60% reduction in computational cost compared with second-order DARTS on CIFAR-10.
arXiv Detail & Related papers (2020-08-10T04:55:51Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
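For the DrNAS entry above, the following minimal PyTorch sketch illustrates Dirichlet-distributed architecture mixing weights with pathwise (reparameterized) gradients; the tensor shapes and the placeholder loss are assumptions for illustration, not the paper's code.
```python
# Dirichlet-distributed mixing weights with gradients flowing to the
# concentration parameters via rsample() (pathwise derivatives).
import torch
from torch.distributions import Dirichlet

num_edges, num_ops = 14, 8
# Learnable concentration parameters (kept positive via softplus).
raw_conc = torch.zeros(num_edges, num_ops, requires_grad=True)

def sample_mixing_weights():
    concentration = torch.nn.functional.softplus(raw_conc) + 1e-3
    dist = Dirichlet(concentration)
    # rsample() uses pathwise derivatives, so gradients reach raw_conc.
    return dist.rsample()          # shape: (num_edges, num_ops), rows sum to 1

weights = sample_mixing_weights()
# The sampled weights would mix candidate operation outputs in the supernet,
# e.g. out = sum(w * op(x) for w, op in zip(weights[edge], candidate_ops)).
loss = weights.pow(2).sum()        # placeholder loss just to show the gradient path
loss.backward()
print(raw_conc.grad.shape)         # gradients arrive at the Dirichlet parameters
```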
- Rethinking Performance Estimation in Neural Architecture Search [191.08960589460173]
We provide a novel yet systematic rethinking of performance estimation (PE) in a resource-constrained regime.
By combining BPE with various search algorithms, including reinforcement learning, evolutionary algorithms, random search, and differentiable architecture search, we achieve a 1,000x NAS speedup with a negligible performance drop.
arXiv Detail & Related papers (2020-05-20T09:01:44Z)