Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
- URL: http://arxiv.org/abs/2002.05283v3
- Date: Tue, 12 Jan 2021 19:17:24 GMT
- Title: Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
- Authors: Xiangning Chen, Cho-Jui Hsieh
- Abstract summary: We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when discretizing the final architecture, is an essential factor that causes instability.
We propose a perturbation-based regularization - SmoothDARTS (SDARTS) - to smooth the loss landscape and improve the generalizability of DARTS-based methods.
- Score: 99.81980366552408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable architecture search (DARTS) is a prevailing NAS solution to
identify architectures. Based on the continuous relaxation of the architecture
space, DARTS learns a differentiable architecture weight and largely reduces
the search cost. However, its stability has been challenged for yielding
deteriorating architectures as the search proceeds. We find that the
precipitous validation loss landscape, which leads to a dramatic performance
drop when discretizing the final architecture, is an essential factor that causes
instability. Based on this observation, we propose a perturbation-based
regularization, SmoothDARTS (SDARTS), to smooth the loss landscape and improve
the generalizability of DARTS-based methods. In particular, our new
formulations stabilize DARTS-based methods by either random smoothing or
adversarial attack. The search trajectory on NAS-Bench-1Shot1 demonstrates the
effectiveness of our approach and, due to the improved stability, we achieve
performance gains across various search spaces on 4 datasets. Furthermore, we
mathematically show that SDARTS implicitly regularizes the Hessian norm of the
validation loss, which accounts for a smoother loss landscape and improved
performance.
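The sketch below illustrates how the two SDARTS formulations described in the abstract could fit into a DARTS-style search loop; it is a minimal sketch, not the authors' released code. Random smoothing draws a uniform perturbation of the architecture weights, while the adversarial variant takes a few projected gradient ascent steps on the training loss. The helpers `train_loss`, `val_loss`, and the `model.alpha` attribute are illustrative placeholders, not names from the paper or any library.

```python
# Hedged sketch of SDARTS-style perturbation-based regularization.
# `model.alpha`, `train_loss`, and `val_loss` are assumed placeholders.
import torch

def perturb_alpha(model, batch, mode="rs", epsilon=1e-3, adv_steps=7, adv_lr=1e-2):
    """Return a perturbed copy of the architecture weights alpha."""
    alpha = model.alpha.detach()
    if mode == "rs":
        # Random smoothing: uniform noise inside the epsilon ball.
        return alpha + torch.empty_like(alpha).uniform_(-epsilon, epsilon)
    # Adversarial attack: a few PGD-style ascent steps on the training loss.
    delta = torch.zeros_like(alpha, requires_grad=True)
    for _ in range(adv_steps):
        loss = train_loss(model, batch, alpha=alpha + delta)  # placeholder helper
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += adv_lr * grad.sign()       # ascend the loss
            delta.clamp_(-epsilon, epsilon)     # project back to the epsilon ball
    return alpha + delta.detach()

def search_step(model, w_optimizer, a_optimizer, train_batch, val_batch, mode="rs"):
    # 1) Architecture step: update alpha on the validation loss, as in DARTS.
    a_optimizer.zero_grad()
    val_loss(model, val_batch).backward()       # placeholder helper
    a_optimizer.step()

    # 2) Weight step: train the network weights under a perturbed alpha, so the
    #    loss stays low in a neighborhood of the current architecture weights.
    perturbed = perturb_alpha(model, train_batch, mode=mode)
    w_optimizer.zero_grad()
    train_loss(model, train_batch, alpha=perturbed).backward()
    w_optimizer.step()
```

Intuitively, training the weights under perturbed architecture parameters keeps the loss flat within the epsilon ball around alpha; the paper argues this implicitly penalizes the Hessian norm of the validation loss, which is what produces the smoother landscape.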
Related papers
- OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength [70.76342136866413]
Differentiable architecture search (DARTS) has emerged as a promising technique for effective neural architecture search.
DARTS suffers from the well-known degeneration issue, which can lead to deteriorating architectures.
We propose a novel criterion based on operation strength that estimates the importance of an operation by its effect on the final loss.
arXiv Detail & Related papers (2024-09-22T13:16:07Z)
- $\Lambda$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells [11.777101481512423]
Differentiable neural architecture search (DARTS) is a popular method for neural architecture search (NAS).
We show that DARTS suffers from a specific structural flaw due to its weight-sharing framework that limits the convergence of DARTS to saturation points of the softmax function.
We propose two new regularization terms that aim to prevent performance collapse by harmonizing operation selection via aligning gradients of layers.
arXiv Detail & Related papers (2022-10-14T17:54:01Z)
- Enhancing the Robustness, Efficiency, and Diversity of Differentiable Architecture Search [25.112048502327738]
Differentiable architecture search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency.
Many works attempt to restrict the accumulation of skip connections by indicators or manual design.
We suggest a more subtle and direct approach that removes skip connections from the operation space.
arXiv Detail & Related papers (2022-04-10T13:25:36Z)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple-but-efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS searching process.
Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
- iDARTS: Improving DARTS by Node Normalization and Decorrelation Discretization [51.489024258966886]
Differentiable ARchiTecture Search (DARTS) uses a continuous relaxation of network representation and dramatically accelerates Neural Architecture Search (NAS) by almost thousands of times in GPU-days.
However, the search process of DARTS is unstable and suffers severe degradation when the number of training epochs becomes large.
We propose an improved version of DARTS, namely iDARTS, to deal with these two problems.
arXiv Detail & Related papers (2021-08-25T02:23:30Z)
- MS-DARTS: Mean-Shift Based Differentiable Architecture Search [11.115656548869199]
We propose a Mean-Shift based DARTS (MS-DARTS) to improve stability based on sampling and perturbation.
MS-DARTS achieves higher performance than other state-of-the-art NAS methods with reduced search cost.
arXiv Detail & Related papers (2021-08-23T08:06:45Z)
- $\mu$DARTS: Model Uncertainty-Aware Differentiable Architecture Search [8.024434062411943]
We introduce concrete dropout within DARTS cells and include a Monte-Carlo regularizer within the training loss to optimize the concrete dropout probabilities.
Experiments on CIFAR10, CIFAR100, SVHN, and ImageNet verify the effectiveness of $\mu$DARTS in improving accuracy and reducing uncertainty.
arXiv Detail & Related papers (2021-07-24T01:09:20Z)
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream of neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weight as random variables modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers (a minimal sketch of this idea follows the list).
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.