Regularizing Differentiable Architecture Search with Smooth Activation
- URL: http://arxiv.org/abs/2504.16306v1
- Date: Tue, 22 Apr 2025 22:49:38 GMT
- Title: Regularizing Differentiable Architecture Search with Smooth Activation
- Authors: Yanlin Zhou, Mostafa El-Khamy, Kee-Bong Song
- Abstract summary: Differentiable Architecture Search (DARTS) is an efficient Neural Architecture Search (NAS) method but suffers from robustness, generalization, and discrepancy issues. We propose Smooth Activation DARTS (SA-DARTS) to overcome skip dominance and discretization discrepancy challenges. We show that SA-DARTS can help improve the performance of SOTA models with fewer parameters, such as the Information Multi-distillation Network on the super-resolution task.
- Score: 10.658697052636272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable Architecture Search (DARTS) is an efficient Neural Architecture Search (NAS) method but suffers from robustness, generalization, and discrepancy issues. Much effort has gone into the performance-collapse issue caused by skip dominance, via regularization of operation weights and path weights, noise injection, and super-network redesign. At some point it becomes natural to ask whether there is a better and more elegant way to steer the search back to its intended goal -- NAS is a selection problem. In this paper, we take a simple but effective approach, named Smooth Activation DARTS (SA-DARTS), to overcome the skip-dominance and discretization-discrepancy challenges. By applying a smooth activation function to the architecture weights as an auxiliary loss, SA-DARTS mitigates the unfair advantage of weight-free operations, converges to fanned-out architecture weight values, and can recover the search process from a skip-dominant initialization. Through theoretical and empirical analysis, we demonstrate that SA-DARTS yields new state-of-the-art (SOTA) results on NAS-Bench-201, classification, and super-resolution. Further, we show that SA-DARTS can help improve the performance of SOTA models with fewer parameters, such as the Information Multi-distillation Network, on the super-resolution task.
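The mechanism the abstract describes -- a softmax relaxation over candidate operations, a smooth-activation auxiliary loss on the architecture weights, and a final discretization step -- can be sketched in a few lines. This is a minimal pure-Python illustration only: the function names, the choice of tanh as the smooth activation, the concentration-style penalty, and the toy logits are all assumptions, not the paper's exact formulation.

```python
import math

def softmax(logits):
    """DARTS continuous relaxation: an edge's output is a softmax-weighted
    mix of all candidate operations."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_activation_penalty(logits):
    """Auxiliary regularizer in the spirit of SA-DARTS (illustrative; the
    paper's exact loss may differ): pass the architecture weights through a
    smooth activation (tanh, assumed here) and penalize concentration, so a
    weight-free op such as skip-connect cannot monopolize the mixed edge."""
    return sum(math.tanh(w) ** 2 for w in softmax(logits))

def discretize(logits):
    """Final architecture selection: NAS is a selection problem, so each
    edge keeps only the operation with the largest architecture weight."""
    weights = softmax(logits)
    return weights.index(max(weights))

# A skip-dominant initialization incurs a larger penalty than a balanced
# one, so minimizing (task loss + lambda * penalty) pulls the search away
# from skip dominance before discretization.
skip_dominant = [4.0, 0.0, 0.0, 0.0]  # hypothetical logits, skip-connect first
balanced = [0.0, 0.0, 0.0, 0.0]
```

With these toy logits, `smooth_activation_penalty(skip_dominant)` exceeds `smooth_activation_penalty(balanced)`, which is the qualitative behavior needed to counteract a skip-dominant start.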
Related papers
- OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength [70.76342136866413]
Differentiable architecture search (DARTS) has emerged as a promising technique for effective neural architecture search.
DARTS suffers from the well-known degeneration issue which can lead to deteriorating architectures.
We propose a novel criterion based on operation strength that estimates the importance of an operation by its effect on the final loss.
arXiv Detail & Related papers (2024-09-22T13:16:07Z)
- IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance [41.23462863659102]
DARTS is known for its efficiency and simplicity.
However, performance collapse in DARTS results in deteriorating architectures filled with parameter-free operations.
We propose IS-DARTS to comprehensively improve DARTS and resolve the aforementioned problems.
arXiv Detail & Related papers (2023-12-19T22:45:57Z)
- $\beta$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search [96.99525100285084]
A regularization method, Beta-Decay, is proposed to regularize the DARTS-based NAS search process (i.e., $\beta$-DARTS).
In-depth theoretical analyses on how it works and why it works are provided.
arXiv Detail & Related papers (2023-01-16T12:30:32Z)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple-but-efficient regularization method, termed as Beta-Decay, to regularize the DARTS-based NAS searching process.
Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and makes the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
- MS-DARTS: Mean-Shift Based Differentiable Architecture Search [11.115656548869199]
We propose a Mean-Shift based DARTS (MS-DARTS) to improve stability based on sampling and perturbation.
MS-DARTS achieves higher performance than other state-of-the-art NAS methods with reduced search cost.
arXiv Detail & Related papers (2021-08-23T08:06:45Z)
- D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at cell-level.
arXiv Detail & Related papers (2021-08-20T09:07:01Z)
- ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation [106.04777600352743]
Differentiable architecture search (DARTS) is largely hindered by its substantial memory cost since the entire supernet resides in the memory.
Single-path DARTS addresses this by choosing only a single-path submodel at each step.
Besides being memory-friendly, it also has a low computational cost.
We propose a new algorithm called RObustifying Memory-Efficient NAS (ROME) to give a cure.
arXiv Detail & Related papers (2020-11-23T06:34:07Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimization.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
- Stabilizing Differentiable Architecture Search via Perturbation-based Regularization [99.81980366552408]
We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when discretizing the final architecture, is an essential factor that causes instability.
We propose a perturbation-based regularization - SmoothDARTS (SDARTS) - to smooth the loss landscape and improve the generalizability of DARTS-based methods.
arXiv Detail & Related papers (2020-02-12T23:46:58Z)
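The SmoothDARTS entry above turns on one idea: averaging the loss over random perturbations of the architecture parameters flattens a precipitous loss landscape. A toy one-dimensional sketch shows the effect; the loss shape, noise radius, and sample count below are invented for illustration and are not taken from the paper.

```python
import math
import random

def sharp_loss(alpha):
    """A toy 1-D validation-loss landscape with a precipitous minimum at 0."""
    return 1.0 - math.exp(-50.0 * alpha * alpha)

def smoothed_loss(alpha, eps=0.3, n=2000, seed=0):
    """SDARTS-style random smoothing (illustrative): average the loss over
    random perturbations of the architecture parameter.  The averaged
    landscape varies far less steeply around the minimum than the sharp one."""
    rng = random.Random(seed)  # fixed seed so the estimate is deterministic
    return sum(sharp_loss(alpha + rng.uniform(-eps, eps)) for _ in range(n)) / n
```

Moving from alpha = 0.0 to alpha = 0.1 changes `sharp_loss` by roughly 0.39 but changes `smoothed_loss` only slightly, which is the stabilizing behavior the entry describes.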
This list is automatically generated from the titles and abstracts of the papers in this site.