Related papers: Adaptive Channel Allocation for Robust Differentiable Architecture Search

URL: http://arxiv.org/abs/2204.04681v2
Date: Mon, 23 Dec 2024 04:48:08 GMT
Title: Adaptive Channel Allocation for Robust Differentiable Architecture Search
Authors: Chao Li, Jia Ning, Han Hu, Kun He
Abstract summary: Differentiable ARchiTecture Search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency. The excessive accumulation of the skip connection, when training epochs become large, makes it suffer from weak stability and low robustness. We propose a more subtle and direct approach that no longer explicitly searches for skip connections in the search stage.
Score: 22.898344333732044
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Differentiable ARchiTecture Search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency. However, the excessive accumulation of the skip connection, when training epochs become large, makes it suffer from weak stability and low robustness, thus limiting its practical applications. Many works have attempted to restrict the accumulation of skip connections by indicators or manual design. These methods, however, are susceptible to human priors and hyper-parameters. In this work, we suggest a more subtle and direct approach that no longer explicitly searches for skip connections in the search stage, based on the paradox that skip connections were proposed to guarantee the performance of very deep networks, but the networks in the search stage of differentiable architecture search are actually very shallow. Instead, by introducing channel importance ranking and channel allocation strategy, the skip connections are implicitly searched and automatically refilled unimportant channels in the evaluation stage. Our method, dubbed Adaptive Channel Allocation (ACA) strategy, is a general-purpose approach for differentiable architecture search, which universally works in DARTS variants without introducing human priors, indicators, or hyper-parameters. Extensive experiments on various datasets and DARTS variants verify that the ACA strategy is the most effective one among existing methods in improving robustness and dealing with the collapse issue when training epochs become large.

Related papers

SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis [89.99161034065614]
Retrieval-augmented generation (RAG) systems have advanced large language models (LLMs) in complex deep search scenarios. Existing approaches face critical limitations that lack high-quality training trajectories and suffer from distributional mismatches. This paper introduces SimpleDeepSearcher, a framework that bridges the gap through strategic data engineering rather than complex training paradigms.
arXiv Detail & Related papers (2025-05-22T16:05:02Z)
Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems. This work considers AD in network flows using incomplete measurements. We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective. Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882]
This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization. We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric. Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
arXiv Detail & Related papers (2023-12-21T02:48:44Z)
Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (NAS). Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher. Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
Robustifying DARTS by Eliminating Information Bypass Leakage via Explicit Sparse Regularization [8.93957397187611]
Differentiable architecture search (DARTS) is a promising end-to-end NAS method. Recent studies cast doubt on the basic underlying hypotheses of DARTS. We propose a novel sparse-regularized approximation and an efficient mixed-sparsity training scheme to robustify DARTS.
arXiv Detail & Related papers (2023-06-12T04:11:37Z)
Operation-level Progressive Differentiable Architecture Search [19.214462477848535]
We propose operation-level progressive differentiable neural architecture search (OPP-DARTS) to avoid skip-connection aggregation. Our method's performance on CIFAR-10 is superior to the architecture found by standard DARTS.
arXiv Detail & Related papers (2023-02-11T09:18:01Z)
$\Lambda$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells [11.777101481512423]
Differentiable neural architecture search (DARTS) is a popular method for neural architecture search (NAS). We show that DARTS suffers from a specific structural flaw due to its weight-sharing framework that limits the convergence of DARTS to saturation points of the softmax function. We propose two new regularization terms that aim to prevent performance collapse by harmonizing operation selection via aligning gradients of layers.
arXiv Detail & Related papers (2022-10-14T17:54:01Z)
Partial Connection Based on Channel Attention for Differentiable Neural Architecture Search [1.1125818448814198]
Differentiable neural architecture search (DARTS) is a gradient-guided search method. The parameters of some weight-equipped operations may not be trained well in the initial stage. A partial channel connection based on channel attention for differentiable neural architecture search (ADARTS) is proposed.
arXiv Detail & Related papers (2022-08-01T12:05:55Z)
$\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple-but-efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS searching process. Experimental results on NAS-Bench-201 show that our proposed method can help to stabilize the searching process and make the searched network more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z)
CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference. We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms. Because of its class-aware property, CATRO is suitable to prune efficient networks adaptively for various classification subtasks, enhancing handy deployment and usage of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
Learning to Perform Downlink Channel Estimation in Massive MIMO Systems [72.76968022465469]
We study downlink (DL) channel estimation in a Massive multiple-input multiple-output (MIMO) system. A common approach is to use the mean value as the estimate, motivated by channel hardening. We propose two novel estimation methods.
arXiv Detail & Related papers (2021-09-06T13:42:32Z)
iDARTS: Improving DARTS by Node Normalization and Decorrelation Discretization [51.489024258966886]
Differentiable ARchiTecture Search (DARTS) uses a continuous relaxation of network representation and dramatically accelerates Neural Architecture Search (NAS) by almost thousands of times in GPU-day. However, the searching process of DARTS is unstable, which suffers severe degradation when training epochs become large. We propose an improved version of DARTS, namely iDARTS, to deal with the two problems.
arXiv Detail & Related papers (2021-08-25T02:23:30Z)
MS-DARTS: Mean-Shift Based Differentiable Architecture Search [11.115656548869199]
We propose a Mean-Shift based DARTS (MS-DARTS) to improve stability based on sampling and perturbation. MS-DARTS achieves higher performance over other state-of-the-art NAS methods with reduced search cost.
arXiv Detail & Related papers (2021-08-23T08:06:45Z)
Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network. There are two major challenges in the current one-step approaches. We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators [74.21019737169675]
Differentiable architecture search suffers from long-standing performance instability. Indicators such as Hessian eigenvalues are proposed as a signal to stop searching before the performance collapses. In this paper, we undertake a more subtle and direct approach to resolve the collapse.
arXiv Detail & Related papers (2020-09-02T12:54:13Z)
Theory-Inspired Path-Regularized Differential Network Architecture Search [206.93821077400733]
We study the impact of skip connections to fast network optimization and its competitive advantage over other types of operations in differential architecture search (DARTS). We propose a theory-inspired path-regularized DARTS that consists of two key modules: (i) a differential group-structured sparse binary gate introduced for each operation to avoid unfair competition among operations, and (ii) a path-depth-wise regularization used to incite search exploration for deep architectures that converge slower than shallow ones.
arXiv Detail & Related papers (2020-06-30T05:28:23Z)
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization [99.81980366552408]
We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when distilling the final architecture, is an essential factor that causes instability. We propose a perturbation-based regularization - SmoothDARTS (SDARTS) - to smooth the loss landscape and improve the generalizability of DARTS-based methods.
arXiv Detail & Related papers (2020-02-12T23:46:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.