Partial Connection Based on Channel Attention for Differentiable Neural
Architecture Search
- URL: http://arxiv.org/abs/2208.00791v1
- Date: Mon, 1 Aug 2022 12:05:55 GMT
- Title: Partial Connection Based on Channel Attention for Differentiable Neural
Architecture Search
- Authors: Yu Xue, Jiafeng Qin
- Abstract summary: Differentiable neural architecture search (DARTS) is a gradient-guided search method.
The parameters of some weight-equipped operations may not be trained well in the initial stage.
A partial channel connection based on channel attention for differentiable neural architecture search (ADARTS) is proposed.
- Score: 1.1125818448814198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable neural architecture search (DARTS), as a gradient-guided
search method, greatly reduces the cost of computation and speeds up the
search. In DARTS, the architecture parameters are introduced to the candidate
operations, but the parameters of some weight-equipped operations may not be
trained well in the initial stage, which causes unfair competition between
candidate operations. Weight-free operations then appear in large numbers, which
leads to performance collapse. In addition, a large amount of memory is occupied
during supernet training, resulting in low memory utilization. In this paper, a
partial channel connection based on channel attention for
differentiable neural architecture search (ADARTS) is proposed. Some channels
with higher weights are selected through the attention mechanism and sent into
the operation space, while the other channels are directly concatenated with the
processed channels. Selecting a few channels with higher attention weights can
better transmit important feature information into the search space and greatly
improve search efficiency and memory utilization. The instability of network
structure caused by random selection can also be avoided. The experimental
results show that ADARTS achieved 2.46% and 17.06% classification error rates
on CIFAR-10 and CIFAR-100, respectively. ADARTS effectively solves the problem
of too many skip connections appearing in the search process and obtains
network structures with better performance.
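To make the mechanism described above concrete, the following is a minimal PyTorch-style sketch of an attention-guided partial channel connection inside a DARTS mixed operation: a channel-attention module scores the input channels, the highest-scoring channels pass through the softmax-weighted mixture of candidate operations, and the remaining channels bypass the operation space and are concatenated back. The squeeze-and-excitation-style attention, the fixed selection ratio, and all names (AttentionPartialMixedOp, k_ratio, candidate_ops) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPartialMixedOp(nn.Module):
    """Sketch: DARTS mixed op applied only to the top-k attention-weighted channels."""

    def __init__(self, channels, candidate_ops, k_ratio=0.25, reduction=4):
        super().__init__()
        self.k = max(1, int(channels * k_ratio))   # channels routed into the operation space
        self.ops = nn.ModuleList(candidate_ops)    # candidate ops, each mapping k -> k channels
        # Squeeze-and-excitation style channel attention (an assumption of this sketch).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x, alpha):
        # alpha: architecture parameters, one logit per candidate operation.
        w = self.attn(x)                            # (N, C, 1, 1) channel weights
        scores = w.mean(dim=0).flatten()            # batch-averaged importance per channel
        topk = torch.topk(scores, self.k).indices
        mask = torch.ones(x.size(1), dtype=torch.bool, device=x.device)
        mask[topk] = False

        x_sel = x[:, topk] * w[:, topk]             # selected channels, reweighted by attention
        x_rest = x[:, mask]                         # bypassed channels

        # DARTS continuous relaxation: softmax-weighted sum of candidate operations.
        weights = F.softmax(alpha, dim=-1)
        out_sel = sum(wi * op(x_sel) for wi, op in zip(weights, self.ops))

        # Bypassed channels are concatenated directly with the processed channels
        # (stride-1 case; a reduction cell would also need to downsample x_rest).
        return torch.cat([out_sel, x_rest], dim=1)
```

With a selection ratio of, say, 1/4, only a quarter of the channels flow through every candidate operation, which is where the memory and efficiency gains claimed in the abstract come from, while the attention scores keep the selection from being purely random.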
Related papers
- Efficient Visual Fault Detection for Freight Train via Neural Architecture Search with Data Volume Robustness [6.5769476554745925]
We propose an efficient NAS-based framework for visual fault detection of freight trains.
First, we design a scale-aware search space for discovering an effective receptive field in the head.
Second, we explore the robustness of data volume to reduce search costs based on the specifically designed search space.
(arXiv 2024-05-27)
- Revisiting Random Channel Pruning for Neural Network Compression [159.99002793644163]
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of neural networks.
In this paper, we try to determine the channel configuration of the pruned models by random search.
We show that this simple strategy works quite well compared with other channel pruning methods.
(arXiv 2022-05-11; a random-configuration search sketch is given after this list)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple but efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS search process.
Experimental results on NAS-Bench-201 show that our proposed method helps stabilize the search process and makes the searched network more transferable across different datasets.
(arXiv 2022-03-03)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is well suited to adaptively pruning efficient networks for various classification subtasks, easing the deployment and use of deep networks in real-world applications.
(arXiv 2021-10-21)
- RepNAS: Searching for Efficient Re-parameterizing Blocks [4.146471448631912]
RepNAS, a one-stage NAS approach, is presented to efficiently search for the optimal diverse branch block (ODBB) of each layer under a branch-number constraint.
Our experimental results show that the searched ODBB can easily surpass the manually designed diverse branch block (DBB) with efficient training.
(arXiv 2021-09-08)
- D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at cell-level.
(arXiv 2021-08-20)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
(arXiv 2021-08-02; a Fisher-importance sketch is given after this list)
- Partially-Connected Differentiable Architecture Search for Deepfake and Spoofing Detection [14.792884010821762]
This paper reports the first successful application of a differentiable architecture search (DARTS) approach to the deepfake and spoofing detection problems.
DARTS operates upon a continuous, differentiable search space which enables both the architecture and parameters to be optimised via gradient descent.
(arXiv 2021-04-07; a bilevel-update sketch is given after this list)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
(arXiv 2020-10-13)
- BS-NAS: Broadening-and-Shrinking One-Shot NAS with Searchable Numbers of Channels [25.43631259260473]
One-Shot methods have evolved into one of the most popular methods in Neural Architecture Search (NAS) due to weight sharing and single training of a supernet.
Existing methods generally suffer from two issues: a predetermined number of channels in each layer, which is suboptimal; and model averaging effects with poor ranking correlation caused by weight coupling and a continuously expanding search space.
A Broadening-and-Shrinking One-Shot NAS (BS-NAS) framework is proposed, in which 'broadening' refers to broadening the search space.
(arXiv 2020-03-22)
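For the "Revisiting Random Channel Pruning" entry above, a hedged sketch of what random search over channel configurations can look like: sample per-layer keep ratios under a FLOPs-style budget, evaluate each pruned candidate, and keep the best. estimate_flops and train_and_evaluate are placeholders for the cost model and evaluation protocol, not the paper's exact procedure.

```python
import random


def random_channel_search(base_channels, flops_budget, estimate_flops,
                          train_and_evaluate, num_trials=100, min_keep=0.25):
    """Randomly sample per-layer channel counts and keep the best configuration found."""
    best_cfg, best_acc = None, float("-inf")
    for _ in range(num_trials):
        # Sample a keep ratio per layer and round to an integer channel count.
        cfg = [max(1, int(c * random.uniform(min_keep, 1.0))) for c in base_channels]
        if estimate_flops(cfg) > flops_budget:
            continue                      # reject configurations over the budget
        acc = train_and_evaluate(cfg)     # e.g. brief fine-tuning followed by validation
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```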
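For the "Group Fisher Pruning" entry, a generic sketch of Fisher-style channel importance: a per-channel gate (initialised to one) multiplies a convolution's output, and the squared gradient of the loss with respect to the gate is accumulated over batches. The paper's grouped treatment of coupled channels is not reproduced here; this only illustrates the single-channel case.

```python
import torch
import torch.nn as nn


class GatedConv(nn.Module):
    """Wrap a convolution with a per-channel gate used only to measure importance."""

    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        self.gate = nn.Parameter(torch.ones(conv.out_channels))   # per-channel gate
        self.importance = torch.zeros(conv.out_channels)          # accumulated Fisher score

    def forward(self, x):
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

    def accumulate_importance(self):
        # Call after loss.backward(): the squared gate gradient approximates the
        # empirical Fisher information, i.e. the loss increase from removing the channel.
        if self.gate.grad is not None:
            self.importance += self.gate.grad.detach().pow(2).cpu()
```

Channels with the smallest accumulated scores are the pruning candidates; in the grouped variant, coupled channels (e.g. those tied across a residual connection) would share a single score.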
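The deepfake/spoofing entry notes that DARTS optimises both the architecture and the network weights by gradient descent. Below is a minimal sketch of the standard first-order alternating update, with architecture parameters updated on validation batches and operation weights on training batches; the two optimisers and the cross-entropy loss are assumptions of the sketch, not that paper's setup.

```python
import torch.nn.functional as F


def search_epoch(model, train_loader, valid_loader, w_optim, alpha_optim, device):
    """One epoch of first-order DARTS-style alternating optimisation."""
    model.train()
    for (x_tr, y_tr), (x_va, y_va) in zip(train_loader, valid_loader):
        x_tr, y_tr = x_tr.to(device), y_tr.to(device)
        x_va, y_va = x_va.to(device), y_va.to(device)

        # Step 1: update the architecture parameters (alpha) on a validation batch.
        alpha_optim.zero_grad()
        F.cross_entropy(model(x_va), y_va).backward()
        alpha_optim.step()

        # Step 2: update the operation weights (w) on a training batch.
        w_optim.zero_grad()
        F.cross_entropy(model(x_tr), y_tr).backward()
        w_optim.step()
```

In practice the two optimisers are built over disjoint parameter groups (architecture parameters vs. operation weights), so each step only moves its own group.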