Partial Connection Based on Channel Attention for Differentiable Neural
Architecture Search
- URL: http://arxiv.org/abs/2208.00791v1
- Date: Mon, 1 Aug 2022 12:05:55 GMT
- Title: Partial Connection Based on Channel Attention for Differentiable Neural
Architecture Search
- Authors: Yu Xue, Jiafeng Qin
- Abstract summary: Differentiable neural architecture search (DARTS) is a gradient-guided search method.
The parameters of some weight-equipped operations may not be trained well in the initial stage.
A partial channel connection based on channel attention for differentiable neural architecture search (ADARTS) is proposed.
- Score: 1.1125818448814198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable neural architecture search (DARTS), as a gradient-guided
search method, greatly reduces the cost of computation and speeds up the
search. In DARTS, the architecture parameters are introduced to the candidate
operations, but the parameters of some weight-equipped operations may not be
trained well in the initial stage, which causes unfair competition between
candidate operations. Weight-free operations then appear in large numbers, which
leads to performance collapse. In addition, a large amount of memory is occupied
during supernet training, resulting in low memory utilization. In this paper, a
partial channel connection based on channel attention for
differentiable neural architecture search (ADARTS) is proposed. Some channels
with higher weights are selected through the attention mechanism and sent into
the operation space, while the other channels are directly concatenated with the
processed channels. Selecting a few channels with higher attention weights can
better transmit important feature information into the search space and greatly
improve search efficiency and memory utilization. The instability of network
structure caused by random selection can also be avoided. The experimental
results show that ADARTS achieved 2.46% and 17.06% classification error rates
on CIFAR-10 and CIFAR-100, respectively. ADARTS effectively solves the problem
of too many skip connections appearing in the search process and obtains
network structures with better performance.
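To make the mechanism described above concrete, the following is a minimal PyTorch-style sketch of an attention-guided partial channel connection inside a DARTS mixed operation: a channel-attention module scores the input channels, the highest-scoring channels pass through the softmax-weighted mixture of candidate operations, and the remaining channels bypass the operation space and are concatenated back. The squeeze-and-excitation-style attention, the fixed selection ratio, and all names (AttentionPartialMixedOp, k_ratio, candidate_ops) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionPartialMixedOp(nn.Module):
    """Sketch: DARTS mixed op applied only to the top-k attention-weighted channels."""

    def __init__(self, channels, candidate_ops, k_ratio=0.25, reduction=4):
        super().__init__()
        self.k = max(1, int(channels * k_ratio))   # channels routed into the operation space
        self.ops = nn.ModuleList(candidate_ops)    # candidate ops, each mapping k -> k channels
        # Squeeze-and-excitation style channel attention (an assumption of this sketch).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x, alpha):
        # alpha: architecture parameters, one logit per candidate operation.
        w = self.attn(x)                            # (N, C, 1, 1) channel weights
        scores = w.mean(dim=0).flatten()            # batch-averaged importance per channel
        topk = torch.topk(scores, self.k).indices
        mask = torch.ones(x.size(1), dtype=torch.bool, device=x.device)
        mask[topk] = False

        x_sel = x[:, topk] * w[:, topk]             # selected channels, reweighted by attention
        x_rest = x[:, mask]                         # bypassed channels

        # DARTS continuous relaxation: softmax-weighted sum of candidate operations.
        weights = F.softmax(alpha, dim=-1)
        out_sel = sum(wi * op(x_sel) for wi, op in zip(weights, self.ops))

        # Bypassed channels are concatenated directly with the processed channels
        # (stride-1 case; a reduction cell would also need to downsample x_rest).
        return torch.cat([out_sel, x_rest], dim=1)
```

With a selection ratio of, say, 1/4, only a quarter of the channels flow through every candidate operation, which is where the memory and efficiency gains claimed in the abstract come from, while the attention scores keep the selection from being purely random.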
Related papers
- Efficient Visual Fault Detection for Freight Train via Neural Architecture Search with Data Volume Robustness [6.5769476554745925]
We propose an efficient NAS-based framework for visual fault detection of freight trains.
First, we design a scale-aware search space for discovering an effective receptive field in the head.
Second, we explore the robustness of data volume to reduce search costs based on the specifically designed search space.
(arXiv 2024-05-27)
- Revisiting Random Channel Pruning for Neural Network Compression [159.99002793644163]
Channel (or 3D filter) pruning serves as an effective way to accelerate the inference of neural networks.
In this paper, we try to determine the channel configuration of the pruned models by random search.
We show that this simple strategy works quite well compared with other channel pruning methods.
(arXiv 2022-05-11; a random-configuration search sketch is given after this list)
- $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple but efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS search process.
Experimental results on NAS-Bench-201 show that our proposed method helps stabilize the search process and makes the searched network more transferable across different datasets.
(arXiv 2022-03-03)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is well suited to adaptively pruning efficient networks for various classification subtasks, easing the deployment and use of deep networks in real-world applications.
(arXiv 2021-10-21)
- RepNAS: Searching for Efficient Re-parameterizing Blocks [4.146471448631912]
RepNAS, a one-stage NAS approach, is presented to efficiently search for the optimal diverse branch block (ODBB) of each layer under a branch-number constraint.
Our experimental results show that the searched ODBB can easily surpass the manually designed diverse branch block (DBB) with efficient training.
(arXiv 2021-09-08)
- D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at cell-level.
(arXiv 2021-08-20)
- Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
(arXiv 2021-08-02; a Fisher-importance sketch is given after this list)
- Partially-Connected Differentiable Architecture Search for Deepfake and Spoofing Detection [14.792884010821762]
This paper reports the first successful application of a differentiable architecture search (DARTS) approach to the deepfake and spoofing detection problems.
DARTS operates upon a continuous, differentiable search space which enables both the architecture and parameters to be optimised via gradient descent.
(arXiv 2021-04-07; a bilevel-update sketch is given after this list)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
(arXiv 2020-10-13)
- BS-NAS: Broadening-and-Shrinking One-Shot NAS with Searchable Numbers of Channels [25.43631259260473]
One-Shot methods have evolved into one of the most popular methods in Neural Architecture Search (NAS) due to weight sharing and single training of a supernet.
Existing methods generally suffer from two issues: a predetermined number of channels in each layer, which is suboptimal; and model averaging effects with poor ranking correlation caused by weight coupling and a continuously expanding search space.
A Broadening-and-Shrinking One-Shot NAS (BS-NAS) framework is proposed, in which 'broadening' refers to broadening the search space.
(arXiv 2020-03-22)
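For the "Revisiting Random Channel Pruning" entry above, a hedged sketch of what random search over channel configurations can look like: sample per-layer keep ratios under a FLOPs-style budget, evaluate each pruned candidate, and keep the best. estimate_flops and train_and_evaluate are placeholders for the cost model and evaluation protocol, not the paper's exact procedure.

```python
import random


def random_channel_search(base_channels, flops_budget, estimate_flops,
                          train_and_evaluate, num_trials=100, min_keep=0.25):
    """Randomly sample per-layer channel counts and keep the best configuration found."""
    best_cfg, best_acc = None, float("-inf")
    for _ in range(num_trials):
        # Sample a keep ratio per layer and round to an integer channel count.
        cfg = [max(1, int(c * random.uniform(min_keep, 1.0))) for c in base_channels]
        if estimate_flops(cfg) > flops_budget:
            continue                      # reject configurations over the budget
        acc = train_and_evaluate(cfg)     # e.g. brief fine-tuning followed by validation
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```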
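For the "Group Fisher Pruning" entry, a generic sketch of Fisher-style channel importance: a per-channel gate (initialised to one) multiplies a convolution's output, and the squared gradient of the loss with respect to the gate is accumulated over batches. The paper's grouped treatment of coupled channels is not reproduced here; this only illustrates the single-channel case.

```python
import torch
import torch.nn as nn


class GatedConv(nn.Module):
    """Wrap a convolution with a per-channel gate used only to measure importance."""

    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        self.gate = nn.Parameter(torch.ones(conv.out_channels))   # per-channel gate
        self.importance = torch.zeros(conv.out_channels)          # accumulated Fisher score

    def forward(self, x):
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

    def accumulate_importance(self):
        # Call after loss.backward(): the squared gate gradient approximates the
        # empirical Fisher information, i.e. the loss increase from removing the channel.
        if self.gate.grad is not None:
            self.importance += self.gate.grad.detach().pow(2).cpu()
```

Channels with the smallest accumulated scores are the pruning candidates; in the grouped variant, coupled channels (e.g. those tied across a residual connection) would share a single score.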
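The deepfake/spoofing entry notes that DARTS optimises both the architecture and the network weights by gradient descent. Below is a minimal sketch of the standard first-order alternating update, with architecture parameters updated on validation batches and operation weights on training batches; the two optimisers and the cross-entropy loss are assumptions of the sketch, not that paper's setup.

```python
import torch.nn.functional as F


def search_epoch(model, train_loader, valid_loader, w_optim, alpha_optim, device):
    """One epoch of first-order DARTS-style alternating optimisation."""
    model.train()
    for (x_tr, y_tr), (x_va, y_va) in zip(train_loader, valid_loader):
        x_tr, y_tr = x_tr.to(device), y_tr.to(device)
        x_va, y_va = x_va.to(device), y_va.to(device)

        # Step 1: update the architecture parameters (alpha) on a validation batch.
        alpha_optim.zero_grad()
        F.cross_entropy(model(x_va), y_va).backward()
        alpha_optim.step()

        # Step 2: update the operation weights (w) on a training batch.
        w_optim.zero_grad()
        F.cross_entropy(model(x_tr), y_tr).backward()
        w_optim.step()
```

In practice the two optimisers are built over disjoint parameter groups (architecture parameters vs. operation weights), so each step only moves its own group.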