G-DARTS-A: Groups of Channel Parallel Sampling with Attention
- URL: http://arxiv.org/abs/2010.08360v1
- Date: Fri, 16 Oct 2020 12:58:08 GMT
- Title: G-DARTS-A: Groups of Channel Parallel Sampling with Attention
- Authors: Zhaowen Wang, Wei Zhang, Zhiming Wang
- Abstract summary: We propose an approach named Group-DARTS with Attention (G-DARTS-A) that uses multiple groups of channels for searching.
Inspired by the partial sampling strategy of PC-DARTS, we use groups of channels to sample the super-network and perform a more efficient search.
- Score: 22.343277522403742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable Architecture Search (DARTS) provides a gradient-based baseline for searching
effective network architectures, but it is accompanied by a huge
computational overhead when searching and training the network architecture.
Recently, many novel works have improved DARTS. In particular,
Partially-Connected DARTS (PC-DARTS) proposed the partial channel sampling
technique, which achieved good results. In this work, we found that the backbone
provided by DARTS is prone to overfitting. To mitigate this problem, we propose
an approach named Group-DARTS with Attention (G-DARTS-A), using multiple groups
of channels for searching. Inspired by the partial sampling strategy of
PC-DARTS, we use groups of channels to sample the super-network and perform a more
efficient search while maintaining the relative integrity of the network
information. In order to relieve the competition between channel groups and
keep the channels balanced, we follow the attention mechanism of the
Squeeze-and-Excitation Network. Each group of channels shares the defined weights,
so each group can provide a different suggestion for the search. The searched
architecture is more powerful and better adapted to different deployments.
Specifically, by only using the attention module on DARTS we achieved an error
rate of 2.82%/16.36% on CIFAR10/100 with 0.3 GPU-days for the search process on
CIFAR10. Applying our G-DARTS-A to DARTS/PC-DARTS, an error rate of 2.57%/2.61% on
CIFAR10 with 0.5/0.4 GPU-days is achieved.
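Read literally, the abstract describes a grouped variant of PC-DARTS' partial channel sampling in which each channel group keeps its own architecture weights while an SE-style attention module re-balances the groups. The sketch below is only an illustrative reconstruction under those assumptions, not the authors' released code: names such as GroupedMixedOp, SEAttention, num_groups, and reduction are invented here, and the candidate operations are assumed to be built for channels // num_groups channels and shared across all groups.

```python
# Illustrative sketch only -- NOT the authors' implementation.
# Assumptions: candidate ops operate on `channels // num_groups` channels and are
# shared by all groups; each group has its own architecture parameters (alphas);
# an SE-style attention module re-weights channels to keep the groups balanced.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEAttention(nn.Module):
    """Squeeze-and-Excitation channel attention (Hu et al., 2018)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = F.adaptive_avg_pool2d(x, 1).view(b, c)   # squeeze: global average pool
        w = self.fc(s).view(b, c, 1, 1)              # excitation: per-channel gate
        return x * w                                 # re-scale channels


class GroupedMixedOp(nn.Module):
    """Mixed operation in which every channel group has its own architecture weights."""

    def __init__(self, ops, channels: int, num_groups: int = 4):
        super().__init__()
        assert channels % num_groups == 0
        self.num_groups = num_groups
        self.ops = nn.ModuleList(ops)  # candidate operations, shared by all groups
        # one softmax-normalised alpha vector per channel group
        self.alphas = nn.Parameter(1e-3 * torch.randn(num_groups, len(ops)))
        self.attn = SEAttention(channels)

    def forward(self, x):
        groups = torch.chunk(x, self.num_groups, dim=1)   # split channels into groups
        outs = []
        for g, xg in enumerate(groups):
            w = F.softmax(self.alphas[g], dim=-1)          # per-group operation weights
            outs.append(sum(wi * op(xg) for wi, op in zip(w, self.ops)))
        y = torch.cat(outs, dim=1)                         # restore full channel width
        return self.attn(y)                                # attention balances the groups


# Toy usage: 16 channels, 4 groups, two candidate operations.
if __name__ == "__main__":
    C, G = 16, 4
    ops = [nn.Identity(),
           nn.Conv2d(C // G, C // G, 3, padding=1, bias=False)]
    mixed = GroupedMixedOp(ops, channels=C, num_groups=G)
    print(mixed(torch.randn(2, C, 8, 8)).shape)  # torch.Size([2, 16, 8, 8])
```

Under this reading, the operation weights are what the groups share ("each group of channels shares the defined weights") while the architecture parameters differ per group, so the groups can vote differently on which operation to keep; the SE module then counteracts competition between groups.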
Related papers
- Differentiable Architecture Search with Random Features [80.31916993541513]
Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness but suffers from performance collapse.
In this paper, we make efforts to alleviate the performance collapse problem of DARTS by training only BatchNorm.
arXiv Detail & Related papers (2022-08-18T13:55:27Z) - Partial Connection Based on Channel Attention for Differentiable Neural
Architecture Search [1.1125818448814198]
Differentiable neural architecture search (DARTS) is a gradient-guided search method.
The parameters of some weight-equipped operations may not be trained well in the initial stage.
A partial channel connection based on channel attention for differentiable neural architecture search (ADARTS) is proposed.
arXiv Detail & Related papers (2022-08-01T12:05:55Z) - Searching for Network Width with Bilaterally Coupled Network [75.43658047510334]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
We propose the first open-source width benchmark on macro structures, named Channel-Bench-Macro, for better comparison of width search algorithms.
arXiv Detail & Related papers (2022-03-25T15:32:46Z) - iDARTS: Improving DARTS by Node Normalization and Decorrelation
Discretization [51.489024258966886]
Differentiable ARchiTecture Search (DARTS) uses a continuous relaxation of the network representation and dramatically accelerates Neural Architecture Search (NAS) by almost thousands of times in GPU-days.
However, the searching process of DARTS is unstable, which suffers severe degradation when training epochs become large.
We propose an improved version of DARTS, namely iDARTS, to deal with the two problems.
arXiv Detail & Related papers (2021-08-25T02:23:30Z) - D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at cell-level.
arXiv Detail & Related papers (2021-08-20T09:07:01Z) - Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
arXiv Detail & Related papers (2021-08-02T08:21:44Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - RARTS: An Efficient First-Order Relaxed Architecture Search Method [5.491655566898372]
Differentiable architecture search (DARTS) is an effective method for data-driven neural network design based on solving a bilevel optimization problem.
We formulate a single level alternative and a relaxed architecture search (RARTS) method that utilizes the whole dataset in architecture learning via both data and network splitting.
For the task of searching the topological architecture, i.e., the edges and the operations, RARTS obtains higher accuracy than second-order DARTS on CIFAR-10 with a 60% reduction in computational cost.
arXiv Detail & Related papers (2020-08-10T04:55:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.