BCNet: Searching for Network Width with Bilaterally Coupled Network
- URL: http://arxiv.org/abs/2105.10533v1
- Date: Fri, 21 May 2021 18:54:03 GMT
- Title: BCNet: Searching for Network Width with Bilaterally Coupled Network
- Authors: Xiu Su, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu
- Abstract summary: We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
Our method achieves state-of-the-art or competitive performance compared with other baseline methods.
- Score: 56.14248440683152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Searching for a more compact network width has recently served as an effective
way of channel pruning for the deployment of convolutional neural networks
(CNNs) under hardware constraints. To perform the search, a one-shot
supernet is usually leveraged to efficiently evaluate the performance with
respect to different network widths. However, current methods mainly follow a
unilaterally augmented (UA) principle for the evaluation of each width, which
induces training unfairness among the channels in the supernet. In this
paper, we introduce a new supernet called Bilaterally Coupled Network (BCNet)
to address this issue. In BCNet, each channel is fairly trained and responsible
for the same amount of network widths, thus each network width can be evaluated
more accurately. Besides, we leverage a stochastic complementary strategy for
training the BCNet, and propose a prior initial population sampling method to
boost the performance of the evolutionary search. Extensive experiments on
benchmark CIFAR-10 and ImageNet datasets indicate that our method can achieve
state-of-the-art or competitive performance compared with other baseline methods.
Moreover, our method turns out to further boost the performance of NAS models
by refining their network widths. For example, with the same FLOPs budget, our
obtained EfficientNet-B0 achieves 77.36% Top-1 accuracy on the ImageNet dataset,
surpassing the original setting by 0.48%.
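To make the fairness argument above concrete, the short sketch below counts how often each channel of a single layer is exercised across all candidate widths under the unilaterally augmented (UA) principle versus a bilaterally coupled scheme that evaluates every width with both the leftmost and the rightmost channel slices. The contiguous-slice convention and the toy layer size are illustrative assumptions based on the abstract, not the authors' implementation.

```python
# Minimal sketch (illustrative assumptions, not the authors' code): count how
# many candidate widths each channel of a C-channel layer serves when width k
# is realised as a contiguous slice of channels.

C = 8  # toy number of channels in the layer

def ua_usage(C):
    """Unilaterally augmented (UA): width k always uses the leftmost k channels."""
    counts = [0] * C
    for k in range(1, C + 1):
        for ch in range(k):            # leftmost k channels only
            counts[ch] += 1
    return counts

def bilateral_usage(C):
    """Bilaterally coupled: width k is evaluated with both the leftmost-k
    and the rightmost-k slices, so left and right channels are used equally."""
    counts = [0] * C
    for k in range(1, C + 1):
        for ch in range(k):            # left branch: leftmost k channels
            counts[ch] += 1
        for ch in range(C - k, C):     # right branch: rightmost k channels
            counts[ch] += 1
    return counts

print("UA usage per channel:       ", ua_usage(C))        # [8, 7, 6, 5, 4, 3, 2, 1]
print("Bilateral usage per channel:", bilateral_usage(C)) # [9, 9, 9, 9, 9, 9, 9, 9]
```

Under UA the leftmost channels dominate training while the rightmost ones are rarely updated; coupling the two slices gives every channel the same number of widths to serve, which is the fairness property the abstract attributes to BCNet.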
Related papers
- PSE-Net: Channel Pruning for Convolutional Neural Networks with Parallel-subnets Estimator [16.698190973547362]
We introduce PSE-Net, a novel parallel-subnets estimator for efficient channel pruning.
Our proposed algorithm improves the efficiency of supernet training.
We develop a prior-distribution-based sampling algorithm to boost the performance of classical evolutionary search.
arXiv Detail & Related papers (2024-08-29T03:20:43Z)
- Slimmable Pruned Neural Networks [1.8275108630751844]
The accuracy of each sub-network of S-Net is inferior to that of individually trained networks of the same size.
We propose Slimmable Pruned Neural Networks (SP-Net), whose sub-network structures are learned by pruning.
SP-Net can be combined with any kind of channel pruning method and does not require any complicated processing or time-consuming architecture search as NAS models do.
arXiv Detail & Related papers (2022-12-07T02:54:15Z)
- Searching for Network Width with Bilaterally Coupled Network [75.43658047510334]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
We propose the first open-source width benchmark on macro structures, named Channel-Bench-Macro, for better comparison of width search algorithms.
arXiv Detail & Related papers (2022-03-25T15:32:46Z)
- AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance [9.3421559369389]
We propose a pruning framework that adaptively determines the number of channels in each layer as well as the weight inheritance criteria for the sub-network.
AdaPruner obtains the pruned network quickly, accurately and efficiently.
On ImageNet, we reduce the FLOPs of MobileNetV2 by 32.8% with only a 0.62% decrease in top-1 accuracy, exceeding all previous state-of-the-art channel pruning methods.
arXiv Detail & Related papers (2021-09-14T01:52:05Z)
- Locally Free Weight Sharing for Network Width Search [55.155969155967284]
Searching for network width is an effective way to slim deep neural networks with hardware budgets.
We propose a locally free weight sharing strategy (CafeNet) to better evaluate each width.
Our method can further boost the benchmark NAS network EfficientNet-B0 by 0.41% by searching its width more finely.
arXiv Detail & Related papers (2021-02-10T04:36:09Z)
- Go Wide, Then Narrow: Efficient Training of Deep Thin Networks [62.26044348366186]
We propose an efficient method to train a deep thin network with a theoretical guarantee.
By training with our method, ResNet50 can outperform ResNet101, and BERT Base can be comparable with BERT Large.
arXiv Detail & Related papers (2020-07-01T23:34:35Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to narrow its accuracy gap with real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-07T02:12:02Z)