Slimmable Pruned Neural Networks
- URL: http://arxiv.org/abs/2212.03415v1
- Date: Wed, 7 Dec 2022 02:54:15 GMT
- Title: Slimmable Pruned Neural Networks
- Authors: Hideaki Kuratsu and Atsuyoshi Nakamura
- Abstract summary: The accuracy of each sub-network of S-Net is inferior to that of an individually trained network of the same size.
We propose Slimmable Pruned Neural Networks (SP-Net), whose sub-network structures are learned by pruning.
SP-Net can be combined with any kind of channel pruning method and does not require any complicated processing or time-consuming architecture search like NAS models.
- Score: 1.8275108630751844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Slimmable Neural Networks (S-Net) is a novel network that can
dynamically select one of several predefined proportions of channels (a
sub-network) depending on the currently available computational resources.
The accuracy of each sub-network of S-Net, however, is inferior to that of an
individually trained network of the same size because of the difficulty of
optimizing different sub-networks simultaneously. In this paper, we propose
Slimmable Pruned Neural Networks (SP-Net), whose sub-network structures are
learned by pruning instead of adopting structures with the same proportion of
channels in each layer (width multiplier) as in S-Net, and we also propose new
pruning procedures: multi-base pruning instead of one-shot or iterative
pruning to realize high accuracy and large training-time savings. We also
introduce slimmable channel sorting (scs) to achieve computation as fast as
S-Net and zero padding match (zpm) pruning to prune residual structures more
efficiently. SP-Net can be combined with any kind of channel pruning method
and does not require any complicated processing or time-consuming
architecture search like NAS models. Compared with each S-Net sub-network of
the same FLOPs, SP-Net improves accuracy by 1.2-1.5% for ResNet-50, 0.9-4.4%
for VGGNet, 1.3-2.7% for MobileNetV1, and 1.4-3.1% for MobileNetV2 on
ImageNet. Furthermore, our methods outperform other SOTA pruning methods and
are on par with various NAS models according to our experimental results on
ImageNet. The code is available at https://github.com/hideakikuratsu/SP-Net.
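As a rough illustration of the slimmable idea described in the abstract (a minimal sketch only, not the authors' implementation; the class name, importance scores, and width list below are invented for the example), the snippet shows a convolution whose active output-channel count is chosen at run time, plus a channel-sorting step that orders channels by an importance score so that each narrower sub-network is a contiguous prefix slice, the property that slimmable channel sorting (scs) relies on for fast computation.

```python
# Toy slimmable convolution: the active number of output channels is chosen at
# run time, and channels are pre-sorted by importance so every sub-network is a
# contiguous prefix slice. Illustration only, not the SP-Net reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlimmableConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, widths=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.widths = widths
        self.active_width = 1.0  # fraction of output channels currently in use

    def sort_channels(self, importance):
        # Reorder output channels so the most important come first; afterwards a
        # width-w sub-network is simply the first round(w * C) filters.
        order = torch.argsort(importance, descending=True)
        with torch.no_grad():
            self.weight.copy_(self.weight[order])
            self.bias.copy_(self.bias[order])

    def forward(self, x):
        n_out = max(1, round(self.active_width * self.weight.shape[0]))
        return F.conv2d(x, self.weight[:n_out], self.bias[:n_out], padding=1)


if __name__ == "__main__":
    conv = SlimmableConv2d(in_ch=3, out_ch=8)
    conv.sort_channels(torch.rand(8))          # made-up importance scores (e.g. from pruning)
    x = torch.randn(1, 3, 16, 16)
    for w in conv.widths:
        conv.active_width = w                  # pick a sub-network for the current budget
        print(w, tuple(conv(x).shape))         # fewer output channels at smaller widths
```

A full network would also have to slice the input channels of subsequent layers to match; this single-layer sketch only conveys the prefix-slice property.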
Related papers
- Connection Reduction Is All You Need [0.10878040851637998]
Empirical research shows that simply stacking convolutional layers does not make the network train better.
We propose two new algorithms to connect layers.
ShortNet1 has a 5% lower test error rate and 25% faster inference time than Baseline.
arXiv Detail & Related papers (2022-08-02T13:00:35Z) - Lightweight and Progressively-Scalable Networks for Semantic
Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net), which expand the network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z) - Searching for Network Width with Bilaterally Coupled Network [75.43658047510334]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
We propose the first open-source width benchmark on macro structures, named Channel-Bench-Macro, for a better comparison of width search algorithms.
arXiv Detail & Related papers (2022-03-25T15:32:46Z) - DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and
Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
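The input-dependent slicing described in this summary can be pictured with a small hedged sketch (the gate design, pooling, and slice ratios below are assumptions for illustration, not the DS-Net architecture): a tiny gate inspects each input and decides how large a prefix slice of the filters to execute.

```python
# Illustrative sketch of input-dependent weight slicing: a small gate picks a
# filter count per input, and only that prefix slice of the weights runs.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicSliceConv(nn.Module):
    def __init__(self, in_ch, out_ch, slice_ratios=(0.25, 0.5, 1.0)):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        self.slice_ratios = slice_ratios
        # Gate: global-average-pool the input, then score each candidate ratio.
        self.gate = nn.Linear(in_ch, len(slice_ratios))

    def forward(self, x):
        pooled = x.mean(dim=(2, 3))                # (N, in_ch)
        choice = self.gate(pooled).argmax(dim=1)   # hard choice per sample (not differentiable here)
        outputs = []
        for i in range(x.shape[0]):
            ratio = self.slice_ratios[int(choice[i])]
            n_out = max(1, int(ratio * self.weight.shape[0]))
            outputs.append(F.conv2d(x[i:i + 1], self.weight[:n_out], padding=1))
        return outputs  # per-sample feature maps; channel counts differ across samples


if __name__ == "__main__":
    layer = DynamicSliceConv(in_ch=3, out_ch=16)
    feats = layer(torch.randn(4, 3, 32, 32))
    print([f.shape[1] for f in feats])  # number of filters actually computed per input
```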
arXiv Detail & Related papers (2021-09-21T09:57:21Z) - AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance [9.3421559369389]
We propose a pruning framework that adaptively determines the number of channels in each layer as well as the weight inheritance criteria for the sub-network.
AdaPruner allows obtaining the pruned network quickly, accurately and efficiently.
On ImageNet, we reduce the FLOPs of MobileNetV2 by 32.8% with only a 0.62% decrease in top-1 accuracy, exceeding all previous state-of-the-art channel pruning methods.
arXiv Detail & Related papers (2021-09-14T01:52:05Z) - Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structure, including those with coupled channels.
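A minimal sketch of one common first-order, Fisher-style channel importance estimate in the spirit of this summary (the toy model, loss, and exact formula are placeholders and may differ from the paper's metric): accumulate the product of a channel's activation and the gradient of the loss with respect to it, then square and average over the batch.

```python
# Sketch of a first-order (Fisher-like) channel importance estimate:
# accumulate (activation * gradient) per channel via a forward hook.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 4, 3, padding=1))
target_layer = model[0]
acts = {}

def save_act(module, inputs, output):
    output.retain_grad()          # keep the gradient of this intermediate activation
    acts["a"] = output

handle = target_layer.register_forward_hook(save_act)

x = torch.randn(16, 3, 32, 32)
loss = model(x).pow(2).mean()     # placeholder loss
loss.backward()

a, g = acts["a"], acts["a"].grad  # activation and its gradient, both (N, C, H, W)
# Per-sample, per-channel first-order term, squared and averaged over the batch.
importance = (a * g).sum(dim=(2, 3)).pow(2).mean(dim=0)
print(importance)                 # one score per channel; low scores are pruning candidates

handle.remove()
```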
arXiv Detail & Related papers (2021-08-02T08:21:44Z) - BCNet: Searching for Network Width with Bilaterally Coupled Network [56.14248440683152]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
Our method achieves state-of-the-art or competing performance over other baseline methods.
arXiv Detail & Related papers (2021-05-21T18:54:03Z) - BWCP: Probabilistic Learning-to-Prune Channels for ConvNets via Batch
Whitening [63.081808698068365]
This work presents a probabilistic channel pruning method to accelerate Convolutional Neural Networks (CNNs).
Previous pruning methods often zero out unimportant channels during training in a deterministic manner, which reduces the CNN's learning capacity and results in suboptimal performance.
We develop a probability-based pruning algorithm, called batch whitening channel pruning (BWCP), which can stochastically discard unimportant channels by modeling the probability of a channel being activated.
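A rough sketch of the activation-probability idea in this summary (the actual BWCP algorithm uses batch whitening and a learned formulation; the estimate and threshold below are simplifications invented for the example): for each channel, estimate how often its post-BatchNorm response is positive across a batch, and treat rarely activated channels as pruning candidates.

```python
# Simplified sketch: estimate, per channel, the probability that the
# post-BatchNorm response is positive over a batch, and flag channels with a
# low activation probability as pruning candidates. Not the BWCP algorithm.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, 3, padding=1)
bn = nn.BatchNorm2d(8)

x = torch.randn(32, 3, 16, 16)
with torch.no_grad():
    z = bn(conv(x))                                  # (N, C, H, W) pre-ReLU responses
    p_active = (z > 0).float().mean(dim=(0, 2, 3))   # activation probability per channel

threshold = 0.05                                     # assumed cut-off for the example
prune_mask = p_active < threshold
print(p_active)
print("channels to consider pruning:", prune_mask.nonzero().flatten().tolist())
```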
arXiv Detail & Related papers (2021-05-13T17:00:05Z) - Knapsack Pruning with Inner Distillation [11.04321604965426]
We propose a novel pruning method that optimizes the final accuracy of the pruned network.
We prune the network channels while maintaining the high-level structure of the network.
Our method leads to state-of-the-art pruning results on ImageNet, CIFAR-10 and CIFAR-100 using ResNet backbones.
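The knapsack framing named in this entry's title can be illustrated with a small sketch (the importance values, FLOPs costs, and budget are made up, and the paper's formulation also accounts for network structure, which this ignores): keep the subset of channels that maximizes total importance while the total FLOPs stay under a budget, solved here with a standard dynamic-programming 0/1 knapsack over integer costs.

```python
# Toy 0/1 knapsack over channels: keep the channels with maximum total
# importance subject to a FLOPs budget. All numbers are invented for the example.
def knapsack_keep(importance, flops_cost, budget):
    n = len(importance)
    # dp[b] = (best total importance, kept channel indices) achievable with budget b
    dp = [(0.0, [])] * (budget + 1)
    for i in range(n):
        new_dp = list(dp)
        for b in range(flops_cost[i], budget + 1):
            cand = dp[b - flops_cost[i]][0] + importance[i]
            if cand > new_dp[b][0]:
                new_dp[b] = (cand, dp[b - flops_cost[i]][1] + [i])
        dp = new_dp
    return max(dp, key=lambda t: t[0])


if __name__ == "__main__":
    importance = [0.9, 0.1, 0.5, 0.7, 0.2]   # per-channel importance (made up)
    flops_cost = [4, 4, 4, 2, 2]             # per-channel FLOPs in some unit (made up)
    best_score, kept = knapsack_keep(importance, flops_cost, budget=8)
    print("kept channels:", kept, "total importance:", best_score)
```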
arXiv Detail & Related papers (2020-02-19T16:04:48Z)