Dynamic Slimmable Network
- URL: http://arxiv.org/abs/2103.13258v1
- Date: Wed, 24 Mar 2021 15:25:20 GMT
- Title: Dynamic Slimmable Network
- Authors: Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li and
Xiaojun Chang
- Abstract summary: We develop a dynamic network slimming regime named Dynamic Slimmable Network (DS-Net).
Our DS-Net is empowered with the ability of dynamic inference by the proposed double-headed dynamic gate.
It consistently outperforms its static counterparts as well as state-of-the-art static and dynamic model compression methods.
- Score: 105.74546828182834
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Current dynamic networks and dynamic pruning methods have shown their
promising capability in reducing theoretical computation complexity. However,
dynamic sparse patterns on convolutional filters fail to achieve actual
acceleration in real-world implementation, due to the extra burden of indexing,
weight-copying, or zero-masking. Here, we explore a dynamic network slimming
regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good
hardware-efficiency via dynamically adjusting filter numbers of networks at
test time with respect to different inputs, while keeping filters stored
statically and contiguously in hardware to prevent the extra burden. Our DS-Net
is empowered with the ability of dynamic inference by the proposed
double-headed dynamic gate that comprises an attention head and a slimming head
to predictively adjust network width with negligible extra computation cost. To
ensure the generality of each candidate architecture and the fairness of the gate, we
propose a disentangled two-stage training scheme inspired by one-shot NAS. In
the first stage, a novel training technique for weight-sharing networks named
In-place Ensemble Bootstrapping is proposed to improve the supernet training
efficacy. In the second stage, Sandwich Gate Sparsification is proposed to
assist the gate training by identifying easy and hard samples in an online way.
Extensive experiments demonstrate our DS-Net consistently outperforms its
static counterparts as well as state-of-the-art static and dynamic model
compression methods by a large margin (up to 5.9%). Typically, DS-Net achieves
2-4x computation reduction and 1.62x real-world acceleration over ResNet-50 and
MobileNet with minimal accuracy drops on ImageNet. Code release:
https://github.com/changlin31/DS-Net .
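To make the slicing idea concrete, below is a minimal PyTorch-style sketch of a slimmable convolution that only reads its first k output filters (so the weights stay stored statically and contiguously, with no indexing, weight-copying, or zero-masking) and of a double-headed gate with an attention head and a slimming head that picks the width per input. All class and parameter names here (SlimmableConv2d, DoubleHeadedGate, width_choices) are illustrative placeholders, not the released implementation; see the repository linked above for the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlimmableConv2d(nn.Conv2d):
    """Convolution that can run with only its first `out_ch` output filters.

    The full weight tensor is stored once; narrower widths are plain leading
    slices of it, so no indexing, weight-copying, or zero-masking is needed.
    """

    def forward(self, x, out_ch=None):
        out_ch = out_ch or self.out_channels
        weight = self.weight[:out_ch]                        # contiguous view
        bias = self.bias[:out_ch] if self.bias is not None else None
        return F.conv2d(x, weight, bias, self.stride,
                        self.padding, self.dilation, self.groups)


class DoubleHeadedGate(nn.Module):
    """Toy double-headed gate: a slimming head that picks one of a few width
    ratios per input, plus an attention head that re-weights channels."""

    def __init__(self, in_ch, width_choices=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        self.width_choices = width_choices
        hidden = max(in_ch // 16, 8)
        self.fc = nn.Linear(in_ch, hidden)
        self.slim_head = nn.Linear(hidden, len(width_choices))
        self.attn_head = nn.Linear(hidden, in_ch)

    def forward(self, x):
        s = x.mean(dim=(2, 3))                               # global average pool
        h = F.relu(self.fc(s))
        width_idx = self.slim_head(h).argmax(dim=1)          # hard width choice
        attn = torch.sigmoid(self.attn_head(h))              # soft channel attention
        return width_idx, attn


# Per-image dynamic inference (batch of one for simplicity):
conv = SlimmableConv2d(64, 128, kernel_size=3, padding=1)
gate = DoubleHeadedGate(in_ch=64)
x = torch.randn(1, 64, 32, 32)

width_idx, attn = gate(x)
ratio = gate.width_choices[width_idx.item()]
out = conv(x * attn[:, :, None, None], out_ch=int(128 * ratio))
print(out.shape)                                             # e.g. torch.Size([1, 64, 32, 32]) at ratio 0.5
```

Because the active filters are always the leading slice of the stored tensor, changing the width per input is a zero-copy view rather than a sparse gather, which is why the theoretical FLOP savings can translate into the real-world speed-up reported in the abstract.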
Related papers
- Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch [72.26822499434446]
Auto-Train-Once (ATO) is an innovative network pruning algorithm designed to automatically reduce the computational and storage costs of DNNs.
We provide a comprehensive convergence analysis as well as extensive experiments, and the results show that our approach achieves state-of-the-art performance across various model architectures.
arXiv Detail & Related papers (2024-03-21T02:33:37Z) - Dynamic DNNs and Runtime Management for Efficient Inference on
Mobile/Embedded Devices [2.8851756275902476]
Deep neural network (DNN) inference is increasingly being executed on mobile and embedded platforms.
We co-designed novel Dynamic Super-Networks to maximise system-level performance and energy efficiency.
Experimental results with ImageNet on the Jetson Xavier NX GPU show that, compared with the state of the art, our model is 2.4x faster at similar ImageNet Top-1 accuracy, or 5.1% more accurate at similar latency.
arXiv Detail & Related papers (2024-01-17T04:40:30Z) - Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms-spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z) - PAD-Net: An Efficient Framework for Dynamic Networks [72.85480289152719]
Common practice in implementing dynamic networks is to convert the given static layers into fully dynamic ones.
We propose a partially dynamic network, namely PAD-Net, to transform the redundant dynamic parameters into static ones.
Our method is comprehensively supported by large-scale experiments with two typical advanced dynamic architectures.
arXiv Detail & Related papers (2022-11-10T12:42:43Z) - Efficient Sparsely Activated Transformers [0.34410212782758054]
Transformer-based neural networks have achieved state-of-the-art task performance in a number of machine learning domains.
Recent work has explored the integration of dynamic behavior into these networks in the form of mixture-of-expert layers.
We introduce a novel system named PLANER that takes an existing Transformer-based network and a user-defined latency target.
arXiv Detail & Related papers (2022-08-31T00:44:27Z) - Dynamic Slimmable Denoising Network [64.77565006158895]
The dynamic slimmable denoising network (DDS-Net) is a general method for achieving good denoising quality with less computational complexity.
DDS-Net is empowered with the ability of dynamic inference by a dynamic gate.
Experiments demonstrate that DDS-Net consistently outperforms state-of-the-art individually trained static denoising networks.
arXiv Detail & Related papers (2021-10-17T22:45:33Z) - DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and
Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs of differing difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
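The DS-Net++ entry above extends the same slicing idea beyond convolutional filter counts to multiple dimensions, including transformer layers. As a hedged illustration (again with made-up names, not the paper's API), a linear layer can expose leading slices of both its input and output dimensions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SliceableLinear(nn.Linear):
    """Linear layer that can run on leading slices of its in/out dimensions,
    while the full weight matrix stays stored statically."""

    def forward(self, x, in_dim=None, out_dim=None):
        in_dim = in_dim or self.in_features
        out_dim = out_dim or self.out_features
        weight = self.weight[:out_dim, :in_dim]              # view of the stored weight
        bias = self.bias[:out_dim] if self.bias is not None else None
        return F.linear(x[..., :in_dim], weight, bias)


# Example: a transformer-style MLP block executed at half of its hidden width.
fc1 = SliceableLinear(256, 1024)
fc2 = SliceableLinear(1024, 256)
tokens = torch.randn(1, 16, 256)                             # (batch, tokens, embed dim)

hidden = F.gelu(fc1(tokens, out_dim=512))                    # hidden width sliced to 512
out = fc2(hidden, in_dim=512)                                # matching input slice
print(out.shape)                                             # torch.Size([1, 16, 256])
```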