Joslim: Joint Widths and Weights Optimization for Slimmable Neural
Networks
- URL: http://arxiv.org/abs/2007.11752v4
- Date: Wed, 30 Jun 2021 14:38:29 GMT
- Title: Joslim: Joint Widths and Weights Optimization for Slimmable Neural
Networks
- Authors: Ting-Wu Chin, Ari S. Morcos, Diana Marculescu
- Abstract summary: We propose a general framework to enable joint optimization for both width configurations and weights of slimmable networks.
Our framework subsumes conventional and NAS-based slimmable methods as special cases and provides flexibility to improve over existing methods.
Improvements of up to 1.7% and 8% in top-1 accuracy on the ImageNet dataset can be attained for MobileNetV2 considering FLOPs and memory footprint, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Slimmable neural networks provide a flexible trade-off front between
prediction error and computational requirement (such as the number of
floating-point operations or FLOPs) with the same storage requirement as a
single model. They are useful for reducing maintenance overhead for deploying
models to devices with different memory constraints and are useful for
optimizing the efficiency of a system with many CNNs. However, existing
slimmable network approaches either do not optimize layer-wise widths or
optimize the shared-weights and layer-wise widths independently, thereby
leaving significant room for improvement by joint width and weight
optimization. In this work, we propose a general framework to enable joint
optimization for both width configurations and weights of slimmable networks.
Our framework subsumes conventional and NAS-based slimmable methods as special
cases and provides flexibility to improve over existing methods. From a
practical standpoint, we propose Joslim, an algorithm that jointly optimizes
both the widths and weights for slimmable nets, which outperforms existing
methods for optimizing slimmable networks across various networks, datasets,
and objectives. Quantitatively, improvements up to 1.7% and 8% in top-1
accuracy on the ImageNet dataset can be attained for MobileNetV2 considering
FLOPs and memory footprint, respectively. Our results highlight the potential
of optimizing the channel counts for different layers jointly with the weights
for slimmable networks. Code available at https://github.com/cmu-enyac/Joslim.
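As a rough, hedged illustration of the shared-weights training the abstract describes (not the authors' implementation; the linked repository contains the actual code), the sketch below trains a toy slimmable network by sampling several width multipliers per step and accumulating their gradients into one set of shared weights. The SlimmableLinear module, the sandwich-style width sampling, and all hyperparameters are illustrative assumptions, and the joint per-layer width search that distinguishes Joslim is not modeled here.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class SlimmableLinear(nn.Module):
    """A linear layer whose active input/output widths can be sliced at run time.

    Illustrative stand-in for slimmable convolutions; see the Joslim repository
    (https://github.com/cmu-enyac/Joslim) for the authors' actual implementation.
    """

    def __init__(self, max_in, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, max_in) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, in_width, out_width):
        # Every width configuration reuses the top-left block of the shared weights.
        return F.linear(x, self.weight[:out_width, :in_width], self.bias[:out_width])


class SlimmableMLP(nn.Module):
    def __init__(self, in_dim=32, hidden=64, num_classes=10):
        super().__init__()
        self.fc1 = SlimmableLinear(in_dim, hidden)
        self.fc2 = SlimmableLinear(hidden, num_classes)
        self.in_dim, self.hidden, self.num_classes = in_dim, hidden, num_classes

    def forward(self, x, width_mult):
        h = max(1, int(self.hidden * width_mult))  # active hidden width
        x = F.relu(self.fc1(x, self.in_dim, h))
        return self.fc2(x, h, self.num_classes)


model = SlimmableMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
width_mults = [0.25, 0.5, 0.75, 1.0]  # candidate widths (assumed, not from the paper)

for step in range(100):
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))  # toy data
    opt.zero_grad()
    # Sandwich-style sampling: train the narrowest, the widest, and one random width
    # per step. Joslim additionally searches the per-layer widths jointly with the
    # weights; that search is not modeled in this toy loop.
    for wm in {min(width_mults), max(width_mults), random.choice(width_mults)}:
        loss = F.cross_entropy(model(x, wm), y)
        loss.backward()  # gradients from all sampled widths accumulate in the shared weights
    opt.step()
```

Because every width configuration reads the same leading slice of the shared weight tensors, all configurations live in a single model file, which is the constant-storage property the abstract emphasizes.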
Related papers
- Vertical Layering of Quantized Neural Networks for Heterogeneous
Inference [57.42762335081385]
We study a new vertical-layered representation of neural network weights that encapsulates all quantized models into a single one.
In principle, a network of any precision can be obtained for on-demand service while only one model needs to be trained and maintained.
arXiv Detail & Related papers (2022-12-10T15:57:38Z)
- Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes [22.40937602825472]
State-of-the-art mixed-precision quantization works layer-wise, i.e., it uses different bit-widths for the weight and activation tensors of each network layer.
We propose a novel NAS that selects the bit-width of each weight tensor channel independently.
Our networks reduce the memory and energy cost of inference by up to 63% and 27%, respectively; a minimal sketch of per-channel bit-width assignment appears after this list.
arXiv Detail & Related papers (2022-06-17T15:51:49Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs of diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Efficient Multi-Objective Optimization for Deep Learning [2.0305676256390934]
Multi-objective optimization (MOO) is a prevalent challenge in deep learning, yet no scalable MOO solution exists for truly deep neural networks.
arXiv Detail & Related papers (2021-03-24T17:59:42Z)
- Locally Free Weight Sharing for Network Width Search [55.155969155967284]
Searching for network width is an effective way to slim deep neural networks under hardware budgets.
We propose a locally free weight sharing strategy (CafeNet) to better evaluate each width.
Our method can further boost the benchmark NAS network EfficientNet-B0 by 0.41% by searching its width more finely.
arXiv Detail & Related papers (2021-02-10T04:36:09Z)
- Towards Lossless Binary Convolutional Neural Networks Using Piecewise Approximation [4.023728681102073]
Binary CNNs can significantly reduce the number of arithmetic operations and the size of memory storage.
However, the accuracy degradation of single and multiple binary CNNs is unacceptable for modern architectures.
We propose a Piecewise Approximation scheme for multiple binary CNNs that lessens accuracy loss by approximating full-precision weights and activations; a toy multi-bit binarization sketch appears after this list.
arXiv Detail & Related papers (2020-08-08T13:32:33Z)
- Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9× fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4× fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
- WeightNet: Revisiting the Design Space of Weight Networks [96.48596945711562]
We present a conceptually simple, flexible, and effective framework for weight-generating networks.
Our approach is general in that it unifies two distinct and highly effective modules, SENet and CondConv, into the same framework on the weight space.
arXiv Detail & Related papers (2020-07-23T06:49:01Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
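For the channel-wise mixed-precision entry above, the following is a minimal sketch of the underlying idea: each output channel of a weight tensor gets its own bit-width and is fake-quantized with a symmetric uniform quantizer. The function name, the quantizer, and the example bit-widths are assumptions for illustration; the paper's NAS procedure for choosing the bit-widths is not reproduced.

```python
import torch


def fake_quantize_per_channel(weight, bits_per_channel):
    """Uniformly quantize each output channel of a weight tensor to its own bit-width.

    weight: tensor of shape (out_channels, ...), e.g. a conv weight (O, I, kH, kW).
    bits_per_channel: one bit-width per output channel (e.g. chosen by a NAS controller).
    """
    q = torch.empty_like(weight)
    for c, bits in enumerate(bits_per_channel):
        w = weight[c]
        scale = w.abs().max() / (2 ** (bits - 1) - 1) + 1e-12  # symmetric per-channel scale
        q[c] = torch.clamp(torch.round(w / scale),
                           -(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    return q


# Toy usage: a 4-channel conv weight quantized with per-channel bit-widths.
w = torch.randn(4, 8, 3, 3)
w_q = fake_quantize_per_channel(w, [2, 4, 8, 4])
print((w - w_q).abs().mean())  # mean quantization error
```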
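For the piecewise-approximation entry above, the sketch below greedily fits a sum of scaled binary bases to a full-precision weight tensor, a generic multi-bit binarization scheme meant only to convey why multiple binary terms lessen the accuracy loss of a single binary term. The greedy residual fitting and the choice of bases are assumptions; the cited paper's piecewise scheme may fit its terms differently.

```python
import torch


def multi_binary_approx(w, num_bases=3):
    """Greedily approximate w as sum_i alpha_i * B_i with each B_i in {-1, +1}."""
    residual = w.clone()
    scales, bases = [], []
    for _ in range(num_bases):
        b = torch.sign(residual)
        b[b == 0] = 1.0                # keep the basis strictly binary
        alpha = (residual * b).mean()  # least-squares optimal scale for a +/-1 basis
        scales.append(alpha)
        bases.append(b)
        residual = residual - alpha * b
    approx = sum(a * b for a, b in zip(scales, bases))
    return approx, scales, bases


# Toy usage: the relative error shrinks as more binary bases are added.
w = torch.randn(64, 64, 3, 3)
for k in (1, 2, 3):
    w_hat, _, _ = multi_binary_approx(w, num_bases=k)
    print(k, (torch.norm(w - w_hat) / torch.norm(w)).item())
```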