$S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of
Low-bit Shift Networks
- URL: http://arxiv.org/abs/2107.03453v1
- Date: Wed, 7 Jul 2021 19:33:02 GMT
- Title: $S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of
Low-bit Shift Networks
- Authors: Xinlin Li, Bang Liu, Yaoliang Yu, Wulong Liu, Chunjing Xu, Vahid
Partovi Nia
- Abstract summary: Shift neural networks reduce complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values.
Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks outperform their full-precision counterparts in terms of top-1 accuracy on ImageNet.
- Score: 41.54155265996312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shift neural networks reduce computation complexity by removing expensive
multiplication operations and quantizing continuous weights into low-bit
discrete values, which are fast and energy efficient compared to conventional
neural networks. However, existing shift networks are sensitive to weight
initialization and suffer degraded performance caused by the vanishing
gradient and weight sign freezing problems. To address these issues, we
propose $S^3$ re-parameterization, a novel technique for training low-bit shift
networks. Our method decomposes a discrete parameter in a sign-sparse-shift
3-fold manner. In this way, it efficiently learns a low-bit network with
weight dynamics similar to those of full-precision networks, and it is
insensitive to weight initialization. Our proposed training method pushes the
boundaries of shift neural networks and shows that 3-bit shift networks
outperform their full-precision counterparts in terms of top-1 accuracy on
ImageNet.
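The decomposition is compact enough to sketch. Below is a minimal PyTorch illustration of the sign-sparse-shift idea, assuming straight-through gradients for each binarized factor; the names (heaviside_ste, s3_weight) and the 3-bit shift range are our illustrative choices, not the paper's exact formulation.

    import torch

    def heaviside_ste(x):
        # Forward: binarize to {0, 1}; backward: identity (straight-through).
        hard = (x > 0).float()
        return hard + x - x.detach()

    def s3_weight(w_sign, w_sparse, w_shift):
        """Compose a discrete shift weight from continuous latents:
        w = sign * sparsity * 2^p, so every multiply becomes a bit-shift."""
        sign = 2.0 * heaviside_ste(w_sign) - 1.0    # {-1, +1}
        sparse = heaviside_ste(w_sparse)            # {0, 1}: prunes the weight
        p = sum(heaviside_ste(b) for b in w_shift)  # integer exponent 0..len(w_shift)
        return sign * sparse * torch.pow(2.0, p)

    # Latent (continuous) parameters for a toy layer of 4 weights.
    w_sign = torch.randn(4, requires_grad=True)
    w_sparse = torch.randn(4, requires_grad=True)
    w_shift = [torch.randn(4, requires_grad=True) for _ in range(3)]

    w = s3_weight(w_sign, w_sparse, w_shift)
    w.sum().backward()   # gradients reach every latent through the STE
    print(w)             # entries drawn from {0, ±1, ±2, ±4, ±8}

Because each factor stays continuous during training, the weight dynamics resemble those of a full-precision network even though the composed weight is always a signed power of two.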
Related papers
- Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
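For context, a tiny sketch of the network class under study (shapes and names are ours): threshold (step) activations are piecewise constant, so plain gradients carry no signal through the first layer, which is what the convex reformulation over activation patterns sidesteps.

    import numpy as np

    def threshold_net(X, W, v):
        # Two-layer network with threshold (step) activations:
        #   f(x) = sum_j v_j * 1{w_j . x >= 0}.
        # The activation is piecewise constant, so the gradient w.r.t. W is
        # zero almost everywhere; the convex reformulation optimizes over
        # the finite set of activation patterns (X @ W >= 0) instead.
        patterns = (X @ W >= 0).astype(float)   # n x m binary pattern matrix
        return patterns @ v

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 2))     # 5 points in R^2 (illustrative)
    W = rng.normal(size=(2, 3))     # 3 hidden threshold units
    v = rng.normal(size=3)
    print(threshold_net(X, W, v))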
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two
Quantization [27.231327287238102]
We propose the DenseShift network, which significantly improves the accuracy of Shift networks.
Our experiments on various computer vision and speech tasks demonstrate that DenseShift outperforms existing low-bit multiplication-free networks.
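As a reference point, here is a generic power-of-two quantizer in the spirit of shift networks (the exponent range is our illustrative choice, not DenseShift's exact scheme); multiplying by such weights reduces to bit-shifts.

    import numpy as np

    def pow2_quantize(w, p_min=-4, p_max=0):
        # Generic power-of-two quantizer: w ~ sign(w) * 2^p,
        # with p rounded and clipped to [p_min, p_max].
        sign = np.sign(w)
        p = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), p_min, p_max)
        return sign * np.exp2(p)

    w = np.array([0.9, -0.3, 0.07, -0.001])
    print(pow2_quantize(w))   # -> [ 1.  -0.25  0.0625 -0.0625]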
arXiv Detail & Related papers (2022-08-20T15:17:40Z)
- BiTAT: Neural Network Binarization with Task-dependent Aggregated
Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weights and 1-bit activations) of compactly designed backbone architectures results in severe performance degradation.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate this degradation.
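As background, a minimal baseline for 1-bit quantization-aware training with a straight-through estimator (XNOR-Net-style mean-abs scaling); this is a generic sketch, not BiTAT's task-dependent aggregated transformation.

    import torch

    class BinarizeSTE(torch.autograd.Function):
        # Generic 1-bit weight quantizer with a straight-through gradient.
        @staticmethod
        def forward(ctx, w):
            ctx.save_for_backward(w)
            return w.sign() * w.abs().mean()           # binary weights, one scale
        @staticmethod
        def backward(ctx, grad_out):
            (w,) = ctx.saved_tensors
            return grad_out * (w.abs() <= 1).float()   # clip outside [-1, 1]

    w = torch.randn(8, requires_grad=True)
    w_q = BinarizeSTE.apply(w)
    w_q.sum().backward()
    print(w_q)
    print(w.grad)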
arXiv Detail & Related papers (2022-07-04T13:25:49Z)
- Hessian Aware Quantization of Spiking Neural Networks [1.90365714903665]
Neuromorphic architectures allow massively parallel computation with variable and local bit-precisions.
Current gradient-based methods of SNN training use a complex neuron model with multiple state variables.
We present a simplified neuron model that reduces the number of state variables fourfold while remaining compatible with gradient-based training.
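For orientation, here is a leaky integrate-and-fire step with a single state variable, the membrane potential; a generic reduced-state illustration, not the paper's exact neuron model.

    import numpy as np

    def lif_step(v, x, tau=2.0, v_th=1.0):
        # One step of a leaky integrate-and-fire neuron whose only state
        # variable is the membrane potential v.
        v = v + (x - v) / tau              # leaky integration of input x
        spike = (v >= v_th).astype(float)  # emit a spike at threshold
        v = v * (1.0 - spike)              # hard reset after spiking
        return v, spike

    v = np.zeros(3)
    x = np.array([0.6, 1.2, 0.1])          # constant input currents
    for t in range(5):
        v, s = lif_step(v, x)
        print(t, v.round(3), s)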
arXiv Detail & Related papers (2021-04-29T05:27:34Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep
Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
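For contrast, a minimal version of the conventional starting point the paper refers to: uniform quantization applied directly to full-precision weights (bitwidth and scaling rule are our illustrative choices).

    import numpy as np

    def uniform_quantize(w, n_bits=2):
        # Map full-precision weights onto a uniform low-bit grid.
        q_max = 2 ** (n_bits - 1) - 1          # e.g. 2 bits -> levels -2..1
        scale = np.abs(w).max() / max(q_max, 1)
        return np.clip(np.round(w / scale), -q_max - 1, q_max) * scale

    w = np.array([0.8, -0.05, 0.3, -0.76])
    print(uniform_quantize(w, n_bits=2))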
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Scalable Verification of Quantized Neural Networks (Technical Report) [14.04927063847749]
We show that verifying bit-exact implementations of quantized neural networks against bit-vector specifications is PSPACE-hard.
We propose three techniques for making SMT-based verification of quantized neural networks more scalable.
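A toy flavor of bit-exact SMT verification using Z3 bit-vectors, with our own hypothetical neuron and property; the paper's encodings and benchmarks are far more involved.

    from z3 import BitVec, If, Solver, sat   # pip install z3-solver

    # Toy property (ours): can an 8-bit quantized ReLU neuron
    # y = relu(3*x + 5), evaluated with wrapping 8-bit arithmetic,
    # exceed 100 for any input 0 <= x <= 10?
    x = BitVec('x', 8)
    acc = 3 * x + 5                  # bit-exact: wraps modulo 2^8 on overflow
    y = If(acc > 0, acc, 0)          # ReLU over the signed bit-vector

    s = Solver()
    s.add(x >= 0, x <= 10, y > 100)  # search for a violating input
    if s.check() == sat:
        print('counterexample:', s.model())
    else:
        print('property holds under bit-exact 8-bit semantics')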
arXiv Detail & Related papers (2020-12-15T10:05:37Z)
- LOss-Based SensiTivity rEgulaRization: towards deep sparse neural
networks [15.373764014931792]
LOss-Based SensiTivity rEgulaRization is a method for training neural networks with a sparse topology.
Our method allows training a network from scratch, i.e. without preliminary learning or rewinding.
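One plausible reading of loss-based sensitivity regularization, sketched as a single update step: weights to which the loss is insensitive are pulled toward zero. The normalization and coefficients here are our assumptions, not the paper's exact rule.

    import torch

    def sensitivity_step(params, loss, lr=0.1, lam=0.05):
        # A gradient step plus a shrinkage term that is strongest where
        # the loss is least sensitive to the weight (illustrative rule).
        grads = torch.autograd.grad(loss, params)
        for w, g in zip(params, grads):
            sens = g.abs() / (g.abs().max() + 1e-12)        # in [0, 1]
            w.data -= lr * g + lam * (1.0 - sens) * w.data  # insensitive -> 0

    w = torch.randn(5, requires_grad=True)
    x = torch.randn(5)
    loss = ((w * x).sum() - 1.0) ** 2
    sensitivity_step([w], loss)
    print(w)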
arXiv Detail & Related papers (2020-11-16T18:55:34Z)
- Improve Generalization and Robustness of Neural Networks via Weight
Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
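The scale-shifting issue is easy to verify numerically: rescaling consecutive layers of a ReLU network by (alpha, 1/alpha) leaves the function unchanged but changes the weight decay penalty arbitrarily.

    import torch

    torch.manual_seed(0)
    W1, W2 = torch.randn(4, 3), torch.randn(2, 4)
    x = torch.randn(3)
    alpha = 10.0

    # ReLU is positively homogeneous, so scaling layers by (alpha, 1/alpha)
    # leaves the network function unchanged...
    f = W2 @ torch.relu(W1 @ x)
    f_scaled = (W2 / alpha) @ torch.relu((alpha * W1) @ x)
    print(torch.allclose(f, f_scaled))   # True

    # ...while the weight decay penalty changes drastically:
    print(float((W1**2).sum() + (W2**2).sum()))
    print(float(((alpha * W1)**2).sum() + ((W2 / alpha)**2).sum()))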
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
- WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic [57.07483440807549]
We propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts.
We demonstrate the efficacy of our approach on both software and hardware platforms.
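The accumulator problem is simple to reproduce: partial sums kept in an 8-bit register wrap around modulo 2^8, so a low-resolution accumulation can disagree with a 32-bit one. The demo below only shows the mismatch; adapting the network to tolerate it is the paper's contribution.

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.integers(-8, 8, size=64).astype(np.int8)
    x = rng.integers(-8, 8, size=64).astype(np.int8)

    prods = w * x                                # each product fits in int8
    acc8 = np.cumsum(prods, dtype=np.int8)[-1]   # 8-bit accumulator, wraps
    acc32 = np.dot(w.astype(np.int32), x.astype(np.int32))
    print(int(acc8), int(acc32))                 # typically differ
    print((int(acc8) - acc32) % 256 == 0)        # but agree modulo 2^8: True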
arXiv Detail & Related papers (2020-07-26T23:18:38Z)
- Mixed-Precision Quantized Neural Network with Progressively Decreasing
Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
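A toy sketch of the progressive-bitwidth idea, with our own illustrative schedule and a generic uniform quantizer (not the paper's exact assignment): early layers keep more bits, later layers fewer.

    import numpy as np

    def quantize(w, n_bits):
        # Uniform symmetric per-layer quantizer (illustrative).
        q_max = 2 ** (n_bits - 1) - 1
        scale = np.abs(w).max() / q_max
        return np.round(w / scale).clip(-q_max, q_max) * scale

    rng = np.random.default_rng(0)
    layers = [rng.normal(size=(16, 16)) for _ in range(4)]
    bitwidths = [8, 6, 4, 2]   # toy schedule: fewer bits for later layers
    for b, W in zip(bitwidths, layers):
        err = np.abs(W - quantize(W, b)).mean()
        print(f'{b} bits: mean abs quantization error {err:.4f}')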
arXiv Detail & Related papers (2019-12-29T14:11:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.