Iterative Training: Finding Binary Weight Deep Neural Networks with
Layer Binarization
- URL: http://arxiv.org/abs/2111.07046v1
- Date: Sat, 13 Nov 2021 05:36:51 GMT
- Title: Iterative Training: Finding Binary Weight Deep Neural Networks with
Layer Binarization
- Authors: Cheng-Chou Lan
- Abstract summary: In low-latency or mobile applications, lower computation complexity, lower memory footprint and better energy efficiency are desired.
Recent work in weight binarization replaces weight-input matrix multiplication with additions.
We show empirically that starting from partially binary weights, rather than fully binary ones, leads to fully binary weight networks with better accuracy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In low-latency or mobile applications, lower computation complexity, lower
memory footprint and better energy efficiency are desired. Many prior works
address this need by removing redundant parameters. Parameter quantization
replaces floating-point arithmetic with lower precision fixed-point arithmetic,
further reducing complexity.
Typical training of quantized weight neural networks starts from fully
quantized weights. Quantization introduces random noise into the weights. To
compensate for this noise during training, we propose to quantize some weights
while keeping others in floating-point precision. A deep neural network has
many layers; to arrive at a fully quantized weight network, we start from one
quantized layer and progressively quantize more layers. We show that the order
of layer quantization affects accuracy. Because the number of possible orders
grows rapidly with depth, a sensitivity pre-training is proposed to guide the
choice of layer quantization order.
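As a minimal sketch of this iterative schedule (helper names such as quantize, sensitivity_ranking, loss_fn and fine_tune are hypothetical stand-ins, not the paper's code), the sensitivity-guided, layer-by-layer procedure could look like this:

```python
# Minimal sketch of the iterative schedule: quantize one layer, fine-tune the
# rest, then quantize the next layer, guided by a sensitivity ranking.
# `loss_fn` and `fine_tune` are hypothetical callbacks, not the paper's code.
import numpy as np

def quantize(w, bits=1):
    """Uniform symmetric quantization; bits=1 gives sign(w) * mean|w| binarization."""
    if bits == 1:
        return np.sign(w) * np.abs(w).mean()
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def sensitivity_ranking(weights, loss_fn):
    """Sensitivity pre-training, simplified: quantize one layer at a time and
    record how much the loss degrades; least sensitive layers are quantized first."""
    scores = {name: loss_fn({**weights, name: quantize(w)}) - loss_fn(weights)
              for name, w in weights.items()}
    return sorted(scores, key=scores.get)

def iterative_quantization(weights, loss_fn, fine_tune):
    """Start from one quantized layer and quantize more and more layers,
    fine-tuning the still-floating layers after every step."""
    frozen = set()
    for name in sensitivity_ranking(weights, loss_fn):
        weights[name] = quantize(weights[name])
        frozen.add(name)
        weights = fine_tune(weights, trainable=[n for n in weights if n not in frozen])
    return weights
```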
Recent work in weight binarization replaces weight-input matrix
multiplication with additions. We apply the proposed iterative training to
weight binarization. Our experiments cover fully connected and convolutional
networks on MNIST, CIFAR-10 and ImageNet datasets. We show empirically that,
starting from partially binary weights rather than fully binary ones, training
reaches fully binary weight networks with better accuracy for larger and deeper
networks. Binarizing layers in the forward order yields better accuracy, and
guided layer binarization can improve it further. These improvements come at
the cost of longer training time.
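For the binarization case, a common recipe is to replace a layer's floating-point weights with their sign scaled by a per-layer factor and to train through the discretization with a straight-through estimator. The sketch below illustrates that generic recipe, not necessarily the paper's exact formulation; the `binarized` flag is a hypothetical switch for turning layers binary one at a time.

```python
# Binary-weight layer trained with a straight-through estimator: weights are
# constrained to {-alpha, +alpha}, so the matrix product reduces to additions
# and subtractions scaled by one per-layer factor. This is the common
# BinaryConnect-style recipe, not necessarily the paper's exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        alpha = w.abs().mean()           # per-layer scaling factor
        return torch.sign(w) * alpha     # weights take only two values

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # straight-through with clipping

class BinaryLinear(nn.Linear):
    """Linear layer whose weights are binarized only once `binarized` is set,
    so layers can be switched to binary one at a time during training."""
    def __init__(self, in_features, out_features, binarized=False, **kw):
        super().__init__(in_features, out_features, **kw)
        self.binarized = binarized

    def forward(self, x):
        w = BinarizeSTE.apply(self.weight) if self.binarized else self.weight
        return F.linear(x, w, self.bias)
```

Flipping `binarized` from the first layer toward the last would correspond to the forward binarization order discussed above.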
Related papers
- Post-Training Quantization for Re-parameterization via Coarse & Fine
Weight Splitting [13.270381125055275]
We propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights.
We develop an improved KL metric to determine optimal quantization scales for activation.
For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss.
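The entry above only names the idea; as an illustration of what splitting a weight into a coarse quantized part plus a quantized residual can look like (an assumption for illustration, not the paper's actual algorithm):

```python
# Illustration only (not the paper's actual algorithm): approximate each weight
# as a coarsely quantized part plus a finely quantized residual,
# w ~= q_coarse(w) + q_fine(w - q_coarse(w)).
import numpy as np

def uniform_quant(x, bits):
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1) + 1e-12
    return np.round(x / scale) * scale

def coarse_fine_split(w, coarse_bits=4, fine_bits=4):
    coarse = uniform_quant(w, coarse_bits)
    fine = uniform_quant(w - coarse, fine_bits)   # residual carries the detail
    return coarse, fine

w = np.random.randn(256, 256).astype(np.float32)
coarse, fine = coarse_fine_split(w)
print("mean reconstruction error:", np.abs(w - (coarse + fine)).mean())
```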
arXiv Detail & Related papers (2023-12-17T02:31:20Z)
- Weight Compander: A Simple Weight Reparameterization for Regularization [5.744133015573047]
We introduce weight compander, a novel effective method to improve generalization of deep neural networks.
We show experimentally that using weight compander in addition to standard regularization methods improves the performance of neural networks.
arXiv Detail & Related papers (2023-06-29T14:52:04Z)
- Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training [59.48691524227352]
Training a sparse neural network from scratch requires optimizing the connections at the same time as the weights themselves.
While the connections per layer are optimized multiple times during training, the density of each layer typically remains constant.
We propose Global Gradient-based Redistribution, a technique which distributes weights across all layers - adding more weights to the layers that need them most.
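A rough sketch of the general idea, giving layers with a larger gradient signal a larger share of a global weight budget (illustrative only; the paper's exact redistribution rule may differ):

```python
# Illustration of gradient-based density redistribution in dynamic sparse
# training: layers whose weights receive a larger gradient signal get a larger
# share of the global weight budget. Layer names and numbers are made up.
import numpy as np

def redistribute_density(grad_norms, layer_sizes, global_density=0.1):
    """grad_norms and layer_sizes are dicts keyed by layer name."""
    budget = global_density * sum(layer_sizes.values())      # total weights to keep
    total_signal = sum(grad_norms.values()) + 1e-12
    return {name: min(1.0, (grad_norms[name] / total_signal) * budget / size)
            for name, size in layer_sizes.items()}

layers = {"conv1": 9408, "conv2": 36864, "fc": 512000}
grads = {"conv1": 3.2, "conv2": 1.1, "fc": 0.4}
print(redistribute_density(grads, layers))
```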
arXiv Detail & Related papers (2022-10-25T13:32:09Z)
- BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
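For reference, the standard QAT building block is a fake quantizer that discretizes values in the forward pass while letting gradients pass straight through; the sketch below shows that generic recipe, not BiTAT's task-dependent aggregated transformation:

```python
# Generic QAT building block: a "fake quantizer" that discretizes values in the
# forward pass but lets gradients flow through unchanged (straight-through).
# This is the standard recipe, not BiTAT's aggregated transformation.
import torch

def fake_quant(x, bits):
    qmax = 2 ** (bits - 1) - 1 if bits > 1 else 1
    scale = x.detach().abs().max() / qmax + 1e-12
    q = torch.clamp(torch.round(x / scale), -qmax, qmax) * scale
    return x + (q - x).detach()   # quantized value forward, identity gradient

x = torch.randn(4, 8, requires_grad=True)
fake_quant(x, bits=1).sum().backward()
print(x.grad.unique())            # tensor([1.]): gradient passed straight through
```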
arXiv Detail & Related papers (2022-07-04T13:25:49Z)
- Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
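One possible reading of the bit-drop idea, shown purely as an assumption rather than the paper's actual mechanism, is to randomly reduce the quantizer's bit-width on each forward pass, the way dropout randomly removes neurons:

```python
# One possible reading of bit-drop, shown as an assumption: on each pass,
# randomly give up some bits of precision before quantizing, the way dropout
# randomly removes neurons.
import numpy as np

def quantize_bits(w, bits):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax + 1e-12
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def bit_drop(w, max_bits=4, drop_prob=0.5, rng=np.random):
    dropped = rng.binomial(max_bits - 2, drop_prob)   # keep at least 2 bits
    return quantize_bits(w, max_bits - dropped)

w = np.random.randn(128, 128)
q = bit_drop(w)
print(np.unique(q).size)   # number of distinct levels actually used this pass
```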
arXiv Detail & Related papers (2021-09-05T15:15:07Z)
- $S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks [41.54155265996312]
Shift neural networks reduce complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values.
Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks outperform their full-precision counterparts in terms of top-1 accuracy on ImageNet.
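The core of a shift network is that each weight is a signed power of two, so multiplication becomes a bit shift. A minimal illustration of that constraint (the general idea only, not the $S^3$ reparametrization itself):

```python
# Minimal illustration of the shift-network constraint: every nonzero weight is
# a signed power of two, so multiplying by it is a bit shift. This is the
# general idea, not the S^3 reparametrization.
import numpy as np

def to_power_of_two(w, min_exp=-7, max_exp=0):
    sign = np.sign(w)
    exp = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), min_exp, max_exp)
    return sign * (2.0 ** exp)

w = np.array([0.3, -0.07, 0.9, -1.4])
print(to_power_of_two(w))   # [ 0.25   -0.0625  1.     -1.    ]
```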
arXiv Detail & Related papers (2021-07-07T19:33:02Z)
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
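The pairing is natural because a multiplicative weight update only adds to the exponent that an LNS already stores. The sketch below uses a generic sign-based multiplicative rule as an assumption; it is not necessarily LNS-Madam's exact update:

```python
# Why a multiplicative update fits a logarithmic number system: if a weight is
# stored as sign * 2**exponent, multiplying it by 2**(+/-eta) is just an
# addition on the exponent. The sign-based rule below is a generic
# illustration, not necessarily LNS-Madam's exact update.
import numpy as np

def multiplicative_update(exponents, signs, grads, eta=0.01):
    """Weights are w = signs * 2**exponents; only the exponents are updated."""
    direction = np.sign(grads * signs)  # +1: shrinking |w| lowers the loss; -1: growing |w| does
    return exponents - eta * direction  # the whole update is an addition on the exponent

signs = np.array([1.0, -1.0, 1.0])
exponents = np.array([-2.0, -4.0, -1.0])    # w = [0.25, -0.0625, 0.5]
grads = np.array([0.3, -0.2, -0.5])
exponents = multiplicative_update(exponents, signs, grads)
print(signs * 2.0 ** exponents)
```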
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
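Treating channels separately usually means one quantization scale per channel instead of one per tensor; below is a small, generic comparison of the two (not this paper's specific technique):

```python
# Per-tensor vs. per-channel quantization scales, as a generic illustration of
# why treating all channels equally can hurt (not this paper's specific method).
import numpy as np

def quant(x, scale, qmax=127):
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

x = np.random.randn(8, 16) * np.linspace(0.1, 5.0, 16)   # channels with very different ranges

per_tensor = np.abs(x).max() / 127                  # one scale for everything
per_channel = np.abs(x).max(axis=0) / 127           # one scale per channel

print(np.abs(x - quant(x, per_tensor)).mean(),
      np.abs(x - quant(x, per_channel)).mean())     # per-channel error is lower
```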
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- A Greedy Algorithm for Quantizing Neural Networks [4.683806391173103]
We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks.
Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required.
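As a hedged illustration of a greedy, deterministic, retraining-free pass, the sketch below rounds a pretrained layer's weights one at a time while carrying the accumulated rounding error forward; the paper's actual algorithm may differ:

```python
# Greedy, deterministic, retraining-free pass over a pretrained layer: round
# each weight to the grid while feeding the rounding error into the next
# weight. This sketches the general greedy/error-feedback idea, not the
# paper's exact algorithm.
import numpy as np

def greedy_quantize(w, step=0.05):
    q = np.empty_like(w)
    carry = 0.0
    for i, wi in enumerate(np.ravel(w)):
        target = wi + carry                  # compensate the error so far
        q.flat[i] = np.round(target / step) * step
        carry = target - q.flat[i]
    return q

w = np.random.randn(64) * 0.2
q = greedy_quantize(w)
print(np.abs(w - q).mean(), abs(w.sum() - q.sum()))   # per-weight and total error stay small
```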
arXiv Detail & Related papers (2020-10-29T22:53:10Z)
- MimicNorm: Weight Mean and Last BN Layer Mimic the Dynamic of Batch Normalization [60.36100335878855]
We propose a novel normalization method, named MimicNorm, to improve the convergence and efficiency in network training.
We leverage neural tangent kernel (NTK) theory to prove that our weight mean operation whitens activations and transitions the network into the chaotic regime, like a BN layer.
MimicNorm achieves similar accuracy for various network structures, including ResNets and lightweight networks like ShuffleNet, with about a 20% reduction in memory consumption.
arXiv Detail & Related papers (2020-10-19T07:42:41Z)
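A minimal sketch of a weight-mean (filter-centering) convolution of the kind described above; module and layer names are illustrative, not the authors' code:

```python
# Weight-mean (filter-centering) convolution: each output filter is shifted to
# zero mean before the convolution. Names are illustrative, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeanCenteredConv2d(nn.Conv2d):
    def forward(self, x):
        w = self.weight - self.weight.mean(dim=(1, 2, 3), keepdim=True)
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

conv = MeanCenteredConv2d(3, 16, kernel_size=3, padding=1, bias=False)
print(conv(torch.randn(2, 3, 32, 32)).shape)   # torch.Size([2, 16, 32, 32])
```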