Low-Precision Training in Logarithmic Number System using Multiplicative
Weight Update
- URL: http://arxiv.org/abs/2106.13914v1
- Date: Sat, 26 Jun 2021 00:32:17 GMT
- Title: Low-Precision Training in Logarithmic Number System using Multiplicative
Weight Update
- Authors: Jiawei Zhao, Steve Dai, Rangharajan Venkatesan, Ming-Yu Liu, Brucek
Khailany, Bill Dally, Anima Anandkumar
- Abstract summary: Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reducing energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam.
- Score: 49.948082497688404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training large-scale deep neural networks (DNNs) currently requires a
significant amount of energy, leading to serious environmental impacts. One
promising approach to reducing these energy costs is to represent DNNs with
low-precision numbers. While it is common to run forward and backward
propagation in low precision, training directly over low-precision weights,
without keeping a high-precision copy of the weights, remains an unsolved
problem. This is due to complex interactions between learning
algorithms and low-precision number systems. To address this, we jointly design
a low-precision training framework involving a logarithmic number system (LNS)
and a multiplicative weight update training method, termed LNS-Madam. LNS has a
high dynamic range even in a low-bitwidth setting, leading to high energy
efficiency and making it relevant for on-board training in energy-constrained
edge devices. We design LNS to have the flexibility of choosing different bases
for weights and gradients, as they usually require different quantization gaps
and dynamic ranges during training. By drawing the connection between LNS and
multiplicative updates, LNS-Madam ensures low quantization error during weight
updates, leading to stable convergence even when the bitwidth is limited.
Compared to using a fixed-point or floating-point number system and training
with popular learning algorithms such as SGD and Adam, our joint design with
LNS and the LNS-Madam optimizer achieves better accuracy while requiring a
smaller bitwidth. Notably, with only 5-bit gradients, the proposed training
framework achieves accuracy comparable to full-precision state-of-the-art
models such as ResNet-50 and BERT. Energy estimates obtained by analyzing the
math datapath units during training show that our design achieves over a 60x
energy reduction compared to FP32 on BERT models.
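To make the two ingredients above concrete, the sketch below illustrates log-domain quantization with a tunable base factor and a Madam-style multiplicative weight update in plain NumPy. The bitwidth, the base factor gamma, and the RMS gradient normalization are illustrative assumptions, not the exact LNS-Madam formulation or datapath.

import numpy as np

def lns_quantize(x, bits=8, gamma=8):
    # Quantize magnitudes onto the grid 2^(k / gamma), where k is a signed
    # (bits)-bit integer code. A larger gamma gives a finer quantization gap
    # but a smaller dynamic range at a fixed bitwidth.
    sign = np.sign(x)
    mag = np.where(np.abs(x) == 0, np.finfo(np.float64).tiny, np.abs(x))  # avoid log2(0)
    code = np.round(np.log2(mag) * gamma)                                 # integer log-domain code
    code = np.clip(code, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return sign * 2.0 ** (code / gamma)

def madam_style_update(w, grad, lr=0.01, eps=1e-8):
    # Multiplicative update: scale each weight by exp(-lr * sign(w) * g_hat),
    # which reduces to an additive step on the exponent in the log domain.
    g_hat = grad / (np.sqrt(np.mean(grad ** 2)) + eps)  # RMS normalization (assumption)
    return w * np.exp(-lr * np.sign(w) * g_hat)

# Toy usage: one training step with the weights kept on the LNS grid.
rng = np.random.default_rng(0)
w = lns_quantize(rng.normal(size=5), bits=8, gamma=8)
g = rng.normal(size=5)
w = lns_quantize(madam_style_update(w, g), bits=8, gamma=8)
print(w)

Passing different gamma values when quantizing weights and gradients reflects the flexibility, described above, of choosing different bases for the two quantities.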
Related papers
- Low Precision Quantization-aware Training in Spiking Neural Networks
with Differentiable Quantization Function [0.5046831208137847]
This work aims to bridge the gap between recent progress in quantized neural networks and spiking neural networks.
It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions.
The presented quantization function demonstrates state-of-the-art performance on four popular benchmarks.
arXiv Detail & Related papers (2023-05-30T09:42:05Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural
Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Edge Inference with Fully Differentiable Quantized Mixed Precision
Neural Networks [1.131071436917293]
Quantizing parameters and operations to lower bit-precision offers substantial memory and energy savings for neural network inference.
This paper proposes a new quantization approach for mixed precision convolutional neural networks (CNNs) targeting edge-computing.
arXiv Detail & Related papers (2022-06-15T18:11:37Z)
- Low-bit Quantization of Recurrent Neural Network Language Models Using
Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM).
Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
- FracTrain: Fractionally Squeezing Bit Savings Both Temporally and
Spatially for Efficient DNN Training [81.85361544720885]
We propose FracTrain, which integrates progressive fractional quantization that gradually increases the precision of activations, weights, and gradients.
FracTrain reduces the computational cost and hardware-quantified energy/latency of DNN training while achieving comparable or better (-0.12% to +1.87%) accuracy.
arXiv Detail & Related papers (2020-12-24T05:24:10Z)
- Dynamic Hard Pruning of Neural Networks at the Edge of the Internet [11.605253906375424]
The Dynamic Hard Pruning (DynHP) technique incrementally prunes the network during training.
DynHP enables a tunable size reduction of the final neural network and reduces the NN memory occupancy during training.
Freed memory is reused by a dynamic batch sizing approach to counterbalance the accuracy degradation caused by the hard pruning strategy.
arXiv Detail & Related papers (2020-11-17T10:23:28Z)
- FSpiNN: An Optimization Framework for Memory- and Energy-Efficient
Spiking Neural Networks [14.916996986290902]
Spiking Neural Networks (SNNs) offer unsupervised learning capability due to the spike-timing-dependent plasticity (STDP) rule.
However, state-of-the-art SNNs require a large memory footprint to achieve high accuracy.
We propose FSpiNN, an optimization framework for obtaining memory- and energy-efficient SNNs for training and inference processing.
arXiv Detail & Related papers (2020-07-17T09:40:26Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.