Ultra-low Precision Multiplication-free Training for Deep Neural
Networks
- URL: http://arxiv.org/abs/2302.14458v1
- Date: Tue, 28 Feb 2023 10:05:45 GMT
- Title: Ultra-low Precision Multiplication-free Training for Deep Neural
Networks
- Authors: Chang Liu, Rui Zhang, Xishan Zhang, Yifan Hao, Zidong Du, Xing Hu,
Ling Li, Qi Guo
- Abstract summary: In training, the linear layers consume the most energy because of the intense use of energy-consuming full-precision multiplication.
We propose an Adaptive Layer-wise Scaling PoT Quantization (ALS-POTQ) method and a Multiplication-Free MAC (MF-MAC) to replace all of the FP32 multiplications.
In our training scheme, none of the above methods introduces extra multiplications, so we reduce the energy consumption of linear layers during training by up to 95.8%.
- Score: 20.647925576138807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The training for deep neural networks (DNNs) demands immense energy
consumption, which restricts the development of deep learning as well as
increases carbon emissions. Thus, the study of energy-efficient training for
DNNs is essential. In training, the linear layers consume the most energy
because of the intense use of energy-consuming full-precision (FP32)
multiplication in multiply-accumulate (MAC) operations. Prior energy-efficient
works try to decrease the precision of the multiplication or to replace it with
cheaper operations such as addition or bitwise shift, so as to reduce the energy
consumption of FP32 multiplications. However, these works cannot replace all of
the FP32 multiplications during both forward and backward propagation with
low-precision, energy-efficient operations. In this work, we propose an Adaptive
Layer-wise Scaling PoT
Quantization (ALS-POTQ) method and a Multiplication-Free MAC (MF-MAC) to
replace all of the FP32 multiplications with INT4 additions and 1-bit XOR
operations. In addition, we propose Weight Bias Correction and Parameterized
Ratio Clipping techniques for stable training and improved accuracy. In our
training scheme, none of the above methods introduces extra multiplications, so
we reduce the energy consumption of linear layers during training by up to
95.8%. Experimentally, we achieve an accuracy degradation of less than 1% for
CNN models on ImageNet and a Transformer model on the WMT En-De task. In
summary, we significantly outperform existing methods in both energy efficiency
and accuracy.
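The MF-MAC idea lends itself to a short illustration: once an operand is constrained to a signed power of two, an FP32 product collapses into a low-bit exponent addition plus a 1-bit sign XOR. The sketch below is a minimal approximation of that idea, not the paper's implementation: it assumes both operands are PoT-quantized with a simple max-magnitude layer-wise scale (the paper's ALS-POTQ chooses the scale adaptively during training), and the helper names `pot_quantize` and `mf_mac` are invented for the example.

```python
import numpy as np

def pot_quantize(x, exp_bits=4):
    """Illustrative power-of-two (PoT) quantization: x ~= scale * (-1)**sign * 2**(-e).

    The layer-wise scale is simplified here to the tensor's maximum magnitude;
    ALS-POTQ in the paper adapts this scale per layer.
    """
    scale = np.max(np.abs(x)) + 1e-12
    sign = (x < 0).astype(np.int8)                      # 1-bit sign
    mag = np.abs(x) / scale
    # Round the log2 magnitude to an integer exponent representable in exp_bits bits.
    e = np.clip(np.round(-np.log2(np.maximum(mag, 2.0 ** -(2 ** exp_bits)))),
                0, 2 ** exp_bits - 1).astype(np.int8)   # INT4 exponent
    return sign, e, scale

def mf_mac(a, w, exp_bits=4):
    """Multiplication-free dot product: every elementwise product is formed by
    one INT4 exponent addition and one 1-bit sign XOR; only the final
    accumulation decodes the +/- 2**(-e) terms."""
    sa, ea, scale_a = pot_quantize(a, exp_bits)
    sw, ew, scale_w = pot_quantize(w, exp_bits)
    sign = np.bitwise_xor(sa, sw)                       # sign "multiply" -> XOR
    e = ea + ew                                         # magnitude "multiply" -> integer add
    terms = np.where(sign == 1, -1.0, 1.0) * 2.0 ** (-e.astype(np.float64))
    return scale_a * scale_w * terms.sum()              # accumulate

# Compare the multiplication-free estimate with the FP32 reference dot product.
rng = np.random.default_rng(0)
a, w = rng.standard_normal(64), rng.standard_normal(64)
print(mf_mac(a, w), float(a @ w))
```

The point of the sketch is only that no floating-point multiplication appears between the quantized operands; the paper's Weight Bias Correction and Parameterized Ratio Clipping, which keep such training stable and accurate, are not modeled here.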
Related papers
- Hadamard Domain Training with Integers for Class Incremental Quantized
Learning [1.4416751609100908]
Continual learning can be cost-prohibitive for resource-constrained edge platforms.
We propose a technique that uses Hadamard transforms to enable low-precision training with only integer matrix multiplications (a minimal sketch of this idea appears after this list).
We achieve less than 0.5% and 3% accuracy degradation while quantizing all matrix multiplication inputs down to 4 bits with 8-bit accumulators.
arXiv Detail & Related papers (2023-10-05T16:52:59Z) - Minimizing Energy Consumption of Deep Learning Models by Energy-Aware
Training [26.438415753870917]
We propose EAT, a gradient-based algorithm that aims to reduce energy consumption during model training.
We demonstrate that our energy-aware training algorithm EAT is able to train networks with a better trade-off between classification performance and energy efficiency.
arXiv Detail & Related papers (2023-07-01T15:44:01Z) - DIVISION: Memory Efficient Training via Dual Activation Precision [60.153754740511864]
State-of-the-art work combines a search over quantization bit-widths with training, which makes the procedure complicated and less transparent.
We propose a simple and effective method to compress DNN training.
Experimental results show that DIVISION has better comprehensive performance than state-of-the-art methods, including over 10x compression of activation maps and competitive training throughput, without loss of model accuracy.
arXiv Detail & Related papers (2022-08-05T03:15:28Z) - Energy awareness in low precision neural networks [41.69995577490698]
Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.
We present PANN, a simple approach for approximating any full-precision network by a low-power fixed-precision variant.
In contrast to previous methods, PANN incurs only a minor degradation in accuracy w.r.t. the full-precision version of the network, even when working at the power budget of a 2-bit quantized variant.
arXiv Detail & Related papers (2022-02-06T14:44:55Z) - On the Tradeoff between Energy, Precision, and Accuracy in Federated
Quantized Neural Networks [68.52621234990728]
Federated learning (FL) over wireless networks requires balancing accuracy, energy efficiency, and precision.
We propose a quantized FL framework that represents data with a finite level of precision in both local training and uplink transmission.
Our framework can reduce energy consumption by up to 53% compared to a standard FL model.
arXiv Detail & Related papers (2021-11-15T17:00:03Z) - Positive/Negative Approximate Multipliers for DNN Accelerators [3.1921317895626493]
We present a filter-oriented approximation method to map the weights to the appropriate modes of the approximate multiplier.
Our approach achieves 18.33% energy gains on average across 7 NNs on 4 different datasets for a maximum accuracy drop of only 1%.
arXiv Detail & Related papers (2021-07-20T09:36:24Z) - Low-Precision Training in Logarithmic Number System using Multiplicative
Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework that combines a logarithmic number system (LNS) with a multiplicative weight update method, termed LNS-Madam; a short log-domain sketch appears after this list.
arXiv Detail & Related papers (2021-06-26T00:32:17Z) - SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and
Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, which forces models to be stored in external dynamic random-access memory (DRAM).
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z) - ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet aggressively reduces over 80% hardware-quantified energy cost of DNNs training and inference, while offering comparable or better accuracies.
arXiv Detail & Related papers (2020-10-24T05:09:14Z) - Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z) - ESSOP: Efficient and Scalable Stochastic Outer Product Architecture for
Deep Learning [1.2019888796331233]
Matrix-vector multiplications (MVM) and vector-vector outer products (VVOP) are the two most expensive operations in the training of deep neural networks (DNNs).
We introduce efficient stochastic computing (SC) techniques for the weight update in DNNs, supporting the activation functions required by many state-of-the-art networks.
Our architecture reduces the computational cost by re-using random numbers and replacing certain FP multiplication operations by bit shift scaling.
Hardware design of ESSOP at 14nm technology node shows that, compared to a highly pipelined FP16 multiplier, ESSOP is 82.2% and 93.7% better in energy
arXiv Detail & Related papers (2020-03-25T07:54:42Z)
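For the Hadamard-domain entry above (Hadamard Domain Training with Integers), the mechanism worth spelling out is that a Hadamard matrix H satisfies H H^T = nI, so a matrix product can be computed on Hadamard-rotated operands, and the rotation spreads outliers before low-bit quantization. The sketch below is an assumption-based illustration rather than that paper's implementation: it quantizes the rotated operands to 4-bit integers but does not model the 8-bit accumulators, and the helper names are invented.

```python
import numpy as np
from scipy.linalg import hadamard

def int_quantize(x, bits=4):
    """Symmetric per-tensor quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax + 1e-12
    return np.round(x / scale).astype(np.int32), scale

def hadamard_domain_matmul(a, b, bits=4):
    """Approximate a @ b using an integer-only matmul on Hadamard-rotated,
    low-bit operands. Since H @ H.T = n * I, rotating both sides of the
    inner dimension leaves the product unchanged up to a 1/n factor, while
    spreading outliers so a 4-bit grid captures the values better."""
    n = a.shape[1]                              # inner dimension, power of two
    h = hadamard(n)                             # +/-1 Hadamard matrix
    qa, sa = int_quantize(a @ h, bits)          # rotate, then quantize activations
    qb, sb = int_quantize(h.T @ b, bits)        # rotate, then quantize weights
    acc = qa @ qb                               # integer multiply-accumulate only
    return (sa * sb / n) * acc                  # undo scales and the n*I factor

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 64))
b = rng.standard_normal((64, 16))
print(np.abs(hadamard_domain_matmul(a, b) - a @ b).max())  # small residual error
```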
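For the LNS-Madam entry, the pairing of a logarithmic number system with a multiplicative weight update can be made concrete with one line of algebra: if a weight is stored as log2|w| (sign tracked separately), a multiplicative update w <- w * 2**(-lr * sign(g)) becomes a plain subtraction on the stored log. The snippet below is a hedged sketch of that log-domain step under these assumptions; the exact update rule in LNS-Madam differs, and `lns_madam_step` is an invented name.

```python
import numpy as np

def lns_madam_step(log2_w, grad, lr=0.01):
    """Sketch of a multiplicative (Madam-style) update carried out in the log
    domain of an LNS: subtracting lr * sign(grad) from log2|w| is the same as
    multiplying |w| by 2**(-lr * sign(grad)). Weight signs would be tracked
    separately in a real LNS implementation."""
    return log2_w - lr * np.sign(grad)

# Positive example weights, stored by their base-2 logarithms.
w = np.array([0.5, 0.25, 1.0])
grad = np.array([1.0, -2.0, 0.5])
log2_w = lns_madam_step(np.log2(w), grad)
print(2.0 ** log2_w)  # decoded weights after one log-domain update
```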
This list is automatically generated from the titles and abstracts of the papers in this site.