Energy awareness in low precision neural networks
- URL: http://arxiv.org/abs/2202.02783v1
- Date: Sun, 6 Feb 2022 14:44:55 GMT
- Title: Energy awareness in low precision neural networks
- Authors: Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov and
Tomer Michaeli
- Abstract summary: Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.
We present PANN, a simple approach for approximating any full-precision network by a low-power fixed-precision variant.
In contrast to previous methods, PANN incurs only a minor degradation in accuracy w.r.t. the full-precision version of the network, even when working at the power-budget of a 2-bit quantized variant.
- Score: 41.69995577490698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Power consumption is a major obstacle in the deployment of deep neural
networks (DNNs) on end devices. Existing approaches for reducing power
consumption rely on quite general principles, including avoidance of
multiplication operations and aggressive quantization of weights and
activations. However, these methods do not take into account the precise power
consumed by each module in the network, and are therefore not optimal. In this
paper we develop accurate power consumption models for all arithmetic
operations in the DNN, under various working conditions. We reveal several
important factors that have been overlooked to date. Based on our analysis, we
present PANN (power-aware neural network), a simple approach for approximating
any full-precision network by a low-power fixed-precision variant. Our method
can be applied to a pre-trained network, and can also be used during training
to achieve improved performance. In contrast to previous methods, PANN incurs
only a minor degradation in accuracy w.r.t. the full-precision version of the
network, even when working at the power-budget of a 2-bit quantized variant. In
addition, our scheme enables seamless traversal of the power-accuracy
trade-off at deployment time, which is a major advantage over existing
quantization methods that are constrained to specific bit widths.
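As an illustration of the kind of fixed-precision approximation the abstract refers to, here is a minimal NumPy sketch of symmetric uniform weight quantization at a selectable bit-width, with a small bit-width sweep standing in for traversing the power-accuracy trade-off at deployment time. This is a generic sketch, not the PANN algorithm; the function name, clipping rule, and 2/4/8-bit sweep are assumptions.
```python
import numpy as np

def quantize_symmetric(w: np.ndarray, bits: int) -> np.ndarray:
    """Round weights onto a symmetric uniform grid with 2**bits levels.

    Generic illustration only; PANN's power-aware construction differs.
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    scale = float(np.max(np.abs(w))) / qmax
    if scale == 0.0:
        scale = 1.0                               # all-zero tensor edge case
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                              # dequantized approximation

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

# Sweeping the bit-width stands in for traversing a power-accuracy trade-off.
for bits in (2, 4, 8):
    err = np.linalg.norm(w - quantize_symmetric(w, bits)) / np.linalg.norm(w)
    print(f"{bits}-bit relative weight error: {err:.4f}")
```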
Related papers
- Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation [0.0]
We show that a hardware neural network accelerator with PoT weights implemented on the Zynq UltraScale+ MPSoC ZCU104 FPGA can be at least 1.4x more energy efficient than the uniform quantisation version; a PoT rounding sketch follows this entry.
arXiv Detail & Related papers (2022-09-30T06:33:40Z)
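Power-of-two (PoT) quantisation, as in the entry above, restricts each weight to a signed power of two so that multiplication reduces to a bit shift in hardware. Below is a minimal rounding sketch, assuming an exponent range of [-8, 0] and snapping very small weights to zero; the accelerator's actual quantiser and FPGA mapping are not reproduced here.
```python
import numpy as np

def quantize_pot(w: np.ndarray, min_exp: int = -8, max_exp: int = 0) -> np.ndarray:
    """Map each weight to the nearest signed power of two (or zero).

    Illustrative rounding only; the referenced accelerator's scheme may differ.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 1.0))), min_exp, max_exp)
    pot = sign * np.exp2(exp)
    # Weights smaller than half the smallest representable magnitude become zero.
    return np.where(mag < 2.0 ** (min_exp - 1), 0.0, pot)

w = np.array([0.30, -0.07, 0.55, 0.0009, -1.2])
print(quantize_pot(w))   # [ 0.25   -0.0625  0.5     0.     -1.    ]
```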
- CoNLoCNN: Exploiting Correlation and Non-Uniform Quantization for Energy-Efficient Low-precision Deep Convolutional Neural Networks [13.520972975766313]
We propose a framework to enable energy-efficient low-precision deep convolutional neural network inference by exploiting non-uniform quantization of weights.
We also propose a novel data representation format, Encoded Low-Precision Binary Signed Digit, to compress the bit-width of weights.
arXiv Detail & Related papers (2022-07-31T01:34:56Z)
- Standard Deviation-Based Quantization for Deep Neural Networks [17.495852096822894]
Quantization of deep neural networks is a promising approach that reduces the inference cost.
We propose a new framework to learn the quantization intervals (discrete values) using the knowledge of the network's weight and activation distributions.
Our scheme simultaneously prunes the network's parameters and allows us to flexibly adjust the pruning ratio during the quantization process; a standard-deviation-based clipping sketch follows this entry.
arXiv Detail & Related papers (2022-02-24T23:33:47Z)
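The standard-deviation-based quantization entry above learns quantization intervals from the weight and activation distributions. A common concrete instance of the idea is to clip a tensor at a multiple of its standard deviation before uniform rounding; the sketch below uses a fixed 3-sigma clip and 4 bits, which are assumptions rather than the paper's learned intervals, and it omits the pruning coupling.
```python
import numpy as np

def quantize_std_clipped(x: np.ndarray, bits: int = 4, k: float = 3.0) -> np.ndarray:
    """Clip at k standard deviations, then quantize uniformly.

    In the referenced paper the interval is learned; here k is a fixed guess.
    """
    clip = k * np.std(x)
    qmax = 2 ** (bits - 1) - 1
    scale = clip / qmax
    q = np.clip(np.round(np.clip(x, -clip, clip) / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(10_000)
print("max |error|:", np.max(np.abs(w - quantize_std_clipped(w))))
```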
- On the Tradeoff between Energy, Precision, and Accuracy in Federated Quantized Neural Networks [68.52621234990728]
Federated learning (FL) over wireless networks requires balancing between accuracy, energy efficiency, and precision.
We propose a quantized FL framework that represents data with a finite level of precision in both local training and uplink transmission.
Our framework can reduce energy consumption by up to 53% compared to a standard FL model.
arXiv Detail & Related papers (2021-11-15T17:00:03Z)
- Physics-Informed Neural Networks for AC Optimal Power Flow [0.0]
This paper introduces, for the first time, physics-informed neural networks to accurately estimate the AC-OPF result.
We show how physics-informed neural networks achieve higher accuracy and lower constraint violations than standard neural networks; a sketch of the penalized loss idea follows this entry.
arXiv Detail & Related papers (2021-10-06T11:44:59Z)
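For the physics-informed entry above, the core idea is a training loss that adds a penalty on violations of the governing equations to the usual supervised error. The sketch below shows only that loss structure; the AC power-flow residual itself is abstracted into a placeholder array, and the numbers are hypothetical.
```python
import numpy as np

def pinn_loss(pred, target, physics_residual, lam=1.0):
    """Supervised error plus a penalty on physics-equation violations.

    `physics_residual` stands in for the AC power-flow mismatch terms,
    which are not reproduced here.
    """
    data_term = np.mean((pred - target) ** 2)
    physics_term = np.mean(physics_residual ** 2)
    return data_term + lam * physics_term

pred = np.array([1.02, 0.98])          # hypothetical predicted bus voltages
target = np.array([1.00, 1.00])        # solver ground truth
residual = np.array([0.03, -0.01])     # hypothetical power-balance mismatch
print(pinn_loss(pred, target, residual))
```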
- Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update [49.948082497688404]
Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts.
One promising approach to reduce the energy costs is representing DNNs with low-precision numbers.
We jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam; a toy log-domain update sketch follows this entry.
arXiv Detail & Related papers (2021-06-26T00:32:17Z)
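In the LNS-Madam entry above, weights live in a logarithmic number system, so a multiplicative weight update becomes an addition on the stored log-magnitude. The toy step below illustrates only that interaction; the sign handling, learning rate, and update rule are simplified assumptions, not the LNS-Madam algorithm.
```python
import numpy as np

# Weights are stored as (sign, log2 magnitude); a multiplicative update
# w <- w * 2^(-lr * sign(w) * g) is just an addition in the log domain.
def madam_like_step(sign, log2_mag, grad, lr=0.01):
    """Toy multiplicative update applied to a log-domain representation."""
    log2_mag = log2_mag - lr * sign * grad       # additive in log space
    return sign, log2_mag

def to_float(sign, log2_mag):
    return sign * np.exp2(log2_mag)

sign = np.array([1.0, -1.0, 1.0])
log2_mag = np.log2(np.array([0.5, 0.25, 1.0]))   # magnitudes stored as log2
grad = np.array([0.2, -0.1, 0.4])                # hypothetical gradients

sign, log2_mag = madam_like_step(sign, log2_mag, grad)
print(to_float(sign, log2_mag))
```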
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet aggressively reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracies; a shift-and-add primitive sketch follows this entry.
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
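ShiftAddNet's premise, per the entry above, is that multiplications can be traded for cheaper shift and add operations. The toy functions below illustrate the two primitives (a power-of-two "multiply" done as a bit shift, and an additions-only similarity in the spirit of adder layers); the actual layer design and training procedure are not reproduced.
```python
import numpy as np

def shift_multiply(x: np.ndarray, exponents: np.ndarray) -> np.ndarray:
    """Multiply integer activations by power-of-two weights using bit shifts."""
    return np.left_shift(x, exponents)          # x * 2**exponents, no multiplier

def add_layer_response(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Adder-style similarity: negative L1 distance, additions/subtractions only."""
    return -np.sum(np.abs(x - w))

x = np.array([3, 5, 7], dtype=np.int64)
print(shift_multiply(x, np.array([1, 2, 3])))                           # [ 6 20 56]
print(add_layer_response(x.astype(float), np.array([2.0, 5.0, 9.0])))   # -3.0
```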
- Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z)
- A Spike in Performance: Training Hybrid-Spiking Neural Networks with Quantized Activation Functions [6.574517227976925]
Spiking neural networks (SNNs) are a promising approach to energy-efficient computing.
We show how to maintain state-of-the-art accuracy when converting a non-spiking network into an SNN; a quantized-activation sketch follows this entry.
arXiv Detail & Related papers (2020-02-10T05:24:27Z)
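The hybrid-spiking entry above converts non-spiking networks by swapping standard activations for quantized ones, whose discrete outputs can be read as spike counts over a time window. Below is a minimal quantized-ReLU sketch, with the level count and clipping range chosen arbitrarily; the paper's exact activation family is not reproduced.
```python
import numpy as np

def quantized_relu(x: np.ndarray, levels: int = 8, x_max: float = 1.0) -> np.ndarray:
    """ReLU whose output is snapped to `levels` evenly spaced values in [0, x_max].

    The discrete outputs play the role of spike counts per time window.
    """
    step = x_max / levels
    return np.clip(np.round(np.clip(x, 0.0, x_max) / step), 0, levels) * step

x = np.linspace(-0.5, 1.5, 9)
print(quantized_relu(x))
```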
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)