An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for
Neural Network Acceleration
- URL: http://arxiv.org/abs/2005.03451v2
- Date: Wed, 30 Dec 2020 22:40:58 GMT
- Title: An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for
Neural Network Acceleration
- Authors: Behzad Salami, Erhan Baturay Onural, Ismail Emir Yuksel, Fahrettin
Koc, Oguz Ergin, Adrian Cristal Kestelman, Osman S. Unsal, Hamid
Sarbazi-Azad, Onur Mutlu
- Abstract summary: Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase.
We experimentally study the reduced-voltage operation of multiple components of real FPGAs.
We propose techniques to minimize the drawbacks of reduced-voltage operation.
- Score: 9.06484009562659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We empirically evaluate an undervolting technique, i.e., underscaling the
circuit supply voltage below the nominal level, to improve the power-efficiency
of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable
Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing
faults due to excessive circuit latency increase. We evaluate the
reliability-power trade-off for such accelerators. Specifically, we
experimentally study the reduced-voltage operation of multiple components of
real FPGAs, characterize the corresponding reliability behavior of CNN
accelerators, propose techniques to minimize the drawbacks of reduced-voltage
operation, and combine undervolting with architectural CNN optimization
techniques, i.e., quantization and pruning. We investigate the effect of
environmental temperature on the reliability-power trade-off of such
accelerators. We perform experiments on three identical samples of modern
Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification
CNN benchmarks. This setup allows us to study the effects of our
undervolting technique under both software (CNN workload) and hardware (device sample) variability. We achieve
more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain
is the result of eliminating the voltage guardband region, i.e., the safe
voltage region below the nominal level that is set by the FPGA vendor to ensure
correct functionality in worst-case environmental and circuit conditions. 43%
of the power-efficiency gain is due to further undervolting below the
guardband, which comes at the cost of accuracy loss in the CNN accelerator. We
evaluate an effective frequency underscaling technique that prevents this
accuracy loss, and find that it reduces the power-efficiency gain from 43% to
25%.
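The reported figures compose multiplicatively: the 2.6X guardband-elimination gain times the additional 43% (or 25% with frequency underscaling) below-guardband gain yields the overall more-than-3X improvement. The back-of-the-envelope sketch below works this out; the baseline throughput and power numbers are illustrative assumptions, and only the 2.6X, 43%, and 25% factors come from the abstract.

```python
# Back-of-the-envelope sketch of how the reported power-efficiency gains compose.
# Only the 2.6x, 43%, and 25% factors come from the abstract; the baseline
# throughput and power figures below are illustrative assumptions.

def gops_per_watt(gops: float, watts: float) -> float:
    """Power-efficiency metric used in the abstract (GOPs/W)."""
    return gops / watts

# Hypothetical CNN accelerator running at the nominal supply voltage.
baseline = gops_per_watt(gops=1000.0, watts=20.0)          # 50.0 GOPs/W

# Undervolting down to the edge of the vendor guardband: 2.6x gain, no faults.
guardband_eliminated = baseline * 2.6

# Further undervolting below the guardband: an additional 43% gain,
# at the cost of CNN accuracy loss caused by timing faults.
below_guardband = guardband_eliminated * 1.43

# Frequency underscaling prevents that accuracy loss but shrinks the
# below-guardband gain from 43% to 25%.
below_guardband_fscaled = guardband_eliminated * 1.25

for label, value in [
    ("nominal voltage", baseline),
    ("guardband eliminated", guardband_eliminated),
    ("below guardband", below_guardband),
    ("below guardband + freq. underscaling", below_guardband_fscaled),
]:
    print(f"{label:38s} {value:7.1f} GOPs/W  ({value / baseline:.2f}x)")
```

Relative to the assumed baseline, this prints roughly 2.6X, 3.7X, and 3.25X, consistent with the abstract's more-than-3X overall figure.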
Related papers
- Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting [0.0]
This brief introduces a simple approach for enabling reduced-voltage operation of Deep Neural Network (DNN) accelerators through software modifications alone.
We demonstrate 18% to 25% energy savings with no loss of model accuracy and negligible throughput compromise.
(arXiv 2024-10-17)
- On-Chip Learning with Memristor-Based Neural Networks: Assessing Accuracy and Efficiency Under Device Variations, Conductance Errors, and Input Noise [0.0]
This paper presents a memristor-based compute-in-memory hardware accelerator for on-chip training and inference.
The hardware, consisting of 30 memristors and 4 neurons, uses three different M-SDC structures with tungsten, chromium, and carbon media to perform binary image classification tasks.
(arXiv 2024-08-26)
- Subtractor-Based CNN Inference Accelerator [3.663763133721262]
This paper presents a novel method to boost the performance of CNN inference accelerators by utilizing subtractors.
With a rounding size of 0.05, the proposed design achieves 32.03% power savings and a 24.59% reduction in area at the cost of only 0.1% accuracy loss.
(arXiv 2023-10-02)
- NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes [52.51014498593644]
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
(arXiv 2023-06-29)
- Improving Reliability of Spiking Neural Networks through Fault Aware Threshold Voltage Optimization [0.0]
Spiking neural networks (SNNs) have made breakthroughs in computer vision by lending themselves to neuromorphic hardware.
Systolic-array SNN accelerators (systolicSNNs) have been proposed recently, but their reliability is still a major concern.
We present a novel fault mitigation method, i.e., fault-aware threshold voltage optimization in retraining (FalVolt).
(arXiv 2023-01-12)
- HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression [69.36555801766762]
We propose a hardware-aware tensor decomposition framework, dubbed HEAT, that enables efficient exploration of the exponential space of possible decompositions.
We experimentally show that our hardware-aware factorized BERT variants reduce the energy-delay product by 5.7x with less than 1.1% accuracy loss.
(arXiv 2022-11-30)
- Stabilizing Voltage in Power Distribution Networks via Multi-Agent Reinforcement Learning with Transformer [128.19212716007794]
We propose a Transformer-based Multi-Agent Actor-Critic framework (T-MAAC) to stabilize voltage in power distribution networks.
In addition, we adopt a novel auxiliary-task training process tailored to the voltage control task, which improves the sample efficiency.
(arXiv 2022-06-08)
- AdaViT: Adaptive Tokens for Efficient Vision Transformer [91.88404546243113]
We introduce AdaViT, a method that adaptively adjusts the inference cost of vision transformer (ViT) for images of different complexity.
AdaViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.
(arXiv 2021-12-14)
- Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) significantly improves robustness against random or adversarial bit errors in quantized DNN weights (see the bit-flip injection sketch after this list).
This leads to high energy savings for low-voltage operation as well as low-precision quantization, but also improves the security of DNN accelerators.
(arXiv 2021-04-16)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation and low-precision quantization.
(arXiv 2020-06-24)
- On the Resilience of Deep Learning for Reduced-voltage FPGAs [1.7998044061364233]
This paper experimentally evaluates the resilience of the training phase of Deep Neural Networks (DNNs) in the presence of voltage-underscaling-related faults on FPGAs.
We find that modern FPGAs are robust enough at extremely low voltage levels.
Approximately 10% more training is needed to close the accuracy gap.
(arXiv 2019-12-26)
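As referenced in the RandBET entries above, here is a minimal sketch of random bit error injection on fixed-point weights, the core ingredient those papers train against. This is a generic illustration rather than code from either paper; the int8 format and 1% error rate are assumptions.

```python
import numpy as np

# Minimal sketch of random bit error injection on int8-quantized weights,
# the core ingredient of random bit error training (RandBET) summarized above.
# The error rate and quantization format are illustrative assumptions.

def inject_random_bit_errors(weights_q: np.ndarray, p: float,
                             rng: np.random.Generator) -> np.ndarray:
    """Flip every bit of an int8 weight tensor independently with probability p."""
    assert weights_q.dtype == np.int8
    bits = np.unpackbits(weights_q.view(np.uint8))          # 8 bits per weight
    flips = (rng.random(bits.shape) < p).astype(np.uint8)   # Bernoulli(p) mask per bit
    corrupted = np.packbits(bits ^ flips).view(np.int8)
    return corrupted.reshape(weights_q.shape)

rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=(64, 64), dtype=np.int8)  # toy quantized layer
faulty = inject_random_bit_errors(weights, p=0.01, rng=rng)      # ~1% bit error rate
print(f"{np.count_nonzero(weights != faulty)} of {weights.size} weights corrupted")

# In RandBET-style training, such corrupted weights would be used in the forward
# pass so the learned model tolerates the bit errors seen at low voltage.
```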