Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with
Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node
- URL: http://arxiv.org/abs/2007.08952v1
- Date: Fri, 17 Jul 2020 12:56:58 GMT
- Title: Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with
Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node
- Authors: Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, Davide
Rossi, Luca Benini
- Abstract summary: Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise, making aggressive voltage scaling attractive.
We introduce the first fully programmable IoT end-node system-on-chip capable of executing hardware-accelerated BNNs at ultra-low voltage.
Our prototype performs 4Gop/s (15.4 Inference/s on the CIFAR-10 dataset) by computing up to 13 ops per pJ, achieving 22.8 Inference/s/mW while keeping within a peak power envelope of 674uW.
- Score: 15.974669646920331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary Neural Networks (BNNs) have been shown to be robust to random
bit-level noise, making aggressive voltage scaling attractive as a power-saving
technique for both logic and SRAMs. In this work, we introduce the first fully
programmable IoT end-node system-on-chip (SoC) capable of executing
software-defined, hardware-accelerated BNNs at ultra-low voltage. Our SoC
exploits a hybrid memory scheme where error-vulnerable SRAMs are complemented
by reliable standard-cell memories to safely store critical data under
aggressive voltage scaling. On a prototype in 22nm FDX technology, we
demonstrate that both the logic and SRAM voltage can be dropped to 0.5V without
any accuracy penalty on a BNN trained for the CIFAR-10 dataset, improving
energy efficiency by 2.2X w.r.t. nominal conditions. Furthermore, we show that
the supply voltage can be dropped to 0.42V (50% of nominal) while keeping more
than 99% of the nominal accuracy (with a bit error rate ~1/1000). In this
operating point, our prototype performs 4Gop/s (15.4 Inference/s on the CIFAR-10
dataset) by computing up to 13 binary ops per pJ, achieving 22.8 Inference/s/mW
while keeping within a peak power envelope of 674uW - low enough to enable
always-on operation in ultra-low power smart cameras, long-lifetime
environmental sensors, and insect-sized pico-drones.
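The following is a minimal, illustrative sketch (not the authors' code) of the kind of bit-level noise the abstract describes: binarized weights are read back with random sign flips at a given bit error rate (BER), roughly modeling SRAM read failures under aggressive voltage scaling. The layer shape and the list of BER values are assumptions chosen for illustration; only the ~1/1000 error rate is taken from the abstract.

```python
# Illustrative sketch (not the authors' implementation): inject random bit
# flips into 1-bit (binarized) weights at a given bit error rate (BER) to
# mimic SRAM read errors at reduced supply voltage. Shapes are hypothetical;
# BER ~1e-3 corresponds to the ~1/1000 error rate quoted at 0.42V.
import numpy as np

def binarize(w):
    """Sign-binarize real-valued weights to {-1, +1}."""
    return np.where(w >= 0, 1, -1).astype(np.int8)

def inject_bit_errors(w_bin, ber, rng):
    """For 1-bit weights, a storage bit error is simply a sign flip,
    applied independently to each weight with probability `ber`."""
    flips = rng.random(w_bin.shape) < ber
    return np.where(flips, -w_bin, w_bin)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 3, 3, 128))          # hypothetical conv layer
w_bin = binarize(w)

for ber in (1e-4, 1e-3, 1e-2):
    w_noisy = inject_bit_errors(w_bin, ber, rng)
    flipped = np.mean(w_noisy != w_bin)
    print(f"BER={ber:.0e}: {flipped:.4%} of binary weights flipped")
```

In a full evaluation, the corrupted weights would be loaded into the BNN and the CIFAR-10 accuracy re-measured at each BER; the abstract reports that accuracy stays above 99% of nominal at a BER of roughly 1/1000.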
Related papers
- IMAGINE: An 8-to-1b 22nm FD-SOI Compute-In-Memory CNN Accelerator With an End-to-End Analog Charge-Based 0.15-8POPS/W Macro Featuring Distribution-Aware Data Reshaping [0.6071203743728119]
We present IMAGINE, a workload-adaptive 1-to-8b CIM-CNN accelerator in 22nm FD-SOI.
It introduces a 1152x256 end-to-end charge-based macro with a multi-bit DP based on an input-serial, weight-parallel accumulation that avoids power-hungry DACs.
Measurement results showcase an 8b system-level energy efficiency of 40TOPS/W at 0.3/0.6V, with competitive accuracies on MNIST and CIFAR-10.
arXiv Detail & Related papers (2024-12-27T17:18:15Z) - DeltaKWS: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM [16.1102923955667]
This paper introduces the first ΔRNN-enabled fine-grained temporal sparsity-aware KWS IC for voice-controlled devices.
At 87% temporal sparsity, computing latency and energy/inference are reduced by 2.4X/3.4X, respectively.
arXiv Detail & Related papers (2024-05-06T23:41:02Z) - Spiker+: a framework for the generation of efficient Spiking Neural
Networks FPGA accelerators for inference at the edge [49.42371633618761]
Spiker+ is a framework for generating efficient, low-power, and low-area customized Spiking Neural Networks (SNN) accelerators on FPGA for inference at the edge.
Spiker+ is tested on two benchmark datasets, MNIST and the Spiking Heidelberg Digits (SHD).
arXiv Detail & Related papers (2024-01-02T10:42:42Z) - NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural
Network Inference in Low-Voltage Regimes [52.51014498593644]
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
arXiv Detail & Related papers (2023-06-29T11:38:22Z) - Enhanced physics-constrained deep neural networks for modeling vanadium
redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions.
The ePCDNN can accurately capture the voltage response throughout the charge-discharge cycle, including the tail region of the voltage discharge curve.
arXiv Detail & Related papers (2022-03-03T19:56:24Z) - Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure
DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) significantly improves robustness against random or adversarial bit errors in quantized DNN weights (a toy sketch of this kind of bit-error injection appears after this list).
This leads to high energy savings for low-voltage operation as well as low-precision quantization, but also improves security of DNN accelerators.
arXiv Detail & Related papers (2021-04-16T19:11:14Z) - Sound Event Detection with Binary Neural Networks on Tightly
Power-Constrained IoT Devices [20.349809458335532]
Sound event detection (SED) is a hot topic in consumer and smart city applications.
Existing approaches based on Deep Neural Networks are very effective, but highly demanding in terms of memory, power, and throughput.
In this paper, we explore the combination of extreme quantization to a small-print binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller.
arXiv Detail & Related papers (2021-01-12T12:38:23Z) - SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and
Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to reliance on external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z) - Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z) - On the Resilience of Deep Learning for Reduced-voltage FPGAs [1.7998044061364233]
This paper experimentally evaluates the resilience of the training phase of Deep Neural Networks (DNNs) in the presence of voltage underscaling related faults of FPGAs.
We find that modern FPGAs are sufficiently robust at extremely low voltage levels.
Approximately 10% more training is needed to close the resulting accuracy gap.
arXiv Detail & Related papers (2019-12-26T15:08:22Z)