NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural
Network Inference in Low-Voltage Regimes
- URL: http://arxiv.org/abs/2306.16869v2
- Date: Wed, 21 Feb 2024 18:06:01 GMT
- Title: NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural
Network Inference in Low-Voltage Regimes
- Authors: Hao-Lun Sun, Lei Hsiung, Nandhini Chandramoorthy, Pin-Yu Chen,
Tsung-Yi Ho
- Abstract summary: Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
- Score: 52.51014498593644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have become ubiquitous in machine learning, but
their energy consumption remains a notable issue. Lowering the supply voltage
is an effective strategy for reducing energy consumption. However, aggressively
scaling down the supply voltage can lead to accuracy degradation due to random
bit flips in static random access memory (SRAM) where model parameters are
stored. To address this challenge, we introduce NeuralFuse, a novel add-on
module that addresses the accuracy-energy tradeoff in low-voltage regimes by
learning input transformations to generate error-resistant data
representations. NeuralFuse protects DNN accuracy in both nominal and
low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be
readily applied to DNNs with limited access, such as non-configurable hardware
or remote access to cloud-based APIs. Experimental results demonstrate that, at
a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to
24% while recovering accuracy by up to 57%. To the best of our knowledge, this
is the first model-agnostic approach (i.e., no model retraining) to address
low-voltage-induced bit errors. The source code is available at
https://github.com/IBM/NeuralFuse.
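The failure mode the abstract describes, random bit flips in SRAM-stored parameters at a given bit error rate, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' implementation: it assumes 8-bit quantized weights and independent, uniformly distributed bit flips, and the function name `flip_random_bits` is hypothetical.

```python
import numpy as np


def flip_random_bits(weights_q: np.ndarray, bit_error_rate: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Flip each stored bit independently with probability `bit_error_rate`.

    Assumption (not from the paper): weights are 8-bit integers (int8 or
    uint8) and every bit is equally likely to flip.
    """
    if weights_q.dtype.itemsize != 1:
        raise ValueError("this sketch only handles 8-bit weight arrays")
    # Reinterpret the stored bytes as unsigned so XOR acts on raw bits.
    raw = np.ascontiguousarray(weights_q).view(np.uint8)
    # One Bernoulli draw per bit: shape (number of weights, 8).
    flips = rng.random((raw.size, 8)) < bit_error_rate
    # Pack the 8 per-bit draws of each weight into one XOR mask byte.
    mask = np.packbits(flips, axis=1).reshape(raw.shape)
    return (raw ^ mask).view(weights_q.dtype)


# Example: perturb a random int8 weight matrix at a 1% bit error rate.
rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=(256, 256), dtype=np.int8)
perturbed = flip_random_bits(weights, 0.01, rng)

# Fraction of bits that actually changed; should be close to 0.01.
diff = weights.view(np.uint8) ^ perturbed.view(np.uint8)
observed_rate = np.unpackbits(diff).mean()
print(f"observed bit error rate: {observed_rate:.4f}")
```

Evaluating a model with `perturbed` in place of `weights` gives a rough picture of the accuracy loss that a recovery method would need to offset. Real low-voltage SRAM errors may be spatially correlated or biased toward particular cells, which this sketch does not model.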
Related papers
- Improving Reliability of Spiking Neural Networks through Fault Aware
Threshold Voltage Optimization [0.0]
Spiking neural networks (SNNs) have made breakthroughs in computer vision by lending themselves to neuromorphic hardware.
Systolic-array SNN accelerators (systolicSNNs) have been proposed recently, but their reliability is still a major concern.
We present a novel fault mitigation method, i.e., fault-aware threshold voltage optimization in retraining (FalVolt).
arXiv Detail & Related papers (2023-01-12T19:30:21Z)
- MEIL-NeRF: Memory-Efficient Incremental Learning of Neural Radiance
Fields [49.68916478541697]
We develop a Memory-Efficient Incremental Learning algorithm for NeRF (MEIL-NeRF).
MEIL-NeRF takes inspiration from NeRF itself in that a neural network can serve as a memory that provides the pixel RGB values, given rays as queries.
As a result, MEIL-NeRF demonstrates constant memory consumption and competitive performance.
arXiv Detail & Related papers (2022-12-16T08:04:56Z)
- CorrectNet: Robustness Enhancement of Analog In-Memory Computing for
Neural Networks by Error Suppression and Compensation [4.570841222958966]
We propose a framework to enhance the robustness of neural networks under variations and noise.
We show that inference accuracy of neural networks can be recovered from as low as 1.69% under variations and noise.
arXiv Detail & Related papers (2022-11-27T19:13:33Z)
- Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- On the Tradeoff between Energy, Precision, and Accuracy in Federated
Quantized Neural Networks [68.52621234990728]
Federated learning (FL) over wireless networks requires balancing between accuracy, energy efficiency, and precision.
We propose a quantized FL framework that represents data with a finite level of precision in both local training and uplink transmission.
Our framework can reduce energy consumption by up to 53% compared to a standard FL model.
arXiv Detail & Related papers (2021-11-15T17:00:03Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks
considering Unreliable Memories [14.933137030206286]
Spiking neural networks (SNNs) have shown a potential for having low energy with unsupervised learning capabilities.
They may suffer from accuracy degradation if their processing is performed under the presence of hardware-induced faults in memories.
We propose ReSpawn, a novel framework for mitigating the negative impacts of faults in both the off-chip and on-chip memories.
arXiv Detail & Related papers (2021-08-23T16:17:33Z)
- Enabling Incremental Training with Forward Pass for Edge Devices [0.0]
We introduce a method using evolutionary strategy (ES) that can partially retrain the network enabling it to adapt to changes and recover after an error has occurred.
This technique enables training on inference-only hardware without backpropagation and with minimal resource overhead.
arXiv Detail & Related papers (2021-03-25T17:43:04Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
- Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized
Neural Networks [7.349786872131006]
Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy-efficient storage technology.
Binary neural networks (BNNs) can tolerate a certain percentage of errors without a loss in accuracy.
The bit error tolerance (BET) in BNNs can be achieved by flipping the weight signs during training.
arXiv Detail & Related papers (2020-02-03T17:38:45Z)