Leveraging Residue Number System for Designing High-Precision Analog
Deep Neural Network Accelerators
- URL: http://arxiv.org/abs/2306.09481v1
- Date: Thu, 15 Jun 2023 20:24:18 GMT
- Title: Leveraging Residue Number System for Designing High-Precision Analog
Deep Neural Network Accelerators
- Authors: Cansu Demirkiran, Rashmi Agrawal, Vijay Janapa Reddi, Darius Bunandar,
and Ajay Joshi
- Abstract summary: We use the residue number system (RNS) to compose high-precision operations from multiple low-precision operations.
RNS can achieve 99% of FP32 accuracy for state-of-the-art DNN inference using data converters with only 6-bit precision.
- Score: 3.4218508703868595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving high accuracy while maintaining good energy efficiency in analog
DNN accelerators is challenging because high-precision data converters are
expensive. In this paper, we overcome this challenge by using the residue
number system (RNS) to compose high-precision operations from multiple
low-precision operations. This eliminates the information loss caused by the
limited precision of the ADCs. Our study shows that RNS can achieve 99% of
FP32 accuracy for state-of-the-art DNN inference using data converters with
only 6-bit precision. We propose using redundant RNS to achieve a
fault-tolerant analog accelerator. In addition, we show that RNS can reduce
the energy consumption of the data converters within an analog accelerator by
several orders of magnitude compared to a regular fixed-point approach.
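To make the core idea concrete, here is a minimal Python sketch (illustrative, not the paper's implementation) of how RNS composes a high-precision dot product from independent low-precision modular channels, and how a single redundant residue can flag a faulty channel. The moduli, the fault model, and the detection rule are assumptions chosen for clarity.

```python
# A minimal sketch of composing a high-precision dot product from
# low-precision modular operations via the residue number system (RNS).
from math import prod

MODULI = [61, 63, 64]    # pairwise coprime; every residue fits in 6 bits
REDUNDANT = 67           # one extra modulus used only for fault detection

def to_rns(x, moduli=MODULI):
    """Encode an integer as its residues modulo each modulus."""
    return [x % m for m in moduli]

def from_rns(residues, moduli=MODULI):
    """Decode residues back to an integer via the Chinese Remainder Theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(Mi, -1, m): modular inverse (Python 3.8+)
    return x % M

def rns_dot(a, b, moduli=MODULI):
    """Dot product computed independently per modulus; each channel's values
    stay below its modulus, so low-precision hardware per channel suffices."""
    res = [0] * len(moduli)
    for x, y in zip(a, b):
        for i, m in enumerate(moduli):
            res[i] = (res[i] + (x % m) * (y % m)) % m
    return res

a, b = [17, 42, 99, 3], [5, 28, 61, 200]
exact = sum(x * y for x, y in zip(a, b))        # 7900 < 61*63*64, so RNS is exact
assert from_rns(rns_dot(a, b)) == exact

# Redundant RNS (detection-only sketch): carry one extra channel and check it
# against the value decoded from the non-redundant channels. One redundant
# modulus catches most single-channel errors; real designs add more redundancy
# for guaranteed detection and correction.
res = rns_dot(a, b)
red = rns_dot(a, b, [REDUNDANT])[0]             # the redundant channel
assert from_rns(res) % REDUNDANT == red         # consistent: no fault
res[0] ^= 1                                     # inject a single-channel fault
assert from_rns(res) % REDUNDANT != red         # mismatch flags the fault
```

Each channel only ever stores values below its modulus, which is why 6-bit data converters suffice in the paper's setting; and since ADC energy is known to grow roughly exponentially with resolution, trading one high-precision conversion for a few 6-bit ones is the intuition behind the claimed orders-of-magnitude converter energy savings.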
Related papers
- Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting [0.0]
This brief introduces a simple approach for enabling reduced-voltage operation of Deep Neural Network (DNN) accelerators through software modifications alone.
We demonstrate 18% to 25% energy savings with no loss of model accuracy and negligible throughput compromise.
arXiv Detail & Related papers (2024-10-17T10:29:15Z) - A Pipelined Memristive Neural Network Analog-to-Digital Converter [0.24578723416255754]
This paper proposes a scalable and modular neural network ADC architecture based on a pipeline of four-bit converters.
An 8-bit pipelined ADC achieves 0.18 LSB INL, 0.20 LSB DNL, 7.6 ENOB, and 0.97 fJ/conv FOM.
arXiv Detail & Related papers (2024-06-04T10:51:12Z) - Towards Cheaper Inference in Deep Networks with Lower Bit-Width
Accumulators [25.100092698906437]
Current hardware still relies on high-accuracy core operations.
This is because, so far, the use of low-precision accumulators has led to a significant degradation in performance.
We present a simple method to train and fine-tune high-end DNNs that allows, for the first time, the use of cheaper 12-bit accumulators.
arXiv Detail & Related papers (2024-01-25T11:46:01Z) - Mirage: An RNS-Based Photonic Accelerator for DNN Training [2.2750171530507695]
Photonic computing is a compelling avenue for performing highly efficient matrix multiplication, a crucial operation in Deep Neural Networks (DNNs).
This paper proposes Mirage, a photonic DNN training accelerator that overcomes the precision challenges in photonic hardware using the Residue Number System (RNS).
RNS is a numeral system based on modular arithmetic, allowing us to perform high-precision operations via multiple low-precision modular operations.
arXiv Detail & Related papers (2023-11-29T02:40:12Z) - A Blueprint for Precise and Fault-Tolerant Analog Neural Networks [1.6039298125810306]
High-precision data converters are costly and impractical for deep neural networks.
We address this challenge by using the residue number system (RNS).
RNS allows composing high-precision operations from multiple low-precision operations.
arXiv Detail & Related papers (2023-09-19T17:00:34Z) - NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural
Network Inference in Low-Voltage Regimes [52.51014498593644]
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
arXiv Detail & Related papers (2023-06-29T11:38:22Z) - Energy-Efficient Model Compression and Splitting for Collaborative
Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and CO2 emissions compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z) - Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure
DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly.
This leads to high energy savings for low-voltage operation as well as low-precision quantization, and also improves the security of DNN accelerators (an illustrative sketch of random bit-error injection appears after this list).
arXiv Detail & Related papers (2021-04-16T19:11:14Z) - Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input [54.82369261350497]
We propose a CTC-enhanced NAR transformer, which generates the target sequence by refining the predictions of the CTC module.
Experimental results show that our method outperforms all previous NAR counterparts and achieves 50x faster decoding than a strong AR baseline, with only 0.0 to 0.3 absolute CER degradation on the Aishell-1 and Aishell-2 datasets.
arXiv Detail & Related papers (2020-10-28T15:00:09Z) - Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation and low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z) - Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing [71.86955275376604]
We propose an adaptive anomaly detection approach for hierarchical edge computing (HEC) systems.
We design an adaptive scheme to select one of the models based on the contextual information extracted from input data, to perform anomaly detection.
We evaluate our proposed approach using a real IoT dataset, and demonstrate that it reduces detection delay by 84% while maintaining almost the same accuracy as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-01-10T05:29:17Z)
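The two RandBET entries above train quantized networks while randomly flipping bits in the stored weights. As a point of reference for what that injection step looks like, here is an illustrative NumPy sketch; the bit width, error rate, and quantizer are assumptions, not the papers' exact setup.

```python
# Illustrative sketch of random bit-error injection into fixed-point weights,
# in the spirit of RandBET; bit width and error rate are assumed values.
import numpy as np

BITS = 8        # assumed fixed-point weight width
P_FLIP = 0.01   # assumed per-bit error rate (cf. the 1% rate in the NeuralFuse entry)

def quantize(w, bits=BITS):
    """Symmetric fixed-point quantization to signed integers plus a scale."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale).astype(np.int64), scale

def inject_bit_errors(q, bits=BITS, p=P_FLIP, rng=None):
    """Flip each stored bit independently with probability p."""
    rng = rng or np.random.default_rng()
    u = q & ((1 << bits) - 1)                  # two's-complement bit pattern
    for b in range(bits):
        flips = rng.random(q.shape) < p
        u = np.where(flips, u ^ (1 << b), u)
    return np.where(u >= 1 << (bits - 1), u - (1 << bits), u)  # sign-extend

w = np.random.randn(64, 64).astype(np.float32)
w = np.clip(w, -1.0, 1.0)                # weight clipping limits a flipped MSB's damage
q, scale = quantize(w)
w_noisy = inject_bit_errors(q) * scale   # use in the forward pass during training
```

In RandBET-style training, this injection runs on every forward pass so the network learns weights that tolerate the flips; the clipping step above reflects the weight clipping these papers combine with robust fixed-point quantization.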
This list is automatically generated from the titles and abstracts of the papers on this site.