MemSE: Fast MSE Prediction for Noisy Memristor-Based DNN Accelerators
- URL: http://arxiv.org/abs/2205.01707v1
- Date: Tue, 3 May 2022 18:10:43 GMT
- Title: MemSE: Fast MSE Prediction for Noisy Memristor-Based DNN Accelerators
- Authors: Jonathan Kern, Sébastien Henwood, Gonçalo Mordido, Elsa Dupraz,
Abdeldjalil Aïssa-El-Bey, Yvon Savaria and François Leduc-Primeau
- Abstract summary: We theoretically analyze the mean squared error of DNNs that use memristors to compute matrix-vector multiplications (MVM).
We take into account both the quantization noise, due to the necessity of reducing the DNN model size, and the programming noise, stemming from the variability during the programming of the memristance value.
The proposed method is almost two orders of magnitude faster than Monte-Carlo simulation, making it possible to optimize the implementation parameters to achieve minimal error for a given power constraint.
- Score: 5.553959304125023
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memristors enable the computation of matrix-vector multiplications (MVM) in
memory and therefore show great potential for substantially increasing the energy
efficiency of deep neural network (DNN) inference accelerators. However,
computations in memristors suffer from hardware non-idealities and are subject
to different sources of noise that may negatively impact system performance. In
this work, we theoretically analyze the mean squared error of DNNs that use
memristor crossbars to compute MVM. We take into account both the quantization
noise, due to the necessity of reducing the DNN model size, and the programming
noise, stemming from the variability during the programming of the memristance
value. Simulations on pre-trained DNN models showcase the accuracy of the
analytical prediction. Furthermore, the proposed method is almost two orders of
magnitude faster than Monte-Carlo simulation, thus making it possible to
optimize the implementation parameters to achieve minimal error for a given
power constraint.
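To make the noise model concrete, the following is a minimal Monte-Carlo sketch in NumPy of the setting described above: weights are uniformly quantized and the stored values are perturbed by Gaussian programming noise before the MVM. The multiplicative noise form, parameter names, and values are illustrative assumptions rather than the paper's implementation; the paper's contribution is a closed-form MSE prediction that avoids exactly this kind of repeated sampling.

```python
# Minimal Monte-Carlo sketch (not the paper's method) of the two noise
# sources above: uniform weight quantization plus multiplicative Gaussian
# programming noise on the stored values. Names and the exact noise model
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, n_bits):
    """Uniform symmetric quantization of a weight matrix (quantization noise)."""
    scale = np.max(np.abs(w)) / (2 ** (n_bits - 1) - 1)
    return np.round(w / scale) * scale

def noisy_mvm(w_q, x, sigma):
    """MVM with multiplicative Gaussian programming noise on each cell."""
    w_prog = w_q * (1.0 + sigma * rng.standard_normal(w_q.shape))
    return w_prog @ x

w = rng.standard_normal((64, 128))   # toy layer weights
x = rng.standard_normal(128)         # toy input
y_ideal = w @ x

# Monte-Carlo MSE estimate for one (n_bits, sigma) operating point;
# the paper's analytical prediction replaces this sampling loop.
w_q = quantize(w, n_bits=4)
trials = np.stack([noisy_mvm(w_q, x, sigma=0.05) for _ in range(10_000)])
mse = np.mean((trials - y_ideal) ** 2)
print(f"Monte-Carlo MSE estimate: {mse:.4f}")
```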
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose Scalable Mechanistic Neural Networks (S-MNN), an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the time and space complexities from cubic and quadratic in the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z)
- Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections [8.813981342105151]
We study the problem of pinpointing the worst-case performance of CiM accelerators affected by device variations.
We propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training.
Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.
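As a point of reference, the sketch below shows a generic noise-injection training step in PyTorch, one of the two ingredients A-TRICE combines; the adversarial-training component and the paper's specific noise model are omitted, and all names and values are illustrative.

```python
# Generic noise-injection training step (illustrative, not the A-TRICE code):
# perturb the weights with Gaussian device-variation noise during the forward
# pass so the network learns to tolerate it.
import torch
import torch.nn as nn

def forward_with_weight_noise(layer: nn.Linear, x: torch.Tensor, sigma: float):
    """Forward pass through a linear layer with transient Gaussian weight noise."""
    noise = sigma * torch.randn_like(layer.weight)
    return nn.functional.linear(x, layer.weight + noise, layer.bias)

layer = nn.Linear(128, 10)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x = torch.randn(32, 128)             # dummy batch
y = torch.randint(0, 10, (32,))      # dummy labels

logits = forward_with_weight_noise(layer, x, sigma=0.05)
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                      # gradients flow through the noisy weights
opt.step()
```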
arXiv Detail & Related papers (2023-12-11T05:56:00Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
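A minimal shared-backbone, multi-head module in PyTorch, sketching the general architecture such an approach builds on; the layer sizes and choices here are assumptions, not the paper's design.

```python
# Shared backbone feeding several independent prediction heads (illustrative).
import torch
import torch.nn as nn

class MultiHeadModel(nn.Module):
    def __init__(self, d_in=32, d_hidden=64, d_out=8, n_heads=4):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(d_hidden, d_out) for _ in range(n_heads))

    def forward(self, x):
        h = self.backbone(x)                        # features shared by all heads
        return torch.stack([head(h) for head in self.heads], dim=1)

out = MultiHeadModel()(torch.randn(16, 32))
print(out.shape)  # torch.Size([16, 4, 8]): one prediction per head
```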
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Fast and Accurate Reduced-Order Modeling of a MOOSE-based Additive Manufacturing Model with Operator Learning [1.4528756508275622]
The present work constructs a fast and accurate reduced-order model (ROM) of an additive manufacturing (AM) model using operator learning (OL) methods.
We benchmarked the performance of these OL methods against a conventional deep neural network (DNN)-based ROM.
arXiv Detail & Related papers (2023-08-04T17:00:34Z)
- Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators through Training with Right-Censored Gaussian Noise [16.470952550714394]
We propose to use the k-th percentile performance (KPP) to capture the realistic worst-case performance of DNN models executing on CiM accelerators.
Our method achieves up to a 26% improvement in KPP compared to the state-of-the-art methods employed to enhance robustness under the impact of device variations.
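The KPP metric itself is easy to state in code: evaluate the model under many sampled device-variation draws and report a low percentile of the resulting accuracy distribution rather than its mean. The data below is synthetic and the choice k = 1 is an illustrative assumption.

```python
# k-th percentile performance (KPP) over sampled device variations (sketch).
import numpy as np

def kpp(accuracies, k=1.0):
    """k-th percentile of accuracies across Monte-Carlo noise draws."""
    return np.percentile(accuracies, k)

# e.g. accuracies from 1000 noisy evaluations of the same model
accs = 0.90 + 0.02 * np.random.default_rng(0).standard_normal(1000)
print(f"mean: {accs.mean():.3f}  KPP(k=1): {kpp(accs, 1.0):.3f}")
```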
arXiv Detail & Related papers (2023-07-29T01:06:37Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
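For readers unfamiliar with the package, basic snnTorch usage looks like the sketch below, which follows the library's documented Leaky neuron API; the IPU-specific optimizations the paper introduces are not shown.

```python
# Simulate a leaky integrate-and-fire neuron over time with snnTorch.
import torch
import snntorch as snn

lif = snn.Leaky(beta=0.9)        # membrane potential decay factor
mem = lif.init_leaky()           # initial membrane state
spikes = []
for _ in range(10):              # 10 simulation time steps
    cur = torch.rand(1)          # input current at this step
    spk, mem = lif(cur, mem)     # returns (spike output, updated membrane)
    spikes.append(spk)
print(torch.stack(spikes).squeeze())
```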
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Mixed Precision Low-bit Quantization of Neural Network Language Models for Speech Recognition [67.95996816744251]
State-of-the-art language models (LMs) represented by long-short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming increasingly complex and expensive for practical applications.
Current quantization methods are based on uniform precision and fail to account for the varying sensitivity of different parts of LMs to quantization errors.
Novel mixed precision neural network LM quantization methods are proposed in this paper.
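The core idea, illustrated below with a toy NumPy sketch, is that different parts of the model tolerate different bit-widths: measure how much each layer degrades under quantization and assign precision accordingly. The MSE proxy and the threshold rule are illustrative assumptions, not the paper's sensitivity criteria.

```python
# Toy mixed-precision assignment: give more bits to quantization-sensitive
# layers (illustrative sensitivity proxy, not the paper's criterion).
import numpy as np

def quantize(w, n_bits):
    """Uniform symmetric quantization."""
    scale = np.max(np.abs(w)) / (2 ** (n_bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
layers = {f"layer{i}": rng.standard_normal((64, 64)) * (i + 1) for i in range(3)}

for name, w in layers.items():
    err4 = np.mean((quantize(w, 4) - w) ** 2)   # damage at 4 bits
    bits = 8 if err4 > 5e-2 else 4              # illustrative threshold rule
    print(f"{name}: 4-bit MSE {err4:.2e} -> assign {bits} bits")
```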
arXiv Detail & Related papers (2021-11-29T12:24:02Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to large parameter capacity, but also lead to huge computation cost.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
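A conceptual sketch of the idea: partition an FFN's hidden neurons into expert groups of unchanged total size, then evaluate only the groups a router selects. The contiguous partitioning and the magnitude-based router below are simplifying assumptions, not the paper's construction.

```python
# Conceptual MoE-style split of a dense FFN (illustrative, not the paper's
# construction): neurons are grouped into experts and only top-k groups run.
import torch

d_model, hidden, n_experts, top_k = 128, 512, 8, 2
w_in = torch.randn(hidden, d_model) / d_model ** 0.5
w_out = torch.randn(d_model, hidden) / hidden ** 0.5
groups = torch.arange(hidden).chunk(n_experts)   # neuron indices per expert

def moefied_ffn(x):
    pre = x @ w_in.T                                      # (batch, hidden)
    # Magnitude-based router (illustrative): score each expert group.
    scores = torch.stack([pre[:, g].abs().mean(dim=1) for g in groups], dim=1)
    top = scores.topk(top_k, dim=1).indices               # experts per sample
    out = torch.zeros(x.shape[0], d_model)
    for b in range(x.shape[0]):
        for e in top[b]:                                  # run selected experts only
            g = groups[e]
            out[b] += torch.relu(pre[b, g]) @ w_out[:, g].T
    return out

print(moefied_ffn(torch.randn(4, d_model)).shape)  # torch.Size([4, 128])
```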
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- TxSim: Modeling Training of Deep Neural Networks on Resistive Crossbar Systems [3.1887081453726136]
Crossbar-based computations face a major challenge due to a variety of device- and circuit-level non-idealities.
We propose TxSim, a fast and customizable modeling framework to functionally evaluate DNN training on crossbar-based hardware.
arXiv Detail & Related papers (2020-02-25T19:29:43Z)
- CSM-NN: Current Source Model Based Logic Circuit Simulation -- A Neural Network Approach [5.365198933008246]
CSM-NN is a scalable simulation framework with optimized neural network structures and processing algorithms.
Experiments show that CSM-NN reduces the simulation time by up to $6\times$ compared to a state-of-the-art current source model based simulator running on a CPU.
CSM-NN also provides high accuracy levels, with less than $2\%$ error, compared to HSPICE.
arXiv Detail & Related papers (2020-02-13T00:29:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.