Reliability-Aware Deployment of DNNs on In-Memory Analog Computing
Architectures
- URL: http://arxiv.org/abs/2211.00590v1
- Date: Sun, 2 Oct 2022 01:43:35 GMT
- Title: Reliability-Aware Deployment of DNNs on In-Memory Analog Computing
Architectures
- Authors: Md Hasibul Amin, Mohammed Elbtity, Ramtin Zand
- Abstract summary: In-Memory Analog Computing (IMAC) circuits remove the need for signal converters by realizing both MVM and NLV operations in the analog domain.
We introduce a practical approach to deploy large matrices in deep neural networks (DNNs) onto multiple smaller IMAC subarrays to alleviate the impacts of noise and parasitics.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional in-memory computing (IMC) architectures consist of analog
memristive crossbars to accelerate matrix-vector multiplication (MVM), and
digital functional units to realize nonlinear vector (NLV) operations in deep
neural networks (DNNs). These designs, however, require energy-hungry signal
conversion units which can dissipate more than 95% of the total power of the
system. In-Memory Analog Computing (IMAC) circuits, on the other hand, remove
the need for signal converters by realizing both MVM and NLV operations in the
analog domain, leading to significant energy savings. However, they are more
susceptible to reliability challenges such as interconnect parasitics and noise.
Here, we introduce a practical approach to deploy large matrices in DNNs onto
multiple smaller IMAC subarrays to alleviate the impacts of noise and
parasitics while keeping the computation in the analog domain.
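
As a functional illustration of this deployment scheme (a software sketch only; the 256x256 layer size, the 64x64 subarray size, and the absence of noise modeling are assumptions, not details from the paper), a large MVM can be tiled across several smaller subarrays and the partial results accumulated:

```python
import numpy as np

def partitioned_mvm(W, x, tile_rows=64, tile_cols=64):
    """Functional sketch of mapping a large MVM onto smaller subarrays.

    W is split into tile_rows x tile_cols blocks; each block plays the role of
    one IMAC subarray. Partial products along the input dimension are summed
    to form the final output. Analog noise and parasitics are not modeled.
    """
    n_out, n_in = W.shape
    y = np.zeros(n_out)
    for r in range(0, n_out, tile_rows):
        for c in range(0, n_in, tile_cols):
            W_tile = W[r:r + tile_rows, c:c + tile_cols]   # one subarray's weights
            x_tile = x[c:c + tile_cols]                     # matching input slice
            y[r:r + tile_rows] += W_tile @ x_tile           # accumulate partial MVM
    return y

# Example: a 256x256 layer deployed on 64x64 subarrays (sizes assumed).
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
x = rng.standard_normal(256)
assert np.allclose(partitioned_mvm(W, x), W @ x)
```

In the paper's IMAC fabric the accumulation and the subsequent nonlinear activation stay in the analog domain; the sketch above only checks the tiling arithmetic.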
Related papers
- Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data.
Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy.
This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
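
A toy sketch of the multi-level spike idea, carrying a few payload bits per spike (uniform quantization and the 2-bit payload are assumptions; this summary does not specify the paper's encoder):

```python
import numpy as np

def encode_multilevel(activation, bits=2):
    """Quantize an activation in [0, 1) to a small integer spike payload.

    With bits=1 this reduces to a plain binary spike; with bits>1 each spike
    carries one of 2**bits graded levels. Uniform quantization is assumed.
    """
    levels = 2 ** bits
    return int(np.clip(np.floor(activation * levels), 0, levels - 1))

def decode_multilevel(payload, bits=2):
    """Map the integer payload back to the center of its quantization bin."""
    levels = 2 ** bits
    return (payload + 0.5) / levels

a = 0.63
spike = encode_multilevel(a, bits=2)       # -> 2, one of 4 levels
print(spike, decode_multilevel(spike))     # reconstruction error bounded by 1/8
```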
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
- ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks [2.9699290794642366]
ARTEMIS is a mixed analog-stochastic in-DRAM accelerator for transformer models.
Our analysis indicates that ARTEMIS exhibits at least 3.0x speedup, 1.8x lower energy, and 1.9x better energy efficiency compared to GPU, TPU, CPU, and state-of-the-art PIM transformer hardware accelerators.
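
This summary does not detail ARTEMIS's stochastic datapath; the snippet below only illustrates the classic stochastic-computing primitive that such analog-stochastic designs build on, where a single AND gate multiplies two independent unipolar bitstreams (the bitstream length is assumed):

```python
import numpy as np

rng = np.random.default_rng(1)

def to_bitstream(p, n_bits=4096):
    """Unipolar stochastic encoding: a value p in [0, 1] becomes a Bernoulli(p) bitstream."""
    return rng.random(n_bits) < p

a, b = 0.6, 0.25
# Bitwise AND of two independent bitstreams has mean approximately a * b.
product_stream = to_bitstream(a) & to_bitstream(b)
print(product_stream.mean())   # ~0.15
```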
arXiv Detail & Related papers (2024-07-17T15:08:14Z)
- Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
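
As a purely software analogy for reconstructing a signal from sparse inputs (the toy signal, the sinusoidal basis, and the least-squares fit are assumptions and do not reflect the paper's resistive-memory hardware):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy target (assumed for illustration): a band-limited 1D signal.
signal = lambda t: np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)
t_sparse = np.sort(rng.random(40))                  # 40 irregular samples
t_dense = np.linspace(0, 1, 400)

# A small sinusoidal basis plays the role of the continuous representation;
# least squares recovers its coefficients from the sparse measurements.
def basis(t, k_max=10):
    k = np.arange(1, k_max + 1)
    return np.hstack([np.ones((t.size, 1)),
                      np.cos(2 * np.pi * np.outer(t, k)),
                      np.sin(2 * np.pi * np.outer(t, k))])

coef, *_ = np.linalg.lstsq(basis(t_sparse), signal(t_sparse), rcond=None)
recon = basis(t_dense) @ coef
print(np.max(np.abs(recon - signal(t_dense))))      # small reconstruction error
```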
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
- RACE-IT: A Reconfigurable Analog CAM-Crossbar Engine for In-Memory Transformer Acceleration [21.196696191478885]
Transformer models represent the cutting edge of Deep Neural Networks (DNNs), but processing these models demands significant computational resources and results in a substantial memory footprint.
We introduce a novel Analog Content Addressable Memory (ACAM) structure capable of performing various non-MVM operations within Transformers.
arXiv Detail & Related papers (2023-11-29T22:45:39Z)
- ADC/DAC-Free Analog Acceleration of Deep Neural Networks with Frequency Transformation [2.7488316163114823]
This paper proposes a novel approach to energy-efficient acceleration of frequency-domain neural networks by utilizing analog-domain frequency-based tensor transformations.
Our approach achieves more compact cells by eliminating the need for trainable parameters in the transformation matrix.
On a 16×16 crossbar, for 8-bit input processing, the proposed approach achieves an energy efficiency of 1602 tera operations per second per watt.
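
This summary does not spell out the transformation, but the standard frequency-domain identity such approaches rely on is that circular convolution becomes elementwise multiplication under a fixed DFT, which is why the transform matrix itself needs no trainable parameters (the vector length below is assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(16)        # input vector
k = rng.standard_normal(16)        # convolution kernel (the only learned part)

# Circular convolution computed directly in the "spatial" domain...
direct = np.array([sum(k[j] * x[(i - j) % 16] for j in range(16)) for i in range(16)])

# ...and via the frequency domain: the DFT is a fixed transform, so the
# transformation matrix carries no trainable parameters.
freq = np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)).real

assert np.allclose(direct, freq)
```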
arXiv Detail & Related papers (2023-09-04T19:19:39Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
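
RWKV's actual WKV recurrence is more involved, but the underlying property, that a decaying linear recurrence can be evaluated either sequentially (RNN-style inference) or over the whole sequence at once (parallel training), can be shown with a toy example (the decay factor and sequence length are assumptions):

```python
import numpy as np

lam, T = 0.9, 12
x = np.random.default_rng(4).standard_normal(T)

# Recurrent (inference-style): constant-size state, one step per token.
h, seq = 0.0, []
for t in range(T):
    h = lam * h + x[t]
    seq.append(h)

# Parallel (training-style): the same values for the whole sequence at once.
t_idx = np.arange(T)
par = lam ** t_idx * np.cumsum(x / lam ** t_idx)

assert np.allclose(seq, par)
```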
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments.
It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications.
We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
- Over-the-Air Split Machine Learning in Wireless MIMO Networks [56.27831295707334]
In split machine learning (ML), different partitions of a neural network (NN) are executed by different computing nodes.
To ease communication burden, over-the-air computation (OAC) can efficiently implement all or part of the computation at the same time of communication.
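
A bare-bones numerical picture of over-the-air computation, where the superposition of simultaneously transmitted, pre-equalized signals directly yields the desired sum at the receiver (real scalar channel gains and perfect channel knowledge are assumptions; the paper addresses the MIMO setting):

```python
import numpy as np

rng = np.random.default_rng(5)
values = rng.standard_normal(8)           # one partial result per node (count assumed)
gains = rng.uniform(0.5, 1.5, size=8)     # per-node channel gains, known to the nodes

tx = values / gains                        # each node pre-equalizes its own channel
received = np.sum(gains * tx) + rng.normal(scale=1e-3)   # superposition plus receiver noise

print(received, values.sum())              # the channel itself computed the sum
```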
arXiv Detail & Related papers (2022-10-07T15:39:11Z)
- Neural-PIM: Efficient Processing-In-Memory with Neural Approximation of Peripherals [11.31429464715989]
This paper presents a new PIM architecture to efficiently accelerate deep learning tasks.
The architecture minimizes the required A/D conversions through analog accumulation and neural-approximated peripheral circuits.
Evaluations on different benchmarks demonstrate that Neural-PIM can improve energy efficiency by 5.36x (1.73x) and speed up throughput by 3.43x (1.59x) without losing accuracy.
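
As a loose software analogy for neural-approximated peripherals (the target transfer curve, network size, and training setup are all assumptions), a tiny network can be fit to stand in for a nonlinear peripheral transfer function:

```python
import numpy as np

rng = np.random.default_rng(6)

# Target: a sigmoid-like transfer curve standing in for an analog peripheral
# (activation plus conversion). The specific curve is an assumption.
x = np.linspace(-4, 4, 256).reshape(-1, 1)
y = 1.0 / (1.0 + np.exp(-2.0 * x))

# Tiny one-hidden-layer ReLU network trained by plain gradient descent.
W1, b1 = rng.normal(0, 0.5, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 1)), np.zeros(1)
lr = 0.05
for _ in range(5000):
    h = np.maximum(x @ W1 + b1, 0.0)         # hidden ReLU layer
    pred = h @ W2 + b2
    err = pred - y
    # Backpropagation by hand for the two layers.
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

print(np.max(np.abs(pred - y)))              # approximation error over the sampled range
```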
arXiv Detail & Related papers (2022-01-30T16:14:49Z)
- Interconnect Parasitics and Partitioning in Fully-Analog In-Memory Computing Architectures [0.0]
We investigate the effect of wire parasitic resistance and capacitance on the accuracy of deep neural network (DNN) models deployed on fully-analog IMC architectures.
We propose a partitioning mechanism to alleviate the impact of the parasitic while keeping the computation in the analog domain.
It is shown that accuracy benefits are achieved at the cost of higher power consumption due to the extra circuitry required for handling partitioning.
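
A crude first-order sketch of why wire parasitics matter (the attenuation model and its coefficient are assumptions, not the paper's circuit model): the voltage reaching a cell decays with its distance along the line, so MVM error grows with array size, which is what motivates partitioning into smaller subarrays:

```python
import numpy as np

rng = np.random.default_rng(7)

def mvm_with_wire_drop(W, v, alpha=0.002):
    """First-order parasitic model (assumed): the voltage seen by a cell
    decays with its distance from the word-line driver."""
    n_out, n_in = W.shape
    drop = 1.0 / (1.0 + alpha * np.arange(n_in))   # attenuation along the line
    return W @ (v * drop)

def relative_error(n, alpha=0.002):
    W = rng.random((n, n))
    v = rng.random(n)
    ideal = W @ v
    return np.linalg.norm(mvm_with_wire_drop(W, v, alpha) - ideal) / np.linalg.norm(ideal)

for n in (32, 128, 512):
    print(n, relative_error(n))   # error grows with array size, motivating partitioning
```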
arXiv Detail & Related papers (2022-01-29T02:29:27Z)
- AnalogNets: ML-HW Co-Design of Noise-robust TinyML Models and Always-On Analog Compute-in-Memory Accelerator [50.31646817567764]
This work describes TinyML models for the popular always-on applications of keyword spotting (KWS) and visual wake words (VWW).
We detail a comprehensive training methodology, to retain accuracy in the face of analog non-idealities.
We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator.
arXiv Detail & Related papers (2021-11-10T10:24:46Z)
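
A minimal sketch of the noise-injection idea behind the noise-aware training described in the AnalogNets entry above (the multiplicative Gaussian weight-noise model and its 5% level are assumptions; the paper describes a fuller methodology):

```python
import numpy as np

rng = np.random.default_rng(8)

def noisy_forward(W, x, sigma=0.05):
    """Forward pass with multiplicative weight noise emulating analog
    conductance variation (the 5% level is an assumption, not from the paper)."""
    W_noisy = W * (1.0 + rng.normal(0.0, sigma, size=W.shape))
    return np.maximum(W_noisy @ x, 0.0)      # ReLU layer on a perturbed crossbar

# During noise-aware training, the same perturbation is applied in every forward
# pass so the learned weights become tolerant to it at deployment time.
W = rng.standard_normal((10, 64)) / 8.0
x = rng.standard_normal(64)
print(np.linalg.norm(noisy_forward(W, x) - np.maximum(W @ x, 0.0)))
```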