An In-Memory Analog Computing Co-Processor for Energy-Efficient CNN
Inference on Mobile Devices
- URL: http://arxiv.org/abs/2105.13904v1
- Date: Mon, 24 May 2021 23:01:36 GMT
- Title: An In-Memory Analog Computing Co-Processor for Energy-Efficient CNN
Inference on Mobile Devices
- Authors: Mohammed Elbtity, Abhishek Singh, Brendan Reidy, Xiaochen Guo, Ramtin
Zand
- Abstract summary: We develop an in-memory analog computing (IMAC) architecture realizing both synaptic behavior and activation functions within non-volatile memory arrays.
Spin-orbit torque magnetoresistive random-access memory (SOT-MRAM) devices are leveraged to realize sigmoidal neurons as well as binarized synapses.
A heterogeneous mixed-signal and mixed-precision CPU-IMAC architecture is proposed for convolutional neural networks (CNNs) inference on mobile processors.
- Score: 4.117012092777604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop an in-memory analog computing (IMAC) architecture
realizing both synaptic behavior and activation functions within non-volatile
memory arrays. Spin-orbit torque magnetoresistive random-access memory
(SOT-MRAM) devices are leveraged to realize sigmoidal neurons as well as
binarized synapses. First, it is shown the proposed IMAC architecture can be
utilized to realize a multilayer perceptron (MLP) classifier achieving orders
of magnitude performance improvement compared to previous mixed-signal and
digital implementations. Next, a heterogeneous mixed-signal and mixed-precision
CPU-IMAC architecture is proposed for convolutional neural networks (CNNs)
inference on mobile processors, in which IMAC is designed as a co-processor to
realize fully-connected (FC) layers whereas convolution layers are executed in
CPU. Architecture-level analytical models are developed to evaluate the
performance and energy consumption of the CPU-IMAC architecture. Simulation
results exhibit 6.5% and 10% energy savings for CPU-IMAC based realizations of
LeNet and VGG CNN models, for MNIST and CIFAR-10 pattern recognition tasks,
respectively.
Related papers
- A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures [73.65190161312555]
ARCANA is a spiking neural network simulator designed to account for the properties of mixed-signal neuromorphic circuits.
We show how the results obtained provide a reliable estimate of the behavior of the spiking neural network trained in software.
arXiv Detail & Related papers (2024-09-23T11:16:46Z) - GPU-RANC: A CUDA Accelerated Simulation Framework for Neuromorphic Architectures [1.3401966602181168]
We introduce the GPU-based implementation of Reconfigurable Architecture for Neuromorphic Computing (RANC)
We demonstrate up to 780 times speedup compared to serial version of the RANC simulator based on a 512 neuromorphic core MNIST inference application.
arXiv Detail & Related papers (2024-04-24T21:08:21Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z) - Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability.
One promising solution is to revisit analogue computing, a technique that predates digital computing.
Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z) - CIMulator: A Comprehensive Simulation Platform for Computing-In-Memory
Circuit Macros with Low Bit-Width and Real Memory Materials [0.5325753548715747]
This paper presents a simulation platform, namely CIMulator, for quantifying the efficacy of various synaptic devices in neuromorphic accelerators.
Non-volatile memory devices, such as resistive random-access memory, ferroelectric field-effect transistor, and volatile static random-access memory devices, can be selected as synaptic devices.
A multilayer perceptron and convolutional neural networks (CNNs), such as LeNet-5, VGG-16, and a custom CNN named C4W-1, are simulated to evaluate the effects of these synaptic devices on the training and inference outcomes.
arXiv Detail & Related papers (2023-06-26T12:36:07Z) - Heterogeneous Integration of In-Memory Analog Computing Architectures
with Tensor Processing Units [0.0]
This paper introduces a novel, heterogeneous, mixed-signal, and mixed-precision architecture that integrates an IMAC unit with an edge TPU to enhance mobile CNN performance.
We propose a unified learning algorithm that incorporates mixed-precision training techniques to mitigate potential accuracy drops when deploying models on the TPU-IMAC architecture.
arXiv Detail & Related papers (2023-04-18T19:44:56Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - A Heterogeneous In-Memory Computing Cluster For Flexible End-to-End
Inference of Real-World Deep Neural Networks [12.361842554233558]
Deployment of modern TinyML tasks on small battery-constrained IoT devices requires high computational energy efficiency.
Analog In-Memory Computing (IMC) using non-volatile memory (NVM) promises major efficiency improvements in deep neural network (DNN) inference.
We present a heterogeneous tightly-coupled architecture integrating 8 RISC-V cores, an in-memory computing accelerator (IMA), and digital accelerators.
arXiv Detail & Related papers (2022-01-04T11:12:01Z) - A Single-Cycle MLP Classifier Using Analog MRAM-based Neurons and
Synapses [0.0]
MRAM devices are leveraged to realize sigmoidal neurons and binarized synapses for a single-cycle analog in-memory computing architecture.
An analog SOT-MRAM-based neuron bitcell is proposed which achieves a 12x reduction in power-area-product.
An analog IMC architecture achieves at least two and four orders of magnitude performance improvement compared to a mixed-signal analog/digital IMC architecture.
arXiv Detail & Related papers (2020-12-04T16:04:32Z) - One-step regression and classification with crosspoint resistive memory
arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.