A Co-design view of Compute in-Memory with Non-Volatile Elements for
Neural Networks
- URL: http://arxiv.org/abs/2206.08735v1
- Date: Fri, 3 Jun 2022 15:59:46 GMT
- Title: A Co-design view of Compute in-Memory with Non-Volatile Elements for
Neural Networks
- Authors: Wilfried Haensch, Anand Raghunathan, Kaushik Roy, Bhaswar Chakrabarti,
Charudatta M. Phatak, Cheng Wang and Supratik Guha
- Abstract summary: We discuss how compute-in-memory can play an important part in the next generation of computing hardware.
A non-volatile memory based cross-bar architecture forms the heart of an engine that uses an analog process to parallelize the matrix vector multiplication operation.
The cross-bar architecture, at times referred to as a neuromorphic approach, can be a key hardware element in future computing machines.
- Score: 12.042322495445196
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning neural networks are pervasive, but traditional computer
architectures are reaching the limits of being able to efficiently execute them
for the large workloads of today. They are limited by the von Neumann
bottleneck: the high cost in energy and latency incurred in moving data between
memory and the compute engine. Today, special CMOS designs address this
bottleneck. The next generation of computing hardware will need to eliminate or
dramatically mitigate this bottleneck. We discuss how compute-in-memory can
play an important part in this development. Here, a non-volatile memory based
cross-bar architecture forms the heart of an engine that uses an analog process
to parallelize the matrix vector multiplication operation, repeatedly used in
all neural network workloads. The cross-bar architecture, at times referred to
as a neuromorphic approach, can be a key hardware element in future computing
machines. In the first part of this review we take a co-design view of the
design constraints and the demands they place on the new materials and memory
devices that anchor the cross-bar architecture. In the second part, we review
what is known about the different new non-volatile memory materials and devices
suited for compute-in-memory, and discuss the outlook and challenges.
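To make the abstract's central operation concrete, the sketch below (an illustration added here, not code from the review) models an idealized non-volatile crossbar performing an analog matrix-vector multiply: weights are mapped onto device conductances, input activations are applied as row voltages, and Ohm's law plus Kirchhoff's current law yield all column currents in a single read step. The differential weight encoding, conductance window, number of programmable levels, and read-noise level are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def ideal_crossbar_mvm(weights, activations, g_min=1e-6, g_max=1e-4,
                       levels=256, read_noise_sigma=0.02, rng=None):
    """Toy model of one analog matrix-vector multiply on an NVM crossbar.

    weights:     (rows, cols) weight matrix mapped onto device conductances.
    activations: (rows,) input vector applied as row voltages.
    g_min, g_max, levels, read_noise_sigma: assumed device parameters
    (conductance window in siemens, programmable levels, relative read noise).
    """
    rng = np.random.default_rng() if rng is None else rng

    # Differential encoding (a common mapping assumption): each signed weight
    # is split across a (G_plus, G_minus) device pair, w ~ G_plus - G_minus.
    w_max = float(np.max(np.abs(weights))) or 1.0
    scale = (g_max - g_min) / w_max
    g_plus = g_min + scale * np.clip(weights, 0.0, None)
    g_minus = g_min + scale * np.clip(-weights, 0.0, None)

    # Quantize to the finite number of conductance levels the device supports.
    step = (g_max - g_min) / (levels - 1)
    quantize = lambda g: g_min + np.round((g - g_min) / step) * step
    g_plus, g_minus = quantize(g_plus), quantize(g_minus)

    # Multiplicative read noise, drawn independently per device and per read.
    noisy = lambda g: g * (1.0 + read_noise_sigma * rng.standard_normal(g.shape))

    # Ohm's law per device and Kirchhoff's current law per column: each
    # column current is a dot product of the row voltages with a conductance
    # column, and all columns are produced in parallel in one read step.
    i_plus = activations @ noisy(g_plus)
    i_minus = activations @ noisy(g_minus)

    # Convert differential column currents back to the weight domain.
    return (i_plus - i_minus) / scale


# Usage: compare the analog estimate with an exact digital matrix-vector product.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((128, 64))
x = rng.standard_normal(128)
exact = x @ W
approx = ideal_crossbar_mvm(W, x, rng=rng)
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

Lowering `levels` or raising `read_noise_sigma` in this toy model mimics the device non-idealities that the co-design analysis in the first part of the review has to budget for.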
Related papers
- Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference [2.9302211589186244]
Large language models (LLMs) have transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations.
Developments in computing and memory capabilities are lagging behind, exacerbated by the discontinuation of Moore's law.
Compute-in-memory (CIM) technologies offer a promising solution for accelerating AI inference by directly performing analog computations in memory.
arXiv Detail & Related papers (2024-06-12T16:57:58Z)
- Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z)
- Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, still fall short in speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z)
- AI and Memory Wall [81.06494558184049]
We show how memory bandwidth can become the dominant bottleneck for decoder models.
We argue for a redesign in model architecture, training, and deployment strategies to overcome this memory limitation.
arXiv Detail & Related papers (2024-03-21T04:31:59Z)
- Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267]
We present a framework to tackle the memory explosion problem using memory replay techniques.
PDGNNs with Topology-aware Embedding Memory (TEM) significantly outperform state-of-the-art techniques.
arXiv Detail & Related papers (2024-01-24T03:03:17Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Robust High-dimensional Memory-augmented Neural Networks [13.82206983716435]
Memory-augmented neural networks enhance neural networks with an explicit memory to overcome these issues.
Access to this explicit memory occurs via soft read and write operations involving every individual memory entry.
We propose a robust architecture that employs a computational memory unit as the explicit memory performing analog in-memory computation on high-dimensional (HD) vectors.
arXiv Detail & Related papers (2020-10-05T12:01:56Z)
- In-memory Implementation of On-chip Trainable and Scalable ANN for AI/ML Applications [0.0]
This paper presents an in-memory computing architecture for ANN enabling artificial intelligence (AI) and machine learning (ML) applications.
Our novel on-chip training and inference in-memory architecture reduces energy cost and enhances throughput by simultaneously accessing multiple rows of the array per precharge cycle.
The proposed architecture was trained and tested on the IRIS dataset and is $46\times$ more energy efficient per MAC (multiply-and-accumulate) operation than earlier classifiers.
arXiv Detail & Related papers (2020-05-19T15:36:39Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is demonstrated through simulations of Boston house-price prediction and of training a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)