Optimizing for In-memory Deep Learning with Emerging Memory Technology
- URL: http://arxiv.org/abs/2112.00324v1
- Date: Wed, 1 Dec 2021 07:39:18 GMT
- Title: Optimizing for In-memory Deep Learning with Emerging Memory Technology
- Authors: Zhehui Wang, Tao Luo, Rick Siow Mong Goh, Wei Zhang, Weng-Fai Wong
- Abstract summary: In-memory deep learning has already demonstrated orders of magnitude higher performance density and energy efficiency.
The use of emerging memory technology promises to increase the gains in density, energy, and performance even further.
However, emerging memory technology is intrinsically unstable, resulting in random fluctuations of data reads.
This can translate to non-negligible accuracy loss, potentially nullifying the gains.
- Score: 10.176832742078991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In-memory deep learning computes neural network models where they are stored,
thus avoiding long distance communication between memory and computation units,
resulting in considerable savings in energy and time. In-memory deep learning
has already demonstrated orders of magnitude higher performance density and
energy efficiency. The use of emerging memory technology promises to increase
the gains in density, energy, and performance even further. However, emerging
memory technology is intrinsically unstable, resulting in random fluctuations
of data reads. This can translate to non-negligible accuracy loss, potentially
nullifying the gains. In this paper, we propose three optimization techniques
that can mathematically overcome the instability problem of emerging memory
technology. They can improve the accuracy of the in-memory deep learning model
while maximizing its energy efficiency. Experiments show that our solution can
fully recover most models' state-of-the-art accuracy, and achieve at least an
order of magnitude higher energy efficiency than the state-of-the-art.
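The abstract does not spell out the three optimization techniques, but the underlying problem, namely read instability that perturbs stored weights at every access, can be made concrete with a toy sketch. The PyTorch snippet below models each weight read as the stored value plus Gaussian noise, and applies the common noise-aware-training mitigation of injecting that same noise during training. The noise model, its magnitude, and the layer sizes are illustrative assumptions rather than values or methods taken from the paper.

```python
# Illustrative sketch only: emerging-memory read instability is modeled as
# Gaussian noise added to stored weights on every read. The noise model and
# its magnitude are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer whose weights are perturbed on every read,
    mimicking unstable in-memory weight storage."""

    def __init__(self, in_features, out_features, read_noise_std=0.05):
        super().__init__(in_features, out_features)
        self.read_noise_std = read_noise_std

    def forward(self, x):
        # Each forward pass "reads" the weights with fresh random noise.
        noisy_w = self.weight + self.read_noise_std * torch.randn_like(self.weight)
        return nn.functional.linear(x, noisy_w, self.bias)

def build_model(noise_std):
    return nn.Sequential(
        NoisyLinear(16, 32, noise_std), nn.ReLU(),
        NoisyLinear(32, 2, noise_std),
    )

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(256, 16)
    y = (x[:, 0] > 0).long()  # toy binary task

    # Noise-aware training: the same read noise is active during training,
    # so the learned weights become robust to it. This is a generic
    # mitigation for illustration, not the paper's specific techniques.
    model = build_model(noise_std=0.05)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(200):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

    with torch.no_grad():
        acc = (model(x).argmax(1) == y).float().mean().item()
    print(f"accuracy under simulated read noise: {acc:.3f}")
```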
Related papers
- Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems [54.045712360156024]
Racetrack memory is a non-volatile technology that allows high data density fabrication.
In-memory arithmetic circuits built with memory cells affect both memory density and power efficiency.
We present an efficient in-memory convolutional neural network (CNN) accelerator optimized for use with racetrack memory.
arXiv Detail & Related papers (2025-07-02T07:29:53Z) - Breaking Memory Limits: Gradient Wavelet Transform Enhances LLMs Training [45.225732322141994]
Large language models (LLMs) have impressive performance across a range of natural language processing tasks.
Their vast number of parameters introduces significant memory challenges during training.
Existing memory-efficient algorithms often rely on techniques such as singular value decomposition projection or weight freezing.
We propose a novel solution called Gradient Wavelet Transform (GWT), which applies wavelet transforms to gradients in order to significantly reduce the memory requirements (an illustrative sketch appears after this list).
arXiv Detail & Related papers (2025-01-13T11:35:09Z) - Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition [93.98343072306619]
We present Tensor-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights.
Across various PDE tasks, Tensor-GaLore achieves substantial memory savings, reducing memory usage by up to 75%.
arXiv Detail & Related papers (2025-01-04T20:51:51Z) - Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-horizon.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - Topology Optimization of Random Memristors for Input-Aware Dynamic SNN [44.38472635536787]
We introduce pruning optimization for an input-aware dynamic memristive spiking neural network (PRIME).
Signal representation-wise, PRIME employs leaky integrate-and-fire neurons to emulate the brain's inherent spiking mechanism.
For reconfigurability, inspired by the brain's dynamic adjustment of computational depth, PRIME employs an input-aware dynamic early stop policy.
arXiv Detail & Related papers (2024-07-26T09:35:02Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, are still deficient in terms of speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z) - Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design achieves substantial energy efficiency improvements and training cost reductions compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Think Before You Act: Decision Transformers with Working Memory [44.18926449252084]
Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks.
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in parameters throughout training.
We propose a working memory module to store, blend, and retrieve information for different downstream tasks.
arXiv Detail & Related papers (2023-05-24T01:20:22Z) - A Brain-inspired Memory Transformation based Differentiable Neural Computer for Reasoning-based Question Answering [3.036382664997076]
Reasoning and question answering, as basic cognitive functions for humans, remain a great challenge for current artificial intelligence.
Motivated by the learning and memory mechanisms of the brain, this paper proposes a Memory Transformation based Differentiable Neural Computer (MT-DNC) model.
arXiv Detail & Related papers (2023-01-07T08:39:57Z) - More Is Better: An Analysis of Instance Quantity/Quality Trade-off in Rehearsal-based Continual Learning [3.9596068699962315]
Continual learning has come to focus on addressing the stability-plasticity dilemma of connectionist systems.
We propose an analysis of the memory quantity/quality trade-off, adopting various data reduction approaches to increase the number of instances storable in memory.
Our findings suggest that the optimal trade-off is severely skewed toward instance quantity, where rehearsal approaches with several heavily compressed instances easily outperform state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-28T21:05:51Z) - Schematic Memory Persistence and Transience for Efficient and Robust Continual Learning [8.030924531643532]
Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI).
However, it remains quite primitive, with existing works focusing primarily on avoiding (catastrophic) forgetting.
We propose a novel framework for continual learning with external memory that builds on recent advances in neuroscience.
arXiv Detail & Related papers (2021-05-05T14:32:47Z) - Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER)
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) are heavily parameterized, requiring external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
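Among the related papers above, the Gradient Wavelet Transform (GWT) entry describes the most concrete mechanism: applying a wavelet transform to gradients so that optimizer state can be kept in a smaller, transformed domain. The sketch below is a rough, hypothetical illustration under simplifying assumptions (a single-level Haar transform, keeping only the low-frequency half of each flattened gradient, and an Adam-style update whose moments live in the compressed domain). Function names such as haar_compress are invented for this sketch; the published GWT algorithm may differ in its details.

```python
# Rough illustration of compressing gradients with a single-level Haar wavelet
# transform so optimizer state is stored at half size. The "keep only the
# low-frequency coefficients" choice is an assumption for this sketch, not the
# published GWT algorithm.
import numpy as np

def haar_compress(grad):
    """Single-level Haar transform; return the low-frequency (approximation) half."""
    g = grad.reshape(-1)
    if g.size % 2:                       # pad to even length
        g = np.concatenate([g, g[-1:]])
    pairs = g.reshape(-1, 2)
    return (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # half the original size

def haar_expand(approx, shape):
    """Invert the transform assuming the detail coefficients are zero."""
    up = np.repeat(approx / np.sqrt(2.0), 2)
    return up[: int(np.prod(shape))].reshape(shape)

class CompressedAdam:
    """Adam-style update whose first/second moments live in the compressed domain."""

    def __init__(self, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = self.v = None
        self.t = 0

    def step(self, param, grad):
        c = haar_compress(grad)          # optimizer state is half-sized
        if self.m is None:
            self.m = np.zeros_like(c)
            self.v = np.zeros_like(c)
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * c
        self.v = self.b2 * self.v + (1 - self.b2) * c * c
        m_hat = self.m / (1 - self.b1 ** self.t)
        v_hat = self.v / (1 - self.b2 ** self.t)
        update_c = m_hat / (np.sqrt(v_hat) + self.eps)
        return param - self.lr * haar_expand(update_c, param.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64))
    opt = CompressedAdam(lr=1e-2)
    for _ in range(10):
        grad = 2 * w                     # gradient of ||w||^2 as a toy objective
        w = opt.step(w, grad)
    print("toy loss after 10 steps:", float((w ** 2).sum()))
```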