Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient in-DRAM Operations
- URL: http://arxiv.org/abs/2207.13358v8
- Date: Thu, 17 Oct 2024 11:57:05 GMT
- Title: Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient in-DRAM Operations
- Authors: Hasan Hassan, Ataberk Olgun, A. Giray Yaglikci, Haocong Luo, Onur Mutlu
- Abstract summary: We propose a new low-cost DRAM architecture, Self-Managing DRAM (SMD), that enables autonomous in-DRAM maintenance operations.
SMD transfers the responsibility for controlling maintenance operations from the memory controller to the DRAM chip itself.
We show that SMD can be implemented without adding new pins to the DDRx interface, with low latency and area overheads.
- Abstract: The memory controller is in charge of managing DRAM maintenance operations (e.g., refresh, RowHammer protection, memory scrubbing) to reliably operate modern DRAM chips. Implementing new maintenance operations often necessitates modifications in the DRAM interface, memory controller, and potentially other system components. Such modifications are only possible with a new DRAM standard, which takes a long time to develop, likely leading to slow progress in the adoption of new architectural techniques in DRAM chips. We propose a new low-cost DRAM architecture, Self-Managing DRAM (SMD), that enables autonomous in-DRAM maintenance operations by transferring the responsibility for controlling maintenance operations from the memory controller to the SMD chip. To enable autonomous maintenance operations, we make a single modification to the DRAM interface, such that an SMD chip rejects memory controller accesses to DRAM regions under maintenance, while allowing memory accesses to others. Thus, SMD enables 1) implementing new in-DRAM maintenance mechanisms (or modifying existing ones) with no further changes in the DRAM interface or other system components, and 2) overlapping the latency of a maintenance operation in one DRAM region with the latency of accessing data in another. We evaluate SMD and show that it 1) can be implemented without adding new pins to the DDRx interface with low latency and area overhead, 2) achieves 4.1% average speedup across 20 four-core memory-intensive workloads over a DDR4-based system/DRAM co-design technique that intelligently parallelizes maintenance operations with memory accesses, and 3) guarantees forward progress for rejected memory accesses. We believe and hope SMD can enable innovations in DRAM architecture to rapidly come to fruition. We open source all SMD source code and data at https://github.com/CMU-SAFARI/SelfManagingDRAM.
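To make the access-rejection mechanism concrete, here is a minimal Python sketch of the interaction the abstract describes: the SMD chip rejects accesses to a region under maintenance, the memory controller retries, and bounded maintenance latency guarantees forward progress. All class and function names are illustrative; this is a toy model, not the paper's hardware implementation.

```python
# Minimal sketch (illustrative names, not the paper's implementation):
# an SMD chip rejects accesses to a region under maintenance, and the
# memory controller retries until the access is served.

class SMDChip:
    def __init__(self, num_regions, maintenance_cycles):
        self.busy_until = [0] * num_regions     # cycle when maintenance ends
        self.maintenance_cycles = maintenance_cycles

    def start_maintenance(self, region, now):
        # Autonomous in-DRAM maintenance (e.g., refresh, scrubbing).
        self.busy_until[region] = now + self.maintenance_cycles

    def access(self, region, now):
        # Reject accesses to a region under maintenance; serve all others.
        return now >= self.busy_until[region]

def controller_read(chip, region, now, retry_interval=4):
    # Forward progress: maintenance latency is bounded, so a rejected
    # access succeeds after a bounded number of retries.
    while not chip.access(region, now):
        now += retry_interval                   # back off, then reissue
    return now                                  # cycle at which it is served

chip = SMDChip(num_regions=8, maintenance_cycles=100)
chip.start_maintenance(region=3, now=0)
print(controller_read(chip, region=5, now=0))   # 0: other regions unaffected
print(controller_read(chip, region=3, now=0))   # 100: served after maintenance
```

This also illustrates the overlap benefit the abstract claims: the access to region 5 completes immediately while region 3 is being maintained.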
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Preventing Rowhammer Exploits via Low-Cost Domain-Aware Memory Allocation [46.268703252557316]
Rowhammer is a hardware security vulnerability at the heart of every system with modern DRAM-based memory.
Citadel is a new memory allocator design that prevents Rowhammer-initiated security exploits.
Citadel supports thousands of security domains at a modest 7.4% average memory overhead and no performance loss; a toy allocation sketch appears below.
arXiv Detail & Related papers (2024-09-23T18:41:14Z)
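A toy sketch of the general idea behind domain-aware allocation, under the assumption that isolation is achieved by giving each security domain its own band of DRAM rows separated by guard rows; the band size, guard-row policy, and names are illustrative, not Citadel's actual design.

```python
class DomainAwareAllocator:
    """Toy sketch: give each security domain a contiguous band of DRAM
    rows, separated by guard rows, so Rowhammer aggressors in one
    domain cannot disturb victim rows of another domain."""

    def __init__(self, num_rows, band_rows=64, guard_rows=2):
        self.num_rows, self.band_rows, self.guard_rows = num_rows, band_rows, guard_rows
        self.bands = {}        # domain -> [start_row, rows_used]
        self.next_start = 0

    def alloc_row(self, domain):
        if domain not in self.bands:
            if self.next_start + self.band_rows > self.num_rows:
                raise MemoryError("no free bands")
            self.bands[domain] = [self.next_start, 0]
            # Leave guard rows between this band and the next domain's.
            self.next_start += self.band_rows + self.guard_rows
        start, used = self.bands[domain]
        if used == self.band_rows:
            raise MemoryError("band full")  # a real allocator would grow the band
        self.bands[domain][1] += 1
        return start + used

alloc = DomainAwareAllocator(num_rows=1 << 16)
a = alloc.alloc_row("domainA")   # row 0
b = alloc.alloc_row("domainB")   # row 66: a 64-row band plus 2 guard rows away
```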
- Enabling Efficient and Scalable DRAM Read Disturbance Mitigation via New Experimental Insights into Modern DRAM Chips [0.0]
Increasing storage density exacerbates DRAM read disturbance, a circuit-level vulnerability exploited by system-level attacks.
Existing defenses are either ineffective or prohibitively expensive.
This dissertation tackles two problems: 1) protecting DRAM-based systems becomes more expensive as technology scaling increases read disturbance vulnerability, and 2) many existing solutions depend on proprietary knowledge of DRAM internals.
arXiv Detail & Related papers (2024-08-27T13:12:03Z)
- PENDRAM: Enabling High-Performance and Energy-Efficient Processing of Deep Neural Networks through a Generalized DRAM Data Mapping Policy [6.85785397160228]
Convolutional Neural Networks (CNNs) have emerged as a state-of-the-art solution for machine learning tasks.
CNN accelerators face performance- and energy-efficiency challenges due to high off-chip memory (DRAM) access latency and energy.
We present PENDRAM, a novel design space exploration methodology that enables high-performance and energy-efficient CNN acceleration.
arXiv Detail & Related papers (2024-08-05T12:11:09Z)
- B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory [91.81390121042192]
We develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within a composable module.
B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences, tested up to 32K tokens; a toy contrast of the two memory types appears below.
arXiv Detail & Related papers (2024-07-08T18:41:01Z)
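The following toy snippet contrasts the two memory types B'MOJO combines: fading memory as an exponentially decaying state and eidetic memory as an append-only store of selected inputs. It illustrates the concepts only; it is not the paper's state-space architecture.

```python
import numpy as np

# Toy contrast of the two memory types (conceptual illustration only,
# not B'MOJO's actual hybrid state-space realization).

def fading_memory(xs, decay=0.9):
    """Exponentially decaying state: old inputs fade, recent ones dominate."""
    state = np.zeros_like(xs[0])
    for x in xs:
        state = decay * state + (1.0 - decay) * x
    return state

def eidetic_memory(xs, keep):
    """Append-only store of selected inputs: kept items never fade."""
    return [xs[i] for i in keep]

xs = [np.ones(4) * t for t in range(100)]
print(fading_memory(xs)[0])         # dominated by recent inputs
print(eidetic_memory(xs, [0, 50]))  # exact recall of chosen inputs
```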
- Understanding the Security Benefits and Overheads of Emerging Industry Solutions to DRAM Read Disturbance [6.637143975465625]
Per Row Activation Counting (PRAC) is a mitigation method described in the April 2024 update of the JEDEC DDR5 specification.
A back-off signal propagates from the DRAM chip to the memory controller.
RFM commands are issued only when needed, rather than periodically, reducing RFM's overheads; a minimal counter sketch follows below.
arXiv Detail & Related papers (2024-06-27T11:22:46Z)
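A minimal sketch of the counting-and-back-off idea: the chip keeps one activation counter per row, raises a back-off signal toward the memory controller when a counter crosses a threshold, and resets the counter when an RFM command refreshes potential victims. The threshold value and interface here are illustrative, not taken from the DDR5 specification.

```python
# Minimal sketch of per-row activation counting with a back-off signal
# (illustrative threshold and interface, not the actual spec).

class PracBank:
    def __init__(self, num_rows, threshold=512):
        self.count = [0] * num_rows   # one activation counter per row
        self.threshold = threshold
        self.backoff = False          # signal raised toward the controller

    def activate(self, row):
        self.count[row] += 1
        if self.count[row] >= self.threshold:
            # Row is close to causing read disturbance: ask the memory
            # controller to pause and issue an RFM command.
            self.backoff = True

    def rfm(self):
        # On RFM, the chip refreshes potential victims of the most
        # activated row and resets its counter.
        hot = max(range(len(self.count)), key=self.count.__getitem__)
        self.count[hot] = 0
        self.backoff = False

bank = PracBank(num_rows=1024)
for _ in range(512):
    bank.activate(7)
if bank.backoff:     # the controller sees the back-off signal...
    bank.rfm()       # ...and issues RFM only when actually needed
```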
- DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands [6.863346979406863]
This paper presents findings on the microarchitectures of commodity DRAM chips and their impacts on the characteristics of activate-induced bitflips (AIBs).
For accurate and efficient reverse-engineering, we use three tools: AIBs, retention time test, and RowCopy, which can be cross-validated.
We identify previously unknown AIB vulnerabilities and propose a simple yet effective protection solution.
arXiv Detail & Related papers (2024-05-03T22:10:21Z)
- RelayAttention for Efficient Large Language Model Serving with Long System Prompts [59.50256661158862]
This paper aims to improve the efficiency of LLM services that involve long system prompts.
Handling these system prompts requires heavily redundant memory accesses in existing causal attention algorithms.
We propose RelayAttention, an attention algorithm that allows reading hidden states from DRAM exactly once for a batch of input tokens; a sketch of the underlying decomposition follows below.
arXiv Detail & Related papers (2024-02-22T18:58:28Z)
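A minimal NumPy sketch of the decomposition that enables this (not the paper's GPU kernels): attention over the shared system prompt and over request-specific tokens is computed separately and merged exactly through the softmax normalizers, which is what lets a batched kernel read the prompt's key/value states from DRAM once per batch instead of once per request.

```python
import numpy as np

def relay_style_attention(q, Kp, Vp, Kr, Vr):
    """Exact attention over [shared prefix ; request tokens], computed
    as two pieces merged by their softmax normalizers. In a serving
    system the prefix piece becomes one batched matmul, so the shared
    KV is read from DRAM once per batch, not once per request."""
    lp = Kp @ q                      # logits against the shared system prompt
    lr = Kr @ q                      # logits against request-specific tokens
    m = max(lp.max(), lr.max())      # shared max for numerical stability
    ep, er = np.exp(lp - m), np.exp(lr - m)
    num = ep @ Vp + er @ Vr
    return num / (ep.sum() + er.sum())

# Cross-check against monolithic attention over the concatenated context:
rng = np.random.default_rng(0)
q = rng.normal(size=8)
Kp, Vp = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
Kr, Vr = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
K, V = np.vstack([Kp, Kr]), np.vstack([Vp, Vr])
w = np.exp(K @ q - (K @ q).max())
ref = (w @ V) / w.sum()
assert np.allclose(relay_style_attention(q, Kp, Vp, Kr, Vr), ref)
```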
- SparkXD: A Framework for Resilient and Energy-Efficient Spiking Neural Network Inference using Approximate DRAM [15.115813664357436]
Spiking Neural Networks (SNNs) have the potential for achieving low energy consumption due to their biologically sparse computation.
Several studies have shown that the off-chip memory (DRAM) accesses are the most energy-consuming operations in SNN processing.
We propose SparkXD, a novel framework that provides a conjoint solution for resilient and energy-efficient SNN inference on approximate DRAM; a small error-injection sketch follows below.
arXiv Detail & Related papers (2021-02-28T08:12:26Z)
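One common way to study resilience to approximate DRAM is to inject random bit flips into stored weights at a chosen bit-error rate and observe the effect on inference. The toy fault model below is illustrative only; it is not SparkXD's actual error model or mitigation.

```python
import numpy as np

# Toy model of approximate DRAM for resilience studies (illustrative;
# not SparkXD's actual fault model): flip random bits of the stored
# 32-bit weight words at a given bit-error rate.

def inject_dram_bitflips(weights, bit_error_rate, rng):
    raw = weights.astype(np.float32).view(np.uint32)    # raw 32-bit words
    flips = rng.random((raw.size, 32)) < bit_error_rate # per-bit error draws
    masks = np.zeros(raw.size, dtype=np.uint32)
    for bit in range(32):
        masks |= flips[:, bit].astype(np.uint32) << bit
    return (raw.reshape(-1) ^ masks).view(np.float32).reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(128, 128)).astype(np.float32)
w_faulty = inject_dram_bitflips(w, bit_error_rate=1e-5, rng=rng)
print(np.mean(w != w_faulty))   # fraction of corrupted weights
```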
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, which forces their weights into external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework that trades higher-cost memory storage/access for lower-cost computation; a toy sketch of this trade follows below.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
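A toy illustration of the storage-for-computation trade the summary describes: store a small basis plus power-of-two-quantized coefficients instead of the full weight matrix, and rebuild the weights on the fly. The SVD-based factorization and quantizer here are illustrative stand-ins, not SmartDeal's actual decomposition.

```python
import numpy as np

# Toy storage-for-compute trade (illustrative, not SmartDeal's actual
# decomposition): keep a small basis B and power-of-two-quantized
# coefficients C instead of the full weight matrix W, and rebuild
# W ~= C_q @ B on the fly at inference time.

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256)).astype(np.float32)

# Low-rank factorization via truncated SVD (one possible "re-modeling").
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 32
C = U[:, :r] * s[:r]          # coefficients, 256 x 32
B = Vt[:r]                    # shared basis, 32 x 256

# Quantize coefficients to signed powers of two: cheap to store, and
# multiplying by them reduces to shifts in hardware.
C_q = np.sign(C) * 2.0 ** np.round(np.log2(np.abs(C) + 1e-12))

W_hat = C_q @ B               # reconstructed on the fly at inference
stored = C_q.size + B.size    # 2 * 256 * 32 values instead of 256 * 256
print(stored / W.size)        # 0.25: 4x fewer values fetched from DRAM
print(np.abs(W - W_hat).mean())  # approximation error of the trade
```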
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)