Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient in-DRAM Operations
- URL: http://arxiv.org/abs/2207.13358v6
- Date: Mon, 22 Apr 2024 07:55:08 GMT
- Title: Self-Managing DRAM: A Low-Cost Framework for Enabling Autonomous and Efficient in-DRAM Operations
- Authors: Hasan Hassan, Ataberk Olgun, A. Giray Yaglikci, Haocong Luo, Onur Mutlu
- Abstract summary: We propose a new low-cost DRAM architecture that enables implementing new in-DRAM maintenance mechanisms with no further changes in the DRAM interface, memory controller, or other system components.
A combination of refresh, RowHammer protection, and memory scrubbing mechanisms achieves a 7.6% speedup and consumes 5.2% less DRAM energy on average across 20 memory-intensive four-core workloads.
- Score: 7.663876942368506
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The memory controller is in charge of managing DRAM maintenance operations (e.g., refresh, RowHammer protection, memory scrubbing) in current DRAM chips. Implementing new maintenance operations often necessitates modifications in the DRAM interface, memory controller, and potentially other system components. Such modifications are only possible with a new DRAM standard, which takes a long time to develop, leading to slow progress in DRAM systems. In this paper, our goal is to 1) ease, and thus accelerate, the process of enabling new DRAM maintenance operations and 2) enable more efficient in-DRAM maintenance operations. Our idea is to set the memory controller free from managing DRAM maintenance. To this end, we propose Self-Managing DRAM (SMD), a new low-cost DRAM architecture that enables implementing new in-DRAM maintenance mechanisms (or modifying old ones) with no further changes in the DRAM interface, memory controller, or other system components. We use SMD to implement new in-DRAM maintenance mechanisms for three use cases: 1) periodic refresh, 2) RowHammer protection, and 3) memory scrubbing. We show that SMD enables easy adoption of efficient maintenance mechanisms that significantly improve system performance and energy efficiency while providing higher reliability compared to conventional DDR4 DRAM. A combination of SMD-based maintenance mechanisms that perform refresh, RowHammer protection, and memory scrubbing achieves a 7.6% speedup and consumes 5.2% less DRAM energy on average across 20 memory-intensive four-core workloads. We make the SMD source code openly and freely available at https://github.com/CMU-SAFARI/SelfManagingDRAM.
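To make the core idea concrete, below is a minimal Python sketch (our illustration, not the authors' implementation) of SMD's reject-and-retry interaction: the chip autonomously locks a region while it performs maintenance and rejects ACT commands to that region, and an otherwise unmodified memory controller simply re-issues the ACT after a fixed delay. The region count, retry latency, and maintenance policy are assumptions.

```python
# Minimal sketch of SMD's reject-and-retry idea (illustrative assumptions).
import random

REGIONS = 8          # assumed number of independently lockable regions
RETRY_DELAY_NS = 10  # assumed fixed retry latency seen by the controller

class SelfManagingDRAM:
    def __init__(self):
        self.locked = set()  # regions currently under in-DRAM maintenance

    def tick(self):
        """The chip decides on its own when/where to refresh or scrub."""
        self.locked = {r for r in range(REGIONS) if random.random() < 0.05}

    def activate(self, region: int) -> bool:
        """True if the ACT is accepted; False means 'region busy, retry'."""
        return region not in self.locked

class MemoryController:
    """Unmodified scheduling logic: on a reject, just re-issue the ACT."""
    def access(self, dram: SelfManagingDRAM, region: int) -> int:
        latency = 0
        while not dram.activate(region):
            latency += RETRY_DELAY_NS
            dram.tick()  # time passes; maintenance may complete
        return latency

dram, mc = SelfManagingDRAM(), MemoryController()
dram.tick()
print(mc.access(dram, region=3))  # extra latency only if region 3 was locked
```

Because rejects surface through the existing ACT path, a new maintenance mechanism changes only the DRAM-internal policy (here, `tick`), not the interface or the controller.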
Related papers
- Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation [129.45368843861917]
We introduce the Gated Memory Unit (GMU), a simple yet effective mechanism for efficient memory sharing across layers. We apply it to create SambaY, a decoder-hybrid-decoder architecture that incorporates GMUs to share memory readout states from a Samba-based self-decoder.
arXiv Detail & Related papers (2025-07-09T07:27:00Z) - MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents [84.62985963113245]
We introduce MEM1, an end-to-end reinforcement learning framework that enables agents to operate with constant memory across long multi-turn tasks. At each turn, MEM1 updates a compact shared internal state that jointly supports memory consolidation and reasoning. We show that MEM1-7B improves performance by 3.5x while reducing memory usage by 3.7x compared to Qwen2.5-14B-Instruct on a 16-objective multi-hop QA task.
arXiv Detail & Related papers (2025-06-18T19:44:46Z) - PuDHammer: Experimental Analysis of Read Disturbance Effects of Processing-using-DRAM in Real DRAM Chips [6.537810647501026]
We present the first characterization study of read disturbance effects of multiple-row activation-based PuD (which we call PuDHammer) using 316 real DDR4 DRAM chips. PuDHammer significantly exacerbates the read disturbance vulnerability, causing up to a 158.58x reduction in the minimum hammer count required to induce the first bitflip.
arXiv Detail & Related papers (2025-06-15T19:17:50Z) - Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - Preventing Rowhammer Exploits via Low-Cost Domain-Aware Memory Allocation [46.268703252557316]
Rowhammer is a hardware security vulnerability at the heart of every system with modern DRAM-based memory.
Citadel is a new memory allocator design that prevents Rowhammer-initiated security exploits.
Citadel supports thousands of security domains at a modest 7.4% average memory overhead and no performance loss.
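As a rough illustration of domain-aware allocation (our sketch under assumptions; Citadel's actual placement policy differs in detail), the hypothetical allocator below maps each security domain to its own DRAM row groups and leaves an unused guard group between neighbors, so aggressor rows in one domain have no victims in another:

```python
# Hypothetical domain-aware row allocator; ROWS_PER_GROUP and the
# "skip one group as a guard band" scheme are illustrative assumptions.
class DomainAwareAllocator:
    def __init__(self, rows_per_group: int = 128):
        self.rows_per_group = rows_per_group
        self.next_group = 0
        self.cursor: dict[int, tuple[int, int]] = {}  # domain -> (group, next free row)

    def alloc_row(self, domain: int) -> int:
        group, row = self.cursor.get(domain, (-1, self.rows_per_group))
        if row == self.rows_per_group:       # domain's current group is full
            group, row = self.next_group, 0
            self.next_group += 2             # skip one group as a guard band
        self.cursor[domain] = (group, row + 1)
        return group * self.rows_per_group + row  # absolute DRAM row number

alloc = DomainAwareAllocator()
a = alloc.alloc_row(domain=0)   # rows for domain 0 are ...
b = alloc.alloc_row(domain=1)   # ... never adjacent to rows for domain 1
assert abs(a - b) >= alloc.rows_per_group
```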
arXiv Detail & Related papers (2024-09-23T18:41:14Z) - Enabling Efficient and Scalable DRAM Read Disturbance Mitigation via New Experimental Insights into Modern DRAM Chips [0.0]
Increasing storage density exacerbates DRAM read disturbance, a circuit-level vulnerability exploited by system-level attacks.
Existing defenses are either ineffective or prohibitively expensive.
This dissertation tackles two problems: 1) protecting DRAM-based systems becomes more expensive as technology scaling increases read disturbance vulnerability, and 2) many existing solutions depend on proprietary knowledge of DRAM internals.
arXiv Detail & Related papers (2024-08-27T13:12:03Z) - PENDRAM: Enabling High-Performance and Energy-Efficient Processing of Deep Neural Networks through a Generalized DRAM Data Mapping Policy [6.85785397160228]
Convolutional Neural Networks (CNNs) have emerged as a state-of-the-art solution for machine learning tasks.
CNN accelerators face performance- and energy-efficiency challenges due to high off-chip memory (DRAM) access latency and energy.
We present PENDRAM, a novel design space exploration methodology that enables high-performance and energy-efficient CNN acceleration.
arXiv Detail & Related papers (2024-08-05T12:11:09Z) - B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory [91.81390121042192]
We develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within a composable module.
B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences tested up to 32K tokens.
arXiv Detail & Related papers (2024-07-08T18:41:01Z) - Understanding the Security Benefits and Overheads of Emerging Industry Solutions to DRAM Read Disturbance [6.637143975465625]
We analyze Per Row Activation Counting (PRAC), a mitigation method described in the April 2024 update of the JEDEC DDR5 specification.
A back-off signal propagates from the DRAM chip to the memory controller when mitigation is needed.
RFM commands are thus issued on demand rather than periodically, reducing RFM's overheads.
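A rough behavioral model of this flow (our assumptions, not the JEDEC text; the real counter format, threshold, and mitigation target are specification- and vendor-specific):

```python
# Behavioral model of PRAC-style on-demand mitigation (illustrative only).
from collections import defaultdict

BACK_OFF_THRESHOLD = 256  # illustrative; real values are vendor-specific

class PracDram:
    def __init__(self):
        self.counters = defaultdict(int)   # one activation counter per row

    def activate(self, row: int) -> bool:
        """Returns True when the back-off signal should be asserted."""
        self.counters[row] += 1
        return self.counters[row] >= BACK_OFF_THRESHOLD

    def rfm(self):
        hottest = max(self.counters, key=self.counters.get)
        self.counters[hottest] = 0         # victims refreshed (not modeled)

# Controller side: RFM is issued only on demand, never periodically.
dram = PracDram()
for act in [3, 3, 7] * 200:
    if dram.activate(act):
        dram.rfm()
```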
arXiv Detail & Related papers (2024-06-27T11:22:46Z) - DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands [6.863346979406863]
This paper presents findings on the microarchitectures of commodity DRAM chips and their impacts on the characteristics of activate-induced bitflips (AIBs).
For accurate and efficient reverse-engineering, we use three tools that can be cross-validated: AIBs, retention time tests, and RowCopy.
We identify previously unknown AIB vulnerabilities and propose a simple yet effective protection solution.
arXiv Detail & Related papers (2024-05-03T22:10:21Z) - RelayAttention for Efficient Large Language Model Serving with Long System Prompts [59.50256661158862]
This paper aims to improve the efficiency of LLM services that involve long system prompts.
Handling these system prompts requires heavily redundant memory accesses in existing causal attention algorithms.
We propose RelayAttention, an attention algorithm that allows reading hidden states from DRAM exactly once for a batch of input tokens.
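Our reading of the trick, as a hedged numpy sketch (names and shapes are illustrative; the paper's kernels operate on real KV caches): compute attention over the shared prompt once for the whole batch, compute per-request attention over each request's own tokens, and merge the two exactly via their log-sum-exp weights.

```python
# Sketch of shared-prefix attention merged via log-sum-exp (illustrative).
import numpy as np

def partial_attn(q, k, v):
    """Attention over one KV segment, plus the log-sum-exp of its scores."""
    s = q @ k.T / np.sqrt(q.shape[-1])               # (B, Lk)
    lse = np.log(np.exp(s).sum(-1, keepdims=True))   # (B, 1)
    return np.exp(s - lse) @ v, lse                  # (B, d), (B, 1)

B, d, Lp, Ls = 4, 64, 128, 16                        # batch, dim, prompt/suffix length
q = np.random.randn(B, d)
k_p, v_p = np.random.randn(Lp, d), np.random.randn(Lp, d)        # shared prompt KV
k_s, v_s = np.random.randn(B, Ls, d), np.random.randn(B, Ls, d)  # per-request KV

o_p, lse_p = partial_attn(q, k_p, v_p)               # prompt KV read once per batch
o_s = np.empty_like(o_p); lse_s = np.empty_like(lse_p)
for b in range(B):                                   # per-request part (batched in practice)
    o_s[b], lse_s[b] = partial_attn(q[b:b+1], k_s[b], v_s[b])

w = 1.0 / (1.0 + np.exp(lse_s - lse_p))              # prompt's share of softmax mass
out = w * o_p + (1.0 - w) * o_s                      # equals attention over [prompt; self]
```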
arXiv Detail & Related papers (2024-02-22T18:58:28Z) - SparkXD: A Framework for Resilient and Energy-Efficient Spiking Neural Network Inference using Approximate DRAM [15.115813664357436]
Spiking Neural Networks (SNNs) have the potential for achieving low energy consumption due to their biologically sparse computation.
Several studies have shown that the off-chip memory (DRAM) accesses are the most energy-consuming operations in SNN processing.
We propose SparkXD, a novel framework that provides a comprehensive conjoint solution for resilient and energy-efficient SNN inference.
arXiv Detail & Related papers (2021-02-28T08:12:26Z) - SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to reliance on external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
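In the spirit of that trade-off, a heavily hedged sketch (the paper's actual decomposition and training procedure differ): store a weight matrix as a small dense basis times power-of-two coefficients, so the full matrix never needs to sit in DRAM and is rebuilt on the fly with one cheap matmul (shifts, in hardware).

```python
# Hedged sketch of a storage-for-computation trade (illustrative only).
import numpy as np

def pow2_quantize(x):
    """Round nonzero entries to the nearest signed power of two."""
    mag = np.abs(x)
    exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
    return np.sign(x) * np.where(mag > 0, 2.0 ** exp, 0.0)

def factorize(W, r=8, iters=10):
    """Alternating refinement: W ~= B @ C with power-of-two C."""
    B = np.linalg.qr(np.random.randn(W.shape[0], r))[0]
    for _ in range(iters):
        C = pow2_quantize(np.linalg.pinv(B) @ W)   # (r, n) cheap-to-store factor
        B = W @ np.linalg.pinv(C)                  # (m, r) small dense basis
    return B, C

W = np.random.randn(256, 256)
B, C = factorize(W)
err = np.linalg.norm(W - B @ C) / np.linalg.norm(W)  # reconstruction error
```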
arXiv Detail & Related papers (2021-01-04T18:54:07Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.