What Do You Mean by Memory? When Engineers Are Lost in the Maze of
Complexity
- URL: http://arxiv.org/abs/2312.13462v1
- Date: Wed, 20 Dec 2023 22:26:15 GMT
- Title: What Do You Mean by Memory? When Engineers Are Lost in the Maze of
Complexity
- Authors: Gunnar Kudrjavets (University of Groningen), Aditya Kumar (Google),
Jeff Thomas (Meta Platforms, Inc.), Ayushi Rastogi (University of Groningen)
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An accepted practice for decreasing an application's memory usage is to reduce
the amount and frequency of memory allocations. Factors such as (a) the prevalence
of out-of-memory (OOM) killers, (b) implicit memory allocations in modern programming
languages, (c) overcommitting being the default strategy in the
Linux kernel, and (d) the rise in complexity and terminology related to memory
management make the existing guidance inefficient. The industry needs detailed
guidelines for optimizing memory usage that target specific operating systems
(OS) and programming language types.
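As a minimal illustration of the practice the abstract describes (a sketch, not taken from the paper), reusing one preallocated buffer in a hot loop reduces both the amount and the frequency of allocations compared with allocating a fresh buffer per iteration; the function names here are illustrative.

```python
# Sketch (illustrative, not from the paper): reducing allocation frequency
# by reusing one preallocated buffer instead of allocating per iteration.

def checksum_fresh(chunks):
    total = 0
    for chunk in chunks:
        buf = bytes(chunk)          # a new allocation every iteration
        total = (total + sum(buf)) % 65536
    return total

def checksum_reused(chunks, chunk_size=4096):
    buf = bytearray(chunk_size)     # allocated once, reused for every chunk
    total = 0
    for chunk in chunks:
        n = len(chunk)
        buf[:n] = chunk             # overwrite in place; no new buffer
        total = (total + sum(buf[:n])) % 65536
    return total

chunks = [[i % 256] * 1024 for i in range(8)]
assert checksum_fresh(chunks) == checksum_reused(chunks)
```

Both variants compute the same result; the second simply keeps its working buffer alive across iterations, the kind of allocation-frequency reduction the abstract refers to.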
Related papers
- Graceful forgetting: Memory as a process [0.0]
A rational theory of memory is proposed to explain how we can accommodate input within bounded storage space.
The theory is intended as an aid to make sense of our extensive knowledge of memory, and bring us closer to an understanding of memory in functional and mechanistic terms.
arXiv Detail & Related papers (2025-02-16T12:46:34Z)
- Cost-Efficient Continual Learning with Sufficient Exemplar Memory [55.77835198580209]
Continual learning (CL) research typically assumes highly constrained exemplar memory resources.
In this work, we investigate CL in a novel setting where exemplar memory is ample.
Our method achieves state-of-the-art performance while reducing the computational cost to a quarter or a third of that of existing methods.
arXiv Detail & Related papers (2025-02-11T05:40:52Z)
- Host-Based Allocators for Device Memory [1.2289361708127877]
We pose a model where the allocation algorithm runs on host memory but allocates device memory, and so incurs the following constraint: the allocator cannot read the memory it is allocating.
This means we are unable to use boundary tags, a concept that has been ubiquitous in nearly every allocation algorithm.
In this paper, we propose alternate algorithms to work around this constraint, and discuss in general the implications of this system model.
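The constraint described above, that allocator metadata cannot live inside the memory being managed, can be sketched with a free list kept entirely in host-resident data structures. This is an illustrative first-fit sketch, not the paper's algorithm, and the class name is invented for the example.

```python
# Sketch (illustrative, not the paper's algorithm): a first-fit allocator
# whose bookkeeping lives entirely on the host side, since boundary tags
# cannot be written into the (unreadable) device memory being managed.

class HostSideAllocator:
    def __init__(self, size):
        self.free = [(0, size)]   # (offset, length) ranges, host-resident
        self.used = {}            # offset -> length, host-resident

    def alloc(self, n):
        for i, (off, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    self.free.pop(i)
                else:
                    self.free[i] = (off + n, length - n)
                self.used[off] = n
                return off        # an offset into device memory
        return None               # out of device memory

    def free_block(self, off):
        n = self.used.pop(off)
        self.free.append((off, n))
        self.free.sort()
        # coalesce adjacent free ranges, using only host metadata
        merged = [self.free[0]]
        for o, l in self.free[1:]:
            po, pl = merged[-1]
            if po + pl == o:
                merged[-1] = (po, pl + l)
            else:
                merged.append((o, l))
        self.free = merged
```

With boundary tags, the size and free/used state of a block are read from headers adjacent to the block itself; here that information must be looked up in the host-side `free`/`used` structures instead.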
arXiv Detail & Related papers (2024-05-11T19:28:37Z)
- MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing large language models (LLMs) by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances the LLM's performance and interpretability, in language modeling in general and knowledge-intensive tasks in particular.
arXiv Detail & Related papers (2024-04-17T18:13:16Z)
- Augmenting Language Models with Long-Term Memory [142.04940250657637]
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit.
We propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history.
arXiv Detail & Related papers (2023-06-12T15:13:39Z)
- Pex: Memory-efficient Microcontroller Deep Learning through Partial Execution [11.336229510791481]
We discuss a novel execution paradigm for microcontroller deep learning.
It modifies the execution of neural networks to avoid materialising full buffers in memory.
This is achieved by exploiting the properties of operators, which can consume/produce a fraction of their input/output at a time.
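The operator property the summary mentions can be illustrated with a sketch (not Pex's implementation): an elementwise pipeline applied chunk by chunk, so that only a small slice of the intermediate tensor is ever resident at once.

```python
# Sketch (illustrative, not Pex itself): elementwise operators can
# consume/produce a fraction of their tensor at a time, so the full
# intermediate buffer is never materialised; peak extra memory is
# one chunk, not one tensor.

def relu_then_scale_streaming(values, scale, chunk=4):
    out = []
    for start in range(0, len(values), chunk):
        part = values[start:start + chunk]       # only `chunk` elements live
        part = [max(0.0, v) for v in part]       # ReLU on the slice
        part = [v * scale for v in part]         # scale on the slice
        out.extend(part)                         # stream the result onward
    return out

assert relu_then_scale_streaming([-1.0, 2.0, -3.0, 4.0], 2.0) == [0.0, 4.0, 0.0, 8.0]
```

On a microcontroller the output would itself be consumed incrementally rather than accumulated, but the key point stands: the ReLU-sized intermediate buffer never exists in full.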
arXiv Detail & Related papers (2022-11-30T18:47:30Z)
- Memory Safe Computations with XLA Compiler [14.510796427699459]
XLA compiler extension adjusts the representation of an algorithm according to a user-specified memory limit.
We show that k-nearest neighbour and sparse Gaussian process regression methods can be run at a much larger scale on a single device.
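The scale-up claimed for k-nearest neighbours typically comes from blocking the distance computation so that the full queries-by-points distance matrix never exists at once. The following is a plain-Python sketch of that idea under a per-block memory budget, not the paper's XLA extension.

```python
# Sketch (illustrative, not the paper's XLA pass): brute-force
# nearest-neighbour search that processes queries in blocks, so memory
# use is bounded by one block of distances rather than the full
# |queries| x |points| matrix.

def nearest_blocked(points, queries, block=2):
    result = []
    for qs in range(0, len(queries), block):
        for q in queries[qs:qs + block]:
            # only this block's distances are ever live
            best = min(range(len(points)),
                       key=lambda i: sum((a - b) ** 2
                                         for a, b in zip(points[i], q)))
            result.append(best)
    return result

points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
assert nearest_blocked(points, [(0.9, 1.2), (4.0, 4.0)]) == [1, 2]
```

Shrinking `block` trades throughput for peak memory, which is the same knob a user-specified memory limit would turn inside a compiler pass.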
arXiv Detail & Related papers (2022-06-28T16:59:28Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model under a limited memory budget.
We show that when the model size is counted into the total budget and methods are compared at aligned memory sizes, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z)
- Memory Planning for Deep Neural Networks [0.0]
We study memory allocation patterns in DNNs during inference.
Latencies incurred due to such mutex contention produce undesirable bottlenecks in user-facing services.
We present an implementation of MemoMalloc in the PyTorch deep learning framework.
arXiv Detail & Related papers (2022-02-23T05:28:18Z)
- Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.