What Do You Mean by Memory? When Engineers Are Lost in the Maze of
Complexity
- URL: http://arxiv.org/abs/2312.13462v1
- Date: Wed, 20 Dec 2023 22:26:15 GMT
- Title: What Do You Mean by Memory? When Engineers Are Lost in the Maze of
Complexity
- Authors: Gunnar Kudrjavets (University of Groningen), Aditya Kumar (Google),
Jeff Thomas (Meta Platforms, Inc.), Ayushi Rastogi (University of Groningen)
- Abstract summary: An accepted practice to decrease applications' memory usage is to reduce the amount and frequency of memory allocations.
The industry needs detailed guidelines for optimizing memory usage targeting specific operating systems (OS) and programming language types.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An accepted practice to decrease applications' memory usage is to reduce the
amount and frequency of memory allocations. Factors such as (a) the prevalence
of out-of-memory (OOM) killers, (b) memory allocations in modern programming
languages being done implicitly, (c) overcommitting being a default strategy in the
Linux kernel, and (d) the rise in complexity and terminology related to memory
management make the existing guidance inefficient. The industry needs detailed
guidelines for optimizing memory usage targeting specific operating systems
(OS) and programming language types.
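Point (c) is easy to observe directly. The following Python sketch is an editorial illustration for Linux, not part of the paper: it maps 4 GiB of anonymous memory, which succeeds immediately under the default overcommit policy, and shows that resident memory grows only once pages are actually written.

```python
# Editorial illustration (Linux-only): under the kernel's default overcommit
# policy, reserving a large anonymous mapping succeeds immediately, and
# physical pages are committed only when the memory is first written.
# The 4 GiB mapping size and 64 MiB write size are arbitrary example values.
import mmap


def vm_stats():
    """Return VmSize (virtual) and VmRSS (resident) from /proc/self/status."""
    stats = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize", "VmRSS")):
                key, value = line.split(":", 1)
                stats[key] = value.strip()
    return stats


print("before mapping:", vm_stats())
buf = mmap.mmap(-1, 4 * 1024 ** 3)     # 4 GiB of address space, no pages yet
print("after mapping: ", vm_stats())   # VmSize jumps, VmRSS barely moves

buf[:64 * 1024 ** 2] = b"\x00" * (64 * 1024 ** 2)  # touch only 64 MiB
print("after writing: ", vm_stats())   # VmRSS grows only by what was touched
buf.close()
```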
Related papers
- B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory [91.81390121042192]
We develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within a composable module.
B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences tested up to 32K tokens.
arXiv Detail & Related papers (2024-07-08T18:41:01Z) - Host-Based Allocators for Device Memory [1.2289361708127877]
We pose a model where the allocation algorithm runs on host memory but allocates device memory, and so incurs the following constraint: the allocator cannot read the memory it is allocating.
This means we are unable to use boundary tags, a concept that has been ubiquitous in nearly every allocation algorithm.
In this paper, we propose alternate algorithms to work around this constraint, and discuss in general the implications of this system model.
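As an illustration of why host-side bookkeeping is needed, the sketch below keeps all block metadata in host data structures and hands out offsets into a device pool; it is a generic first-fit free list written for this summary, not the algorithm proposed in the paper.

```python
# Editorial sketch of host-side bookkeeping for device memory: the allocator
# may not read the memory it manages, so the metadata that boundary tags would
# normally embed inside the blocks lives entirely in host data structures.
# A generic first-fit free list with coalescing, not the paper's algorithm.
class HostSideAllocator:
    def __init__(self, pool_size):
        self.free = [(0, pool_size)]  # (offset, size) ranges in the device pool
        self.live = {}                # offset -> size of blocks handed out

    def alloc(self, size):
        for i, (off, avail) in enumerate(self.free):
            if avail >= size:
                rest = (off + size, avail - size)
                self.free[i:i + 1] = [rest] if rest[1] else []
                self.live[off] = size
                return off  # caller adds this offset to the pool's device base
        raise MemoryError("device pool exhausted")

    def release(self, off):
        self.free.append((off, self.live.pop(off)))
        self.free.sort()
        merged = [self.free[0]]
        for o, s in self.free[1:]:          # coalesce adjacent free ranges
            po, ps = merged[-1]
            if po + ps == o:
                merged[-1] = (po, ps + s)
            else:
                merged.append((o, s))
        self.free = merged


pool = HostSideAllocator(1 << 20)   # manage a hypothetical 1 MiB device pool
a = pool.alloc(64 * 1024)
b = pool.alloc(256 * 1024)
pool.release(a)   # bookkeeping updated without ever touching device memory
```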
arXiv Detail & Related papers (2024-05-11T19:28:37Z) - MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory [49.96019697955383]
We introduce MemLLM, a novel method of enhancing LLMs' knowledge capabilities by integrating a structured and explicit read-and-write memory module.
Our experiments indicate that MemLLM enhances performance and interpretability, in language modeling in general and in knowledge-intensive tasks in particular.
We see MemLLM as an important step towards making LLMs more grounded and factual through memory augmentation.
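The interface idea can be illustrated with a toy store; this keyword/triple version is invented for this summary and is much simpler than the memory MemLLM actually uses.

```python
# Toy illustration of an explicit read-write memory interface: the model emits
# write and read commands against a structured store that lives outside its
# weights. Invented for this summary; not the paper's memory design.
from collections import defaultdict


class ExplicitMemory:
    def __init__(self):
        self.index = defaultdict(set)   # subject -> {(subject, relation, object)}

    def write(self, subject, relation, obj):
        self.index[subject].add((subject, relation, obj))

    def read(self, subject, relation=None):
        hits = self.index.get(subject, set())
        if relation is not None:
            hits = {t for t in hits if t[1] == relation}
        return sorted(hits)


memory = ExplicitMemory()
memory.write("Ada Lovelace", "collaborated_with", "Charles Babbage")
print(memory.read("Ada Lovelace", relation="collaborated_with"))
```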
arXiv Detail & Related papers (2024-04-17T18:13:16Z) - Augmenting Language Models with Long-Term Memory [142.04940250657637]
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit.
We propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history.
arXiv Detail & Related papers (2023-06-12T15:13:39Z) - Pex: Memory-efficient Microcontroller Deep Learning through Partial
Execution [11.336229510791481]
We discuss a novel execution paradigm for microcontroller deep learning.
It modifies the execution of neural networks to avoid materialising full buffers in memory.
This is achieved by exploiting the properties of operators, which can consume/produce a fraction of their input/output at a time.
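The principle can be sketched in a few lines of NumPy; this is an illustration of partial execution in general, not the Pex system itself.

```python
# Editorial NumPy sketch of the partial-execution principle: an elementwise
# operator can consume and produce a slice of its buffers at a time, so the
# full-size temporary that a naive implementation would materialise never
# exists. Illustration only; this is not the Pex system.
import numpy as np


def chunked_relu_scale(x, scale, chunk=4096):
    """relu(x) * scale, computed one chunk at a time."""
    out = np.empty_like(x)
    for start in range(0, x.size, chunk):
        part = x[start:start + chunk]
        out[start:start + chunk] = np.maximum(part, 0.0) * scale
    return out


x = np.random.randn(1_000_000).astype(np.float32)
assert np.allclose(chunked_relu_scale(x, 0.5), np.maximum(x, 0.0) * 0.5)
```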
arXiv Detail & Related papers (2022-11-30T18:47:30Z) - Memory Safe Computations with XLA Compiler [14.510796427699459]
An XLA compiler extension adjusts the representation of an algorithm according to a user-specified memory limit.
We show that k-nearest neighbour and sparse Gaussian process regression methods can be run at a much larger scale on a single device.
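The kind of rewrite involved can be illustrated manually; the NumPy sketch below is written for this summary, whereas the paper's extension performs the transformation automatically inside XLA.

```python
# Editorial NumPy sketch of the kind of rewrite a memory-aware compiler can
# apply to k-nearest-neighbour search: instead of materialising the full
# distance matrix, query rows are processed in blocks sized to a user-given
# memory limit. The block sizing below only accounts for the distance matrix.
import numpy as np


def knn_chunked(queries, points, k, memory_limit_bytes=64 * 1024 ** 2):
    bytes_per_row = points.shape[0] * 8          # float64 distances per query
    rows_per_block = max(1, memory_limit_bytes // bytes_per_row)
    p2 = (points ** 2).sum(axis=1)
    out = np.empty((queries.shape[0], k), dtype=np.int64)
    for start in range(0, queries.shape[0], rows_per_block):
        block = queries[start:start + rows_per_block]
        q2 = (block ** 2).sum(axis=1, keepdims=True)
        d2 = q2 - 2.0 * block @ points.T + p2    # (rows_per_block, n_points)
        out[start:start + rows_per_block] = np.argsort(d2, axis=1)[:, :k]
    return out


rng = np.random.default_rng(0)
idx = knn_chunked(rng.normal(size=(1000, 16)), rng.normal(size=(5000, 16)), k=5)
```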
arXiv Detail & Related papers (2022-06-28T16:59:28Z) - A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental
Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model under a limited memory budget.
We show that when counting the model size into the total budget and comparing methods with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
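The budget argument is simple arithmetic, sketched below with made-up numbers (not figures from the paper).

```python
# Editorial back-of-the-envelope: once the model itself counts against the
# memory budget, a larger backbone leaves room for fewer stored exemplars.
# All numbers are made-up examples, not figures from the paper.
def exemplars_that_fit(total_budget_mb, n_params, bytes_per_param=4,
                       exemplar_bytes=3 * 32 * 32):   # one small RGB image
    model_bytes = n_params * bytes_per_param
    remaining = total_budget_mb * 1024 ** 2 - model_bytes
    return max(0, remaining // exemplar_bytes)


print(exemplars_that_fit(100, n_params=11_000_000))   # smaller backbone
print(exemplars_that_fit(100, n_params=23_000_000))   # larger backbone, fewer exemplars fit
```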
arXiv Detail & Related papers (2022-05-26T08:24:01Z) - LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens.
LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length.
Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z) - Memory Planning for Deep Neural Networks [0.0]
We study memory allocation patterns in DNNs during inference.
Latencies incurred due to such mutex contention produce undesirable bottlenecks in user-facing services.
We present an implementation of MemoMalloc in the PyTorch deep learning framework.
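The underlying idea of static memory planning can be sketched as follows; this is a simplified greedy planner written for this summary, not MemoMalloc's actual algorithm.

```python
# Editorial sketch of static memory planning for inference: given each
# intermediate tensor's size and (first_use, last_use) lifetime, assign offsets
# in one preallocated arena so tensors with overlapping lifetimes never share
# memory. One arena allocation then replaces many small allocator calls (and
# the lock contention they can cause). A simple greedy planner, not MemoMalloc.
def plan_offsets(tensors):
    """tensors: iterable of (name, size, first_use, last_use) -> {name: offset}."""
    placed, offsets = [], {}
    for name, size, first, last in sorted(tensors, key=lambda t: -t[1]):
        candidate = 0
        for off, sz, f, l in sorted(placed):
            lifetimes_overlap = not (last < f or l < first)
            space_overlaps = candidate < off + sz and off < candidate + size
            if lifetimes_overlap and space_overlaps:
                candidate = off + sz        # bump past the conflicting block
        placed.append((candidate, size, first, last))
        offsets[name] = candidate
    return offsets


plan = plan_offsets([("conv1_out", 4096, 0, 1),
                     ("conv2_out", 2048, 1, 2),
                     ("logits", 512, 2, 3)])
print(plan)   # conv1_out and logits can share offset 0; conv2_out cannot
```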
arXiv Detail & Related papers (2022-02-23T05:28:18Z) - Pinpointing the Memory Behaviors of DNN Training [37.78973307051419]
Training of deep neural networks (DNNs) is usually memory-hungry due to the limited device memory capacity of accelerators.
In this work, we pinpoint the memory behaviors of each device memory block of the GPU during training by instrumenting the memory allocators of the runtime system.
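The instrumentation idea can be sketched with a thin tracing wrapper; the stand-in below is written for this summary, whereas the paper instruments the runtime's GPU allocators directly.

```python
# Editorial sketch of the instrumentation idea: wrap an allocator so that every
# allocation and free is recorded with a timestamp and size, which is enough to
# reconstruct block lifetimes and peak usage offline. A host-side stand-in, not
# the paper's GPU-runtime instrumentation.
import time


class TracingAllocator:
    def __init__(self):
        self.events = []    # (timestamp, "alloc" | "free", block_id, size)
        self.live = {}      # block_id -> size
        self._next_id = 0

    def alloc(self, size):
        block_id, self._next_id = self._next_id, self._next_id + 1
        self.live[block_id] = size
        self.events.append((time.monotonic(), "alloc", block_id, size))
        return block_id

    def free(self, block_id):
        size = self.live.pop(block_id)
        self.events.append((time.monotonic(), "free", block_id, size))

    def peak_bytes(self):
        current = peak = 0
        for _, kind, _, size in self.events:
            current += size if kind == "alloc" else -size
            peak = max(peak, current)
        return peak


tracer = TracingAllocator()
a = tracer.alloc(1 << 20)   # e.g. an activation buffer
b = tracer.alloc(4 << 20)   # e.g. a gradient buffer
tracer.free(a)
print(tracer.peak_bytes())  # 5 MiB at peak in this toy trace
```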
arXiv Detail & Related papers (2021-04-01T05:30:03Z) - Kanerva++: extending The Kanerva Machine with differentiable, locally
block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.