Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
- URL: http://arxiv.org/abs/2309.17207v6
- Date: Tue, 21 Jan 2025 10:16:03 GMT
- Title: Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
- Authors: Marco Pleines, Matthias Pallasch, Frank Zimmer, Mike Preuss
- Abstract summary: Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights. These environments, originally with finite tasks, are expanded into innovative, endless formats, mirroring the escalating challenges of cumulative memory games such as "I packed my bag".
- Score: 4.371266695484245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory Gym presents a suite of 2D partially observable environments, namely Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark memory capabilities in decision-making agents. These environments, originally with finite tasks, are expanded into innovative, endless formats, mirroring the escalating challenges of cumulative memory games such as "I packed my bag". This progression in task design shifts the focus from merely assessing sample efficiency to also probing the levels of memory effectiveness in dynamic, prolonged scenarios. To address the gap in available memory-based Deep Reinforcement Learning baselines, we introduce an implementation within the open-source CleanRL library that integrates Transformer-XL (TrXL) with Proximal Policy Optimization. This approach utilizes TrXL as a form of episodic memory, employing a sliding window technique. Our comparative study between the Gated Recurrent Unit (GRU) and TrXL reveals varied performances across our finite and endless tasks. TrXL, on the finite environments, demonstrates superior effectiveness over GRU, but only when utilizing an auxiliary loss to reconstruct observations. Notably, GRU makes a remarkable resurgence in all endless tasks, consistently outperforming TrXL by significant margins. Website and Source Code: https://marcometer.github.io/jmlr_2024.github.io/
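As an illustration of the sliding-window episodic memory described in the abstract, the following is a minimal sketch. It is not the authors' CleanRL implementation: it substitutes a plain Transformer encoder for Transformer-XL, omits PPO and the auxiliary reconstruction loss, and names such as `EpisodicTransformerPolicy`, `window_size`, and `embed_dim` are illustrative assumptions. The idea shown is that per-step observation embeddings are cached in a fixed-size window that the Transformer attends over at every step.

```python
# Minimal sketch of a sliding-window episodic memory for a policy network.
# Assumptions: PyTorch, a flat observation vector, and a simplified
# Transformer encoder standing in for TrXL segment-level recurrence.
import torch
import torch.nn as nn


class EpisodicTransformerPolicy(nn.Module):
    def __init__(self, obs_dim: int, num_actions: int,
                 embed_dim: int = 128, window_size: int = 32):
        super().__init__()
        self.window_size = window_size
        self.encoder = nn.Linear(obs_dim, embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.policy_head = nn.Linear(embed_dim, num_actions)
        self.value_head = nn.Linear(embed_dim, 1)

    def forward(self, obs: torch.Tensor, memory: torch.Tensor):
        # obs: (batch, obs_dim); memory: (batch, window_size, embed_dim)
        token = self.encoder(obs).unsqueeze(1)        # (batch, 1, embed_dim)
        # Slide the window: drop the oldest cached embedding, append the new one.
        memory = torch.cat([memory[:, 1:], token], dim=1)
        hidden = self.transformer(memory)[:, -1]      # attend over the window
        return self.policy_head(hidden), self.value_head(hidden), memory


# Usage: the memory tensor is carried across environment steps, so the agent
# always conditions on its last `window_size` observations.
policy = EpisodicTransformerPolicy(obs_dim=16, num_actions=4)
memory = torch.zeros(1, 32, 128)
obs = torch.randn(1, 16)
logits, value, memory = policy(obs, memory)
```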
Related papers
- Mem-T: Densifying Rewards for Long-Horizon Memory Agents [23.19373149519922]
We introduce Mem-T, an autonomous memory agent that interfaces with a lightweight hierarchical memory database to perform dynamic updates and multi-turn retrieval over streaming inputs. We also propose MoT-GRPO, a tree-guided reinforcement learning framework that transforms sparse terminal feedback into dense, step-wise supervision via memory operation tree backpropagation and hindsight credit assignment.
arXiv Detail & Related papers (2026-01-30T14:23:33Z) - Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management [63.48041801851891]
Fine-Mem is a unified framework designed for fine-grained feedback alignment. Experiments on Memalpha and MemoryAgentBench demonstrate that Fine-Mem consistently outperforms strong baselines.
arXiv Detail & Related papers (2026-01-13T11:06:17Z) - Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components. We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z) - General Agentic Memory Via Deep Research [12.503470046367838]
We propose a novel framework called general agentic memory (GAM). GAM creates optimized contexts for its client at runtime while keeping only simple but useful memory during the offline stage. GAM achieves substantial improvement on various memory-grounded task completion scenarios against existing memory systems.
arXiv Detail & Related papers (2025-11-23T12:29:33Z) - ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL [48.214881182054164]
We propose ELMUR, a transformer architecture with structured external memory. ELMUR extends effective horizons up to 100,000 times beyond the attention window. It achieves a 100% success rate on a synthetic T-Maze task with corridors up to one million steps.
arXiv Detail & Related papers (2025-10-08T15:50:34Z) - 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model [83.70640091897947]
Humans excel at performing complex tasks by leveraging long-term memory across temporal and spatial experiences. Current Large Language Models (LLMs) struggle to effectively plan and act in dynamic, multi-room 3D environments. We propose 3DLLM-Mem, a novel dynamic memory management and fusion model for embodied spatial-temporal reasoning and actions.
arXiv Detail & Related papers (2025-05-28T17:59:13Z) - MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications.
We propose MEMO, a novel framework for fine-grained activation memory management.
MEMO achieves an average of 1.97x and 1.80x MFU compared to Megatron-LM and DeepSpeed, respectively.
arXiv Detail & Related papers (2024-07-16T18:59:49Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation [29.139579820699495]
This work strives to reduce memory overhead in fine-tuning from perspectives of activation function and layer normalization.
We apply our Approx-BP theory to backpropagation training and derive memory-efficient alternatives to the GELU and SiLU activation functions.
In addition, we introduce a Memory-Sharing Backpropagation strategy, which enables the activation memory to be shared by two adjacent layers.
arXiv Detail & Related papers (2024-06-24T03:09:15Z) - Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267]
We present a framework to tackle the memory explosion problem using memory replay techniques.
PDGNNs with Topology-aware Embedding Memory (TEM) significantly outperform state-of-the-art techniques.
arXiv Detail & Related papers (2024-01-24T03:03:17Z) - Compressing the Backward Pass of Large-Scale Neural Architectures by Structured Activation Pruning [0.0]
Sparsity in Deep Neural Networks (DNNs) has gained attention as a solution.
This work focuses on ephemeral sparsity, aiming to reduce memory consumption during training.
We report the effectiveness of activation pruning by evaluating training speed, accuracy, and memory usage of large-scale neural architectures.
arXiv Detail & Related papers (2023-11-28T15:31:31Z) - EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models [3.597163516372061]
EdgeMoE is an on-device inference engine tailored for mixture-of-expert (MoE) LLMs.
It achieves both memory and computational efficiency by strategically partitioning the model across the storage hierarchy.
It demonstrates substantial memory savings and performance improvements when compared to competitive baseline solutions.
arXiv Detail & Related papers (2023-08-28T06:56:08Z) - GLIMMER: generalized late-interaction memory reranker [29.434777627686692]
Memory-augmentation is a powerful approach for incorporating external information into language models.
Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes memory and updates memory representations on the fly with a smaller live encoder.
We propose GLIMMER, which improves on this approach by exploiting free access to the powerful memory representations, applying a shallow reranker on top of memory to drastically improve retrieval quality at low cost.
arXiv Detail & Related papers (2023-06-17T01:54:25Z) - Recurrent Action Transformer with Memory [39.58317527488534]
This paper proposes a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention.
We conduct experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments.
The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments.
arXiv Detail & Related papers (2023-06-15T19:29:08Z) - Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators called WTA-CRS, for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z) - A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model with a limited memory size that adapts to new classes without forgetting old ones.
We show that when counting the model size into the total budget and comparing methods with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z) - Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos.
We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z) - Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during forward pass while storing a low-precision version of activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can reduce half of the memory footprints during training.
arXiv Detail & Related papers (2021-11-22T11:23:01Z) - Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER).
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems [27.419109620575313]
A key challenge for deep learning models is to work with millions of categorical classes or tokens.
We propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information.
We demonstrate a significant reduction in the memory footprint while maintaining performance.
arXiv Detail & Related papers (2021-02-24T19:55:49Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.