Related papers: Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

URL: http://arxiv.org/abs/2601.01885v1
Date: Mon, 05 Jan 2026 08:24:16 GMT
Title: Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents
Authors: Yi Yu, Liuyi Yao, Yuexiang Xie, Qingquan Tan, Jiaqi Feng, Yaliang Li, Libing Wu,
Abstract summary: Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows.<n>Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components.<n>We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
Score: 57.38404718635204
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows, making effective memory management critical. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components, relying on heuristics or auxiliary controllers, which limits adaptability and end-to-end optimization. In this paper, we propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy. AgeMem exposes memory operations as tool-based actions, enabling the LLM agent to autonomously decide what and when to store, retrieve, update, summarize, or discard information. To train such unified behaviors, we propose a three-stage progressive reinforcement learning strategy and design a step-wise GRPO to address sparse and discontinuous rewards induced by memory operations. Experiments on five long-horizon benchmarks demonstrate that AgeMem consistently outperforms strong memory-augmented baselines across multiple LLM backbones, achieving improved task performance, higher-quality long-term memory, and more efficient context usage.

Related papers

MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning [78.46301394559903]
Large Language Models (LLMs) are increasingly used for long-duration tasks.<n>Current methods face a trade-off between cost and accuracy.<n>MemSifter is a novel framework that offloads the memory retrieval process to a small-scale proxy model.
arXiv Detail & Related papers (2026-03-03T02:57:38Z)
Mem-T: Densifying Rewards for Long-Horizon Memory Agents [23.19373149519922]
We introduce Mem-T, an autonomous memory agent that interfaces with a lightweight hierarchical memory database to perform dynamic updates and multi-turn retrieval over streaming inputs.<n>We also propose MoT-GRPO, a tree-guided reinforcement learning framework that transforms sparse terminal feedback into dense, step-wise supervision via memory operation tree backpropagation and hindsight credit assignment.
arXiv Detail & Related papers (2026-01-30T14:23:33Z)
MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models [40.965722377085456]
We introduce MemoryRewardBench, the first benchmark to systematically study the ability of reward models to evaluate memory quality.<n> Evaluations on 13 cutting-edge RMs indicate a diminishing performance gap between open-source and proprietary models.
arXiv Detail & Related papers (2026-01-17T09:04:53Z)
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management [63.48041801851891]
Fine-Mem is a unified framework designed for fine-grained feedback alignment.<n> Experiments on Memalpha and MemoryAgentBench demonstrate that Fine-Mem consistently outperforms strong baselines.
arXiv Detail & Related papers (2026-01-13T11:06:17Z)
Mem-Gallery: Benchmarking Multimodal Long-Term Conversational Memory for MLLM Agents [76.76004970226485]
Long-term memory is a critical capability for multimodal large language model (MLLM) agents.<n>Mem-Gallery is a new benchmark for evaluating multimodal long-term conversational memory in MLLM agents.
arXiv Detail & Related papers (2026-01-07T02:03:13Z)
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning [89.55738101744657]
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless.<n>We present Memory-R1, a reinforcement learning framework that equips LLMs with the ability to actively manage and utilize external memory.
arXiv Detail & Related papers (2025-08-27T12:26:55Z)
Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents [19.04968632268433]
We propose a hierarchical memory architecture for Large Language Model Agents (LLM Agents)<n>Each memory vector is embedded with a positional index encoding pointing to its semantically related sub-memories in the next layer.<n>During the reasoning phase, an index-based routing mechanism enables efficient, layer-by-layer retrieval without performing exhaustive similarity computations.
arXiv Detail & Related papers (2025-07-23T12:45:44Z)
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications.<n>We propose MEMO, a novel framework for fine-grained activation memory management.<n>MeMO achieves an average of 1.97x and 1.80x MFU compared to Megatron-LM and DeepSpeed.
arXiv Detail & Related papers (2024-07-16T18:59:49Z)
SCM: Enhancing Large Language Model with Self-Controlled Memory Framework [54.33686574304374]
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.<n>We propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information.
arXiv Detail & Related papers (2023-04-26T07:25:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.