Mem-T: Densifying Rewards for Long-Horizon Memory Agents
- URL: http://arxiv.org/abs/2601.23014v1
- Date: Fri, 30 Jan 2026 14:23:33 GMT
- Title: Mem-T: Densifying Rewards for Long-Horizon Memory Agents
- Authors: Yanwei Yue, Guibin Zhang, Boci Peng, Xuanbo Fan, Jiaxin Guo, Qiankun Li, Yan Zhang,
- Abstract summary: We introduce Mem-T, an autonomous memory agent that interfaces with a lightweight hierarchical memory database to perform dynamic updates and multi-turn retrieval over streaming inputs. We also propose MoT-GRPO, a tree-guided reinforcement learning framework that transforms sparse terminal feedback into dense, step-wise supervision via memory operation tree backpropagation and hindsight credit assignment.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory agents, which depart from predefined memory-processing pipelines by endogenously managing the processing, storage, and retrieval of memories, have garnered increasing attention for their autonomy and adaptability. However, existing training paradigms remain constrained: agents often traverse long-horizon sequences of memory operations before receiving sparse and delayed rewards, which hinders truly end-to-end optimization of memory management policies. To address this limitation, we introduce Mem-T, an autonomous memory agent that interfaces with a lightweight hierarchical memory database to perform dynamic updates and multi-turn retrieval over streaming inputs. To effectively train long-horizon memory management capabilities, we further propose MoT-GRPO, a tree-guided reinforcement learning framework that transforms sparse terminal feedback into dense, step-wise supervision via memory operation tree backpropagation and hindsight credit assignment, thereby enabling the joint optimization of memory construction and retrieval. Extensive experiments demonstrate that Mem-T is (1) high-performing, surpassing frameworks such as A-Mem and Mem0 by up to $14.92\%$, and (2) economical, operating on a favorable accuracy-efficiency Pareto frontier and reducing inference tokens per query by $\sim24.45\%$ relative to GAM without sacrificing performance.
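The abstract's core training idea, spreading a single terminal reward back over the tree of memory operations that produced it, can be illustrated with a toy sketch. This is not the paper's code: the node structure, operation names, and the per-level decay scheme are illustrative assumptions standing in for MoT-GRPO's actual backpropagation and hindsight credit assignment.

```python
# Hypothetical sketch of reward densification over a memory-operation tree.
# A sparse terminal reward is distributed to every operation in the trace,
# attenuated per tree level; the decay factor is an illustrative assumption.
from dataclasses import dataclass, field

@dataclass
class OpNode:
    """One memory operation (e.g. insert/update/retrieve) in the trace."""
    name: str
    children: list = field(default_factory=list)
    reward: float = 0.0  # dense, step-wise reward filled in by backpropagation

def backpropagate(node: OpNode, terminal_reward: float, decay: float = 0.9):
    """Assign the terminal reward to this node, then recurse into its
    children with the reward attenuated by `decay` per level."""
    node.reward += terminal_reward
    for child in node.children:
        backpropagate(child, terminal_reward * decay, decay)

# Usage: a trivial two-level operation tree receiving a terminal reward of 1.0.
root = OpNode("insert", children=[OpNode("update"), OpNode("retrieve")])
backpropagate(root, 1.0)
print(root.reward)                         # 1.0
print([c.reward for c in root.children])   # [0.9, 0.9]
```

Every operation now carries a nonzero learning signal instead of only the final step, which is the "densification" the abstract refers to; the real framework would feed these per-step rewards into a GRPO-style policy update.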
Related papers
- From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents [78.30630000529133]
We propose MM-Mem, a pyramidal multimodal memory architecture grounded in Fuzzy-Trace Theory. MM-Mem structures memory hierarchically into a Sensory Buffer, Episodic Stream, and Symbolic. Experiments confirm the effectiveness of MM-Mem on both offline and streaming tasks.
arXiv Detail & Related papers (2026-03-02T05:12:45Z) - UMEM: Unified Memory Extraction and Management Framework for Generalizable Memory [46.87954895079213]
Self-evolving memory serves as the trainable parameters for Large Language Models (LLMs). Existing methods predominantly optimize memory management while treating memory extraction as a static process. We propose Unified Memory Extraction and Management (UMEM) to jointly optimize a Large Language Model to simultaneously extract and manage memories.
arXiv Detail & Related papers (2026-02-11T08:58:41Z) - MemRec: Collaborative Memory-Augmented Agentic Recommender System [57.548438733740504]
We propose MemRec, a framework that architecturally decouples reasoning from memory management. MemRec introduces a dedicated LM_Mem to manage a dynamic collaborative memory graph. It achieves state-of-the-art performance on four benchmarks.
arXiv Detail & Related papers (2026-01-13T18:51:16Z) - Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management [63.48041801851891]
Fine-Mem is a unified framework designed for fine-grained feedback alignment. Experiments on Memalpha and MemoryAgentBench demonstrate that Fine-Mem consistently outperforms strong baselines.
arXiv Detail & Related papers (2026-01-13T11:06:17Z) - Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components. We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z) - Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution [52.76038908826961]
We propose ReMe (Remember Me, Refine Me) to bridge the gap between static storage and dynamic reasoning. ReMe innovates across the memory lifecycle via three mechanisms, including multi-faceted distillation, which extracts fine-grained experiences. Experiments on BFCL-V3 and AppWorld demonstrate that ReMe establishes a new state-of-the-art in agent memory systems.
arXiv Detail & Related papers (2025-12-11T14:40:01Z) - MemGen: Weaving Generative Latent Memory for Self-Evolving Agents [57.1835920227202]
We propose MemGen, a dynamic generative memory framework that equips agents with a human-esque cognitive faculty. MemGen enables agents to recall and augment latent memory throughout reasoning, producing a tightly interwoven cycle of memory and cognition.
arXiv Detail & Related papers (2025-09-29T12:33:13Z) - SEDM: Scalable Self-Evolving Distributed Memory for Agents [23.182291416527764]
SEDM is a verifiable and adaptive framework that transforms memory from a passive repository into an active, self-optimizing component. We show that SEDM improves reasoning accuracy while reducing token overhead compared with strong memory baselines. Results highlight SEDM as a scalable and sustainable memory mechanism for open-ended multi-agent collaboration.
arXiv Detail & Related papers (2025-09-11T14:37:37Z) - Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning [89.55738101744657]
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless. We present Memory-R1, a reinforcement learning framework that equips LLMs with the ability to actively manage and utilize external memory.
arXiv Detail & Related papers (2025-08-27T12:26:55Z) - MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training [24.066283519769968]
Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications. We propose MEMO, a novel framework for fine-grained activation memory management. MEMO achieves on average 1.97x and 1.80x the MFU of Megatron-LM and DeepSpeed, respectively.
arXiv Detail & Related papers (2024-07-16T18:59:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.