HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling
- URL: http://arxiv.org/abs/2602.13933v1
- Date: Sun, 15 Feb 2026 00:06:19 GMT
- Title: HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling
- Authors: Xiaochen Zhao, Kaikai Wang, Xiaowen Zhang, Chen Yao, Aili Wang,
- Abstract summary: HyMem is a hybrid memory architecture that enables dynamic on-demand scheduling through multi-granular memory representations.<n>We show that HyMem achieves strong performance on both the LOCOMO and LongMemEval benchmarks, outperforming full-context while reducing computational cost by 92.6%.
- Score: 7.24393498822329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language model (LLM) agents demonstrate strong performance in short-text contexts but often underperform in extended dialogues due to inefficient memory management. Existing approaches face a fundamental trade-off between efficiency and effectiveness: memory compression risks losing critical details required for complex reasoning, while retaining raw text introduces unnecessary computational overhead for simple queries. The crux lies in the limitations of monolithic memory representations and static retrieval mechanisms, which fail to emulate the flexible and proactive memory scheduling capabilities observed in humans, thus struggling to adapt to diverse problem scenarios. Inspired by the principle of cognitive economy, we propose HyMem, a hybrid memory architecture that enables dynamic on-demand scheduling through multi-granular memory representations. HyMem adopts a dual-granular storage scheme paired with a dynamic two-tier retrieval system: a lightweight module constructs summary-level context for efficient response generation, while an LLM-based deep module is selectively activated only for complex queries, augmented by a reflection mechanism for iterative reasoning refinement. Experiments show that HyMem achieves strong performance on both the LOCOMO and LongMemEval benchmarks, outperforming full-context while reducing computational cost by 92.6\%, establishing a state-of-the-art balance between efficiency and performance in long-term memory management.
Related papers
- MemFly: On-the-Fly Memory Optimization via Information Bottleneck [35.420309099411874]
Long-term memory enables large language model agents to tackle complex tasks through historical interactions.<n>Existing frameworks encounter a dilemma between compressing redundant information efficiently and maintaining precise retrieval for downstream tasks.<n>MemFly is a framework grounded in information bottleneck principles that facilitates on-the-fly memory evolution for LLMs.<n>MemFly substantially outperforms state-of-the-art baselines in memory coherence, response fidelity, and accuracy.
arXiv Detail & Related papers (2026-02-08T09:37:25Z) - AMA: Adaptive Memory via Multi-Agent Collaboration [54.490349689939166]
We propose Adaptive Memory via Multi-Agent Collaboration (AMA), a novel framework that leverages coordinated agents to manage memory across multiple granularities.<n>AMA significantly outperforms state-of-the-art baselines while reducing token consumption by approximately 80% compared to full-context methods.
arXiv Detail & Related papers (2026-01-28T08:09:49Z) - MemRec: Collaborative Memory-Augmented Agentic Recommender System [57.548438733740504]
We propose MemRec, a framework that architecturally decouples reasoning from memory management.<n>MemRec introduces a dedicated LM_Mem to manage a dynamic collaborative memory graph.<n>It achieves state-of-the-art performance on four benchmarks.
arXiv Detail & Related papers (2026-01-13T18:51:16Z) - EvolMem: A Cognitive-Driven Benchmark for Multi-Session Dialogue Memory [63.84216832544323]
EvolMem is a new benchmark for assessing multi-session memory capabilities of large language models (LLMs) and agent systems.<n>To construct the benchmark, we introduce a hybrid data synthesis framework that consists of topic-initiated generation and narrative-inspired transformations.<n>Extensive evaluation reveals that no LLM consistently outperforms others across all memory dimensions.
arXiv Detail & Related papers (2026-01-07T03:14:42Z) - SimpleMem: Efficient Lifelong Memory for LLM Agents [73.74399447715052]
We introduce SimpleMem, an efficient memory framework based on semantic lossless compression.<n>We propose a three-stage pipeline designed to maximize information density and token utilization.<n> Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost.
arXiv Detail & Related papers (2026-01-05T21:02:49Z) - Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents [57.38404718635204]
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows.<n>Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components.<n>We propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy.
arXiv Detail & Related papers (2026-01-05T08:24:16Z) - CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension [55.29309306566238]
Current Large Language Models (LLMs) are confronted with overwhelming information volume when comprehending long-form documents.<n>This challenge raises the imperative of a cohesive memory module, which can elevate vanilla LLMs into autonomous reading agents.<n>We draw inspiration from Jean Piaget's Constructivist Theory, illuminating three traits of the agentic memory -- structured schemata, flexible assimilation, and dynamic accommodation.
arXiv Detail & Related papers (2025-10-07T02:16:30Z) - Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents [19.04968632268433]
We propose a hierarchical memory architecture for Large Language Model Agents (LLM Agents)<n>Each memory vector is embedded with a positional index encoding pointing to its semantically related sub-memories in the next layer.<n>During the reasoning phase, an index-based routing mechanism enables efficient, layer-by-layer retrieval without performing exhaustive similarity computations.
arXiv Detail & Related papers (2025-07-23T12:45:44Z) - R$^3$Mem: Bridging Memory Retention and Retrieval via Reversible Compression [24.825945729508682]
We propose R$3$Mem, a memory network that optimize both information Retention and Retrieval.<n>R$3$Mem employs virtual memory tokens to compress and encode infinitely long histories, further enhanced by a hierarchical compression strategy.<n>Experiments demonstrate that our memory design achieves state-of-the-art performance in long-context language modeling and retrieval-augmented generation tasks.
arXiv Detail & Related papers (2025-02-21T21:39:00Z) - QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism [46.441032033076034]
Memory mechanism offers a flexible solution for managing long contexts.
We introduce a novel strategy, Question then Reflection Memory Mechanism (QRMeM), incorporating a dual-structured memory pool.
Our evaluation across multiple-choice questions (MCQ) and multi-document question answering (Multi-doc QA) benchmarks showcases QRMeM enhanced performance compared to existing approaches.
arXiv Detail & Related papers (2024-06-19T02:46:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.