Grounding Agent Memory in Contextual Intent
- URL: http://arxiv.org/abs/2601.10702v1
- Date: Thu, 15 Jan 2026 18:55:13 GMT
- Title: Grounding Agent Memory in Contextual Intent
- Authors: Ruozhen Yang, Yucheng Jiang, Yueqi Jiang, Priyanka Kargupta, Yunyi Zhang, Jiawei Han
- Abstract summary: STITCH is a memory system that indexes each trajectory step with a structured retrieval cue (contextual intent) and retrieves history by matching the current step's intent. For evaluation, we introduce CAME-Bench, a benchmark for context-aware retrieval in realistic, dynamic, goal-oriented trajectories. Our analysis shows that intent indexing substantially reduces retrieval noise, supporting intent-aware memory for robust long-horizon reasoning.
- Score: 22.299598216046103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying large language models in long-horizon, goal-oriented interactions remains challenging because similar entities and facts recur under different latent goals and constraints, causing memory systems to retrieve context-mismatched evidence. We propose STITCH (Structured Intent Tracking in Contextual History), an agentic memory system that indexes each trajectory step with a structured retrieval cue (contextual intent) and retrieves history by matching the current step's intent. Contextual intent provides compact signals that disambiguate repeated mentions and reduce interference: (1) the current latent goal defining a thematic segment, (2) the action type, and (3) the salient entity types anchoring which attributes matter. During inference, STITCH filters and prioritizes memory snippets by intent compatibility, suppressing semantically similar but context-incompatible history. For evaluation, we introduce CAME-Bench, a benchmark for context-aware retrieval in realistic, dynamic, goal-oriented trajectories. Across CAME-Bench and LongMemEval, STITCH achieves state-of-the-art performance, outperforming the strongest baseline by 35.6%, with the largest gains as trajectory length increases. Our analysis shows that intent indexing substantially reduces retrieval noise, supporting intent-aware memory for robust long-horizon reasoning.
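The abstract's core retrieval rule (filter memory by intent compatibility, then rank the survivors) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the names `Intent`, `MemoryEntry`, `intent_compatible`, and `retrieve`, the specific compatibility rule (same goal segment plus at least one shared entity type), and the lexical-overlap ranking proxy are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    goal: str                 # current latent goal defining a thematic segment
    action: str               # action type, e.g. "search", "book", "compare"
    entity_types: frozenset   # salient entity types, e.g. {"hotel", "date"}


@dataclass
class MemoryEntry:
    text: str
    intent: Intent


def intent_compatible(query: Intent, entry: Intent) -> bool:
    # Simplified compatibility rule: an entry is usable only if it belongs to
    # the same goal segment and shares at least one salient entity type.
    return (entry.goal == query.goal
            and bool(entry.entity_types & query.entity_types))


def retrieve(memory, query_intent, query_terms, k=3):
    # 1) Suppress context-incompatible history, however similar its surface text.
    candidates = [m for m in memory if intent_compatible(query_intent, m.intent)]

    # 2) Rank survivors; a real system would use semantic similarity, here we
    #    use a crude lexical-overlap proxy to keep the sketch self-contained.
    def score(m):
        return len(set(m.text.lower().split()) & set(query_terms))

    return sorted(candidates, key=score, reverse=True)[:k]
```

In this sketch, a snippet about hotels from a different trip (a different goal segment) is filtered out even if its text closely matches the query, which is the interference-suppression behavior the abstract describes.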
Related papers
- LifeBench: A Benchmark for Long-Horizon Multi-Source Memory [22.24847456134897]
We introduce LifeBench, which features densely connected, long-horizon event simulation. LifeBench pushes AI agents beyond simple recall, requiring the integration of declarative and non-declarative memory reasoning. Performance results show that top-tier, state-of-the-art memory systems reach just 55.2% accuracy.
arXiv Detail & Related papers (2026-03-04T06:42:17Z) - AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts [78.33143446024485]
We introduce AgentLongBench, which evaluates agents through simulated environment rollouts based on Lateral Thinking Puzzles. This framework generates rigorous interaction trajectories across knowledge-intensive and knowledge-free scenarios.
arXiv Detail & Related papers (2026-01-28T16:05:44Z) - AMA: Adaptive Memory via Multi-Agent Collaboration [54.490349689939166]
We propose Adaptive Memory via Multi-Agent Collaboration (AMA), a novel framework that leverages coordinated agents to manage memory across multiple granularities. AMA significantly outperforms state-of-the-art baselines while reducing token consumption by approximately 80% compared to full-context methods.
arXiv Detail & Related papers (2026-01-28T08:09:49Z) - ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents [25.10969436399974]
ES-Mem is a framework that partitions long-term interactions into semantically coherent events with distinct boundaries. We show that ES-Mem yields consistent performance gains over baseline methods. The proposed event segmentation module exhibits robust applicability on dialogue segmentation datasets.
arXiv Detail & Related papers (2026-01-12T14:33:32Z) - Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning [55.251697395358285]
Large language models (LLMs) are increasingly deployed as intelligent agents that reason, plan, and interact with their environments. To effectively scale to long-horizon scenarios, a key capability for such agents is a memory mechanism that can retain, organize, and retrieve past experiences. We propose CompassMem, an event-centric memory framework inspired by Event Theory.
arXiv Detail & Related papers (2026-01-08T08:44:07Z) - SimpleMem: Efficient Lifelong Memory for LLM Agents [73.74399447715052]
We introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost.
arXiv Detail & Related papers (2026-01-05T21:02:49Z) - AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval [28.858496175399623]
We propose AssoMem, a novel framework constructing an associative memory graph that anchors dialogue utterances to automatically extracted clues. AssoMem consistently outperforms SOTA baselines, verifying its superiority in context-aware memory recall.
arXiv Detail & Related papers (2025-10-12T01:23:23Z) - WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research [73.58638285105971]
This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. We introduce WebWeaver, a novel dual-agent framework that emulates the human research process. Our framework establishes a new state-of-the-art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym.
arXiv Detail & Related papers (2025-09-16T17:57:21Z) - Semantic Anchoring in Agentic Memory: Leveraging Linguistic Structures for Persistent Conversational Context [0.0]
We propose a hybrid agentic memory architecture that enriches vector-based storage with explicit linguistic cues to improve recall of nuanced, context-rich exchanges. Experiments on adapted long-term dialogue datasets show that semantic anchoring improves factual recall and discourse coherence by up to 18% over strong RAG baselines.
arXiv Detail & Related papers (2025-08-18T05:14:48Z) - ELITE: Embedding-Less retrieval with Iterative Text Exploration [5.8851517822935335]
Large Language Models (LLMs) have achieved impressive progress in natural language processing. Their limited ability to retain long-term context constrains performance on document-level or multi-turn tasks.
arXiv Detail & Related papers (2025-05-17T08:48:43Z) - From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning [71.41062111470414]
Current object detectors excel at entity localization and classification, yet exhibit inherent limitations in event recognition capabilities. We present a novel framework that expands the capability of standard object detectors beyond mere object recognition to complex event understanding. Our key innovation lies in bridging the semantic gap between object detection and event understanding without requiring expensive task-specific training.
arXiv Detail & Related papers (2025-02-09T10:30:54Z) - The Devil is in the Spurious Correlations: Boosting Moment Retrieval with Dynamic Learning [49.40254251698784]
We propose a dynamic learning approach for moment retrieval, where two strategies are designed to mitigate the spurious correlation. First, we introduce a novel video synthesis approach to construct a dynamic context for the queried moment. Second, to alleviate the over-association with backgrounds, we enhance representations temporally by incorporating text-dynamics interaction.
arXiv Detail & Related papers (2025-01-13T13:13:06Z) - Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency [87.16283281290053]
Previous entity disambiguation (ED) methods adopt a discriminative paradigm, where prediction is made based on matching scores between mention context and candidate entities.
We propose CoherentED, an ED system equipped with novel designs aimed at enhancing the coherence of entity predictions.
We achieve new state-of-the-art results on popular ED benchmarks, with an average improvement of 1.3 F1 points.
arXiv Detail & Related papers (2023-11-06T16:40:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site makes no guarantee about the quality or accuracy of this information and is not responsible for any consequences of its use.