Related papers: Explicit v.s. Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information

URL: http://arxiv.org/abs/2508.13250v1
Date: Mon, 18 Aug 2025 13:34:37 GMT
Title: Explicit v.s. Implicit Memory: Exploring Multi-hop Complex Reasoning Over Personalized Information
Authors: Zeyu Zhang, Yang Zhang, Haoran Tan, Rui Li, Xu Chen
Abstract summary: In large language model-based agents, memory serves as a critical capability for achieving personalization by storing and utilizing users' information. We propose the multi-hop personalized reasoning task to explore how different memory mechanisms perform in multi-hop reasoning over personalized information.
Score: 13.292751023556221
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In large language model-based agents, memory serves as a critical capability for achieving personalization by storing and utilizing users' information. Although some previous studies have adopted memory to implement user personalization, they typically focus on preference alignment and simple question-answering. However, in the real world, complex tasks often require multi-hop reasoning on a large amount of user information, which poses significant challenges for current memory approaches. To address this limitation, we propose the multi-hop personalized reasoning task to explore how different memory mechanisms perform in multi-hop reasoning over personalized information. We explicitly define this task and construct a dataset along with a unified evaluation framework. Then, we implement various explicit and implicit memory methods and conduct comprehensive experiments. We evaluate their performance on this task from multiple perspectives and analyze their strengths and weaknesses. Besides, we explore hybrid approaches that combine both paradigms and propose the HybridMem method to address their limitations. We demonstrate the effectiveness of our proposed model through extensive experiments. To benefit the research community, we release this project at https://github.com/nuster1128/MPR.

Related papers

RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies [54.23445842621374]
Memory is critical for long-horizon and history-dependent robotic manipulation. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms. We introduce RoboMME: a large-scale standardized benchmark for evaluating and advancing VLA models.
arXiv Detail & Related papers (2026-03-04T21:59:32Z)
OP-Bench: Benchmarking Over-Personalization for Memory-Augmented Personalized Conversational Agents [55.27061195244624]
We formalize over-personalization into three types: Irrelevance, Repetition, and Sycophancy. Agents tend to retrieve and over-attend to user memories even when unnecessary. Our work takes an initial step toward more controllable and appropriate personalization in memory-augmented dialogue systems.
arXiv Detail & Related papers (2026-01-20T08:27:13Z)
Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey [211.01908189012184]
Memory, with hundreds of papers released this year, emerges as the critical solution to fill the utility gap. We provide a unified view of foundation agent memory along three dimensions. We then analyze how memory is instantiated and operated under different agent topologies.
arXiv Detail & Related papers (2026-01-14T07:38:38Z)
Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration [52.35887679314727]
Long-term Memory Embodied Exploration aims to unify the agent's exploratory cognition and decision-making behaviors. To enhance the agent's memory recall and proactive exploration capabilities, we propose MemoryExplorer.
arXiv Detail & Related papers (2026-01-11T16:23:22Z)
EvolMem: A Cognitive-Driven Benchmark for Multi-Session Dialogue Memory [63.84216832544323]
EvolMem is a new benchmark for assessing multi-session memory capabilities of large language models (LLMs) and agent systems. To construct the benchmark, we introduce a hybrid data synthesis framework that consists of topic-initiated generation and narrative-inspired transformations. Extensive evaluation reveals that no LLM consistently outperforms others across all memory dimensions.
arXiv Detail & Related papers (2026-01-07T03:14:42Z)
Evaluating Long-Term Memory for Long-Context Question Answering [100.1267054069757]
We present a systematic evaluation of memory-augmented methods using LoCoMo, a benchmark of synthetic long-context dialogues annotated for question-answering tasks. Our findings show that memory-augmented approaches reduce token usage by over 90% while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-27T18:03:50Z)
PRIME: Large Language Model Personalization with Cognitive Memory and Thought Processes [6.631626634132574]
Large language model (LLM) personalization aims to align model outputs with individuals' unique preferences and opinions. We introduce a unified framework, PRIME, using episodic and semantic memory mechanisms. Experiments validate PRIME's effectiveness across both long- and short-context scenarios.
arXiv Detail & Related papers (2025-07-07T01:54:34Z)
FindingDory: A Benchmark to Evaluate Memory in Embodied Agents [49.89792845476579]
We introduce a new benchmark for long-range embodied tasks in the Habitat simulator. This benchmark evaluates memory-based capabilities across 60 tasks requiring sustained engagement and contextual awareness.
arXiv Detail & Related papers (2025-06-18T17:06:28Z)
Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance [18.820008753896623]
Embodied agents empowered by large language models (LLMs) have shown strong performance in household object rearrangement tasks. Yet, the effectiveness of embodied agents in utilizing memory for personalized assistance remains largely underexplored. We present MEMENTO, a personalized embodied agent evaluation framework designed to assess memory utilization capabilities.
arXiv Detail & Related papers (2025-05-22T08:00:10Z)
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions [55.19217798774033]
Memory is a fundamental component of AI systems, underpinning large language model (LLM)-based agents. In this survey, we first categorize memory representations into parametric and contextual forms. We then introduce six fundamental memory operations: Consolidation, Updating, Indexing, Forgetting, Retrieval, and Compression.
arXiv Detail & Related papers (2025-05-01T17:31:33Z)
On the Structural Memory of LLM Agents [20.529239764968654]
Memory plays a pivotal role in enabling large language model (LLM)-based agents to engage in complex and long-term interactions. This paper investigates how memory structures and memory retrieval methods affect the performance of LLM-based agents.
arXiv Detail & Related papers (2024-12-17T04:30:00Z)
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation [39.69790911626182]
The incorporation of memory into agents is essential for numerous tasks within the domain of Reinforcement Learning (RL). The term "memory" encompasses a wide range of concepts, which, coupled with the lack of a unified methodology for validating an agent's memory, leads to erroneous judgments about agents' memory capabilities. This paper aims to streamline the concept of memory in RL by providing practical, precise definitions of agent memory types.
arXiv Detail & Related papers (2024-12-09T14:34:31Z)
Personalized Multimodal Large Language Models: A Survey [127.9521218125761]
Multimodal Large Language Models (MLLMs) have become increasingly important due to their state-of-the-art performance and ability to integrate multiple data modalities. This paper presents a comprehensive survey on personalized multimodal large language models, focusing on their architecture, training methods, and applications.
arXiv Detail & Related papers (2024-12-03T03:59:03Z)
Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning. The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences. We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
Sequential Recommender via Time-aware Attentive Memory Network [67.26862011527986]
We propose a temporal gating methodology to improve the attention mechanism and recurrent units. We also propose a Multi-hop Time-aware Attentive Memory network to integrate long-term and short-term preferences. Our approach is scalable for candidate retrieval tasks and can be viewed as a non-linear generalization of latent factorization for dot-product based Top-K recommendation.
arXiv Detail & Related papers (2020-05-18T11:29:38Z)
PeTra: A Sparsely Supervised Memory Model for People Tracking [50.98911178059019]
We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots. We empirically compare key modeling choices, finding that we can simplify several aspects of the design of the memory module while retaining strong performance. PeTra is highly effective in both evaluations, demonstrating its ability to track people in its memory despite being trained with limited annotation.
arXiv Detail & Related papers (2020-05-06T17:45:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.