Meta-Memory: Retrieving and Integrating Semantic-Spatial Memories for Robot Spatial Reasoning
- URL: http://arxiv.org/abs/2509.20754v1
- Date: Thu, 25 Sep 2025 05:22:52 GMT
- Title: Meta-Memory: Retrieving and Integrating Semantic-Spatial Memories for Robot Spatial Reasoning
- Authors: Yufan Mao, Hanjing Ye, Wenlong Dong, Chengjie Zhang, Hong Zhang
- Abstract summary: We propose Meta-Memory, a large language model (LLM)-driven agent that constructs a high-density memory representation of the environment. The key innovation of Meta-Memory lies in its capacity to retrieve and integrate relevant memories through joint reasoning over semantic and spatial modalities. Experimental results show that Meta-Memory significantly outperforms state-of-the-art methods on both the SpaceLocQA and the public NaVQA benchmarks.
- Score: 5.740131013400576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Navigating complex environments requires robots to effectively store observations as memories and leverage them to answer human queries about spatial locations, which is a critical yet underexplored research challenge. While prior work has made progress in constructing robotic memory, few have addressed the principled mechanisms needed for efficient memory retrieval and integration. To bridge this gap, we propose Meta-Memory, a large language model (LLM)-driven agent that constructs a high-density memory representation of the environment. The key innovation of Meta-Memory lies in its capacity to retrieve and integrate relevant memories through joint reasoning over semantic and spatial modalities in response to natural language location queries, thereby empowering robots with robust and accurate spatial reasoning capabilities. To evaluate its performance, we introduce SpaceLocQA, a large-scale dataset encompassing diverse real-world spatial question-answering scenarios. Experimental results show that Meta-Memory significantly outperforms state-of-the-art methods on both the SpaceLocQA and the public NaVQA benchmarks. Furthermore, we successfully deployed Meta-Memory on real-world robotic platforms, demonstrating its practical utility in complex environments. Project page: https://itsbaymax.github.io/meta-memory.github.io/ .
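The abstract describes retrieval that jointly reasons over semantic and spatial modalities. As a rough intuition only, the sketch below ranks stored memories by a fixed weighted mix of embedding similarity and spatial proximity; the paper itself performs this integration through LLM reasoning rather than a hand-set formula, and every name here (`MemoryEntry`, `retrieve`, `alpha`) is a hypothetical illustration, not the authors' API.

```python
from dataclasses import dataclass
import math

@dataclass
class MemoryEntry:
    description: str   # language description of an observation
    position: tuple    # (x, y) where it was observed
    embedding: list    # semantic embedding of the description

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(memories, query_emb, query_pos=None, k=2, alpha=0.7):
    """Rank memories by a weighted mix of semantic similarity and
    spatial proximity (toy scoring; the actual system integrates the
    two modalities via LLM reasoning, not a fixed formula)."""
    def score(m):
        sem = cosine(m.embedding, query_emb)
        if query_pos is None:
            return sem
        dist = math.dist(m.position, query_pos)
        spatial = 1.0 / (1.0 + dist)   # closer observations score higher
        return alpha * sem + (1 - alpha) * spatial
    return sorted(memories, key=score, reverse=True)[:k]
```

With `query_pos` omitted, the ranking degenerates to pure semantic search, which makes the spatial term's contribution easy to isolate when experimenting.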
Related papers
- RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies [54.23445842621374]
Memory is critical for long-horizon and history-dependent robotic manipulation. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms. We introduce RoboMME: a large-scale standardized benchmark for evaluating and advancing VLA models.
arXiv Detail & Related papers (2026-03-04T21:59:32Z)
- RMBench: Memory-Dependent Robotic Manipulation Benchmark with Insights into Policy Design [77.30163153176954]
RMBench is a simulation benchmark comprising 9 manipulation tasks that span multiple levels of memory complexity. Mem-0 is a modular manipulation policy with explicit memory components designed to support controlled ablation studies. We identify memory-related limitations in existing policies and provide empirical insights into how architectural design choices influence memory performance.
arXiv Detail & Related papers (2026-03-01T18:59:59Z)
- MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation [56.30931340537373]
MolmoSpaces is a fully open ecosystem to support benchmarking of robot policies. MolmoSpaces consists of over 230k diverse indoor environments. MolmoSpaces-Bench is a benchmark suite of 8 tasks in which robots interact with our diverse scenes and richly annotated objects.
arXiv Detail & Related papers (2026-02-11T20:16:31Z)
- Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey [211.01908189012184]
Memory, with hundreds of papers released this year, emerges as the critical solution to fill the utility gap. We provide a unified view of foundation agent memory along three dimensions. We then analyze how memory is instantiated and operated under different agent topologies.
arXiv Detail & Related papers (2026-01-14T07:38:38Z)
- The AI Hippocampus: How Far are We From Human Memory? [77.04745635827278]
Implicit memory refers to the knowledge embedded within the internal parameters of pre-trained transformers. Explicit memory involves external storage and retrieval components designed to augment model outputs with dynamic, queryable knowledge representations. Agentic memory introduces persistent, temporally extended memory structures within autonomous agents.
arXiv Detail & Related papers (2026-01-14T03:24:08Z)
- RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction [21.670389104174536]
We introduce **RealMem**, the first benchmark grounded in realistic project scenarios. RealMem comprises over 2,000 cross-session dialogues across eleven scenarios, utilizing natural user queries for evaluation. We propose a pipeline that integrates Project Foundation Construction, Multi-Agent Dialogue Generation, and Memory Synthesis and Schedule Management to simulate the dynamic evolution of memory.
arXiv Detail & Related papers (2026-01-11T15:49:36Z)
- RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems [41.89907261427986]
Embodied agents face persistent challenges in real-world environments, including partial observability, limited spatial reasoning, and high-latency multi-memory integration. We present RoboMemory, a brain-inspired framework that unifies Spatial, Temporal, Episodic, and Semantic memory under a parallelized architecture for efficient long-horizon planning and interactive environmental learning. Experiments on EmbodiedBench show that RoboMemory, built on Qwen2.5-VL-72B-Ins, improves average success rates by 25% over its baseline and exceeds the closed-source state-of-the-art (SOTA) Gemini-1.5-
arXiv Detail & Related papers (2025-08-02T15:39:42Z)
- Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions [55.19217798774033]
Memory is a fundamental component of AI systems, underpinning agents based on large language models (LLMs). In this survey, we first categorize memory representations into parametric and contextual forms. We then introduce six fundamental memory operations: Consolidation, Updating, Indexing, Forgetting, Retrieval, and Compression.
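The six operations named in the survey can be made concrete with a toy keyword-indexed contextual store. The class and method names below are illustrative assumptions for this sketch, not an API from the survey or any cited system.

```python
class AgentMemory:
    """Toy contextual memory store sketching six memory operations:
    consolidation, updating, indexing, forgetting, retrieval, compression."""

    def __init__(self):
        self.entries = {}   # memory id -> stored text
        self.keywords = {}  # keyword -> set of memory ids
        self._next = 0

    def consolidate(self, text):
        """Consolidation: commit a new observation to memory."""
        mid = self._next
        self._next += 1
        self.entries[mid] = text
        self.index(mid)
        return mid

    def update(self, mid, text):
        """Updating: revise an existing memory in place."""
        self._unindex(mid)
        self.entries[mid] = text
        self.index(mid)

    def index(self, mid):
        """Indexing: map each word of an entry to its id."""
        for word in self.entries[mid].lower().split():
            self.keywords.setdefault(word, set()).add(mid)

    def _unindex(self, mid):
        for ids in self.keywords.values():
            ids.discard(mid)

    def forget(self, mid):
        """Forgetting: drop an entry and its index terms."""
        self._unindex(mid)
        del self.entries[mid]

    def retrieve(self, word):
        """Retrieval: return all entries mentioning the word."""
        ids = sorted(self.keywords.get(word.lower(), set()))
        return [self.entries[i] for i in ids]

    def compress(self, mids, summary):
        """Compression: replace several entries with one summary."""
        for mid in mids:
            self.forget(mid)
        return self.consolidate(summary)
```

Real systems replace the keyword index with embedding search and drive compression with a summarizer model, but the operation boundaries stay the same.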
arXiv Detail & Related papers (2025-05-01T17:31:33Z)
- Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning [41.94295877935867]
Memory is crucial for enabling agents to tackle complex tasks with temporal and spatial dependencies. Many reinforcement learning algorithms incorporate memory, but the field lacks a universal benchmark to assess an agent's memory capabilities. We introduce MIKASA, a comprehensive benchmark for memory RL, with three key contributions.
arXiv Detail & Related papers (2025-02-14T20:46:19Z)
- Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation [69.01029651113386]
Embodied-RAG is a framework that enhances the model of an embodied agent with a non-parametric memory system. At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail. We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 250 explanation and navigation queries.
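The semantic-forest idea, with coarse summaries near the root and detailed descriptions at the leaves, can be sketched as a greedy coarse-to-fine descent. This is a toy illustration with hypothetical names (`ForestNode`, `descend`) and naive word-overlap scoring; the actual system's traversal is more sophisticated.

```python
from dataclasses import dataclass, field

@dataclass
class ForestNode:
    """Node in a toy 'semantic forest': coarse summaries at the root,
    finer language descriptions toward the leaves."""
    description: str
    children: list = field(default_factory=list)

def descend(node, query_words):
    """Greedy coarse-to-fine retrieval: at each level, follow the child
    whose description shares the most words with the query; stop when
    no child matches or a leaf is reached."""
    qs = set(w.lower() for w in query_words)
    while node.children:
        best = max(node.children,
                   key=lambda c: len(qs & set(c.description.lower().split())))
        if not qs & set(best.description.lower().split()):
            break  # no child mentions the query; answer at current detail level
        node = best
    return node.description
```

Because each step only examines one node's children, the cost grows with the depth of the forest rather than the total number of stored descriptions.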
arXiv Detail & Related papers (2024-09-26T21:44:11Z)
- End-to-End Egospheric Spatial Memory [32.42361470456194]
We propose a parameter-free module, Egospheric Spatial Memory (ESM), which encodes the memory in an ego-sphere around the agent.
ESM can be trained end-to-end via either imitation or reinforcement learning.
We show applications to semantic segmentation on the ScanNet dataset, where ESM naturally combines image-level and map-level inference modalities.
arXiv Detail & Related papers (2021-02-15T18:59:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.