Dynamic Objects Relocalization in Changing Environments with Flow Matching
- URL: http://arxiv.org/abs/2509.16398v1
- Date: Fri, 19 Sep 2025 20:21:16 GMT
- Title: Dynamic Objects Relocalization in Changing Environments with Flow Matching
- Authors: Francesco Argenziano, Miguel Saavedra-Ruiz, Sacha Morin, Daniele Nardi, Liam Paull
- Abstract summary: FlowMaps is a model based on Flow Matching that is able to infer multimodal object locations over space and time. Our results present statistical evidence to support our hypotheses, opening the way to more complex applications of our approach.
- Score: 9.997314076120956
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Task and motion planning are long-standing challenges in robotics, especially when robots have to deal with dynamic environments exhibiting long-term dynamics, such as households or warehouses. In these environments, long-term dynamics mostly stem from human activities, since previously detected objects can be moved or removed from the scene. This adds the necessity to find such objects again before completing the designed task, increasing the risk of failure due to missed relocalizations. However, in these settings, the nature of such human-object interactions is often overlooked, despite being governed by common habits and repetitive patterns. Our conjecture is that these cues can be exploited to recover the most likely objects' positions in the scene, helping to address the problem of unknown relocalization in changing environments. To this end, we propose FlowMaps, a model based on Flow Matching that is able to infer multimodal object locations over space and time. Our results present statistical evidence to support our hypotheses, opening the way to more complex applications of our approach. The code is publicly available at https://github.com/Fra-Tsuna/flowmaps
Related papers
- To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation [14.745622942938532]
In real-world scenarios, such as home environments and warehouses, clutter can block all routes. We introduce the Lifelong Interactive Navigation problem, where a mobile robot can move clutter to forge its own path. We propose an LLM-driven, constraint-based planning framework with active perception.
arXiv Detail & Related papers (2026-02-23T17:10:00Z) - Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation [67.68165784193556]
Nav-$R^2$ is a framework that explicitly models two types of relationships, target-environment modeling and environment-action planning. Our SA-Mem preserves the most target-relevant and current observation-relevant features from both temporal and semantic perspectives. Nav-$R^2$ achieves state-of-the-art performance in localizing unseen objects through a streamlined and efficient pipeline.
arXiv Detail & Related papers (2025-12-02T04:21:02Z) - Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective [16.541717037293278]
We introduce LIBERO-Mem, a non-Markovian task suite for stress-testing robotic manipulation under object-level partial observability. It combines short- and long-horizon object tracking with temporally sequenced subgoals, requiring reasoning beyond the current frame. We propose Embodied-SlotSSM, a slot-centric VLA framework built for temporal scalability.
arXiv Detail & Related papers (2025-11-14T16:56:01Z) - LookOut: Real-World Humanoid Egocentric Navigation [61.14016011125957]
We introduce the challenging problem of predicting a sequence of future 6D head poses from an egocentric video. To solve this task, we propose a framework that reasons over temporally aggregated 3D latent features. Motivated by the lack of training data in this space, we present a dataset collected through this approach.
arXiv Detail & Related papers (2025-08-20T06:43:36Z) - BYE: Build Your Encoder with One Sequence of Exploration Data for Long-Term Dynamic Scene Understanding [18.991160292960277]
BYE is a class-agnostic, per-scene point cloud encoder that removes the need for predefined categories, shape priors, or extensive association datasets. We propose an ensembling scheme combining the semantic strengths of Vision Language Models with the scene-specific expertise of BYE, achieving a 7% improvement and a 95% success rate in object association tasks.
arXiv Detail & Related papers (2024-12-03T13:34:42Z) - DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state-of-the-art on Argoverse 2 Sensor and Open dataset.
arXiv Detail & Related papers (2024-06-06T18:12:04Z) - Outlier-Robust Long-Term Robotic Mapping Leveraging Ground Segmentation [1.7948767405202701]
I propose a robust long-term robotic mapping system that can work out of the box.
I propose (i) fast and robust ground segmentation to reject the presence of outliers.
I propose (ii) robust registration with ground segmentation that encompasses the presence of gross outliers.
arXiv Detail & Related papers (2024-05-18T04:56:15Z) - Right Place, Right Time! Dynamizing Topological Graphs for Embodied Navigation [55.581423861790945]
Embodied Navigation tasks often involve constructing topological graphs of a scene during exploration. We introduce structured object transitions to dynamize static topological graphs called Object Transition Graphs (OTGs). OTGs simulate portable targets following structured routes inspired by human habits.
arXiv Detail & Related papers (2024-03-14T22:33:22Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z) - STOW: Discrete-Frame Segmentation and Tracking of Unseen Objects for Warehouse Picking Robots [41.017649190833076]
We propose a novel paradigm for joint segmentation and tracking in discrete frames along with a transformer module.
The experiments we conduct show that our approach significantly outperforms recent methods.
arXiv Detail & Related papers (2023-11-04T06:52:38Z) - Robot Active Neural Sensing and Planning in Unknown Cluttered Environments [0.0]
Active sensing and planning in unknown, cluttered environments is an open challenge for robots intending to provide home service, search and rescue, narrow-passage inspection, and medical assistance.
We present the active neural sensing approach that generates the kinematically feasible viewpoint sequences for the robot manipulator with an in-hand camera to gather the minimum number of observations needed to reconstruct the underlying environment.
Our framework actively collects the visual RGBD observations, aggregates them into scene representation, and performs object shape inference to avoid unnecessary robot interactions with the environment.
arXiv Detail & Related papers (2022-08-23T16:56:54Z) - Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z) - TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction [57.1209039399599]
We propose a map representation that allows maintaining a single volume for the entire scene and all the objects therein.
In a multiple dynamic object tracking and reconstruction scenario, our representation allows maintaining accurate reconstruction of surfaces even while they become temporarily occluded by other objects moving in their proximity.
We evaluate the proposed TSDF++ formulation on a public synthetic dataset and demonstrate its ability to preserve reconstructions of occluded surfaces when compared to the standard TSDF map representation.
arXiv Detail & Related papers (2021-05-16T16:15:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.