Related papers: Right Place, Right Time! Dynamizing Topological Graphs for Embodied Navigation

URL: http://arxiv.org/abs/2403.09905v3
Date: Mon, 10 Mar 2025 22:26:37 GMT
Title: Right Place, Right Time! Dynamizing Topological Graphs for Embodied Navigation
Authors: Vishnu Sashank Dorbala, Bhrij Patel, Amrit Singh Bedi, Dinesh Manocha
Abstract summary: Embodied Navigation tasks often involve constructing topological graphs of a scene during exploration. We introduce structured object transitions to dynamize static topological graphs called Object Transition Graphs (OTGs). OTGs simulate portable targets following structured routes inspired by human habits.
Score: 55.581423861790945
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Embodied Navigation tasks often involve constructing topological graphs of a scene during exploration to facilitate high-level planning and decision-making for execution in continuous environments. Prior literature makes the assumption of static graphs with stationary targets, which does not hold in many real-world environments with moving objects. To address this, we present a novel formulation generalizing navigation to dynamic environments by introducing structured object transitions to dynamize static topological graphs called Object Transition Graphs (OTGs). OTGs simulate portable targets following structured routes inspired by human habits. We apply this technique to Matterport3D (MP3D), a popular simulator for evaluating embodied tasks. On these dynamized OTGs, we establish a navigation benchmark by evaluating Oracle-based, Reinforcement Learning, and Large Language Model (LLM)-based approaches on a multi-object finding task. Further, we quantify agent adaptability, and make key inferences such as agents employing learned decision-making strategies generalize better than those relying on privileged oracle knowledge. To the best of our knowledge, ours is the first work to introduce structured temporal dynamism on topological graphs for studying generalist embodied navigation policies. The code and dataset for our OTGs will be made publicly available to foster research on embodied navigation in dynamic scenes.
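As a rough illustration of the idea in the abstract, the sketch below places a portable target on a static topological graph and moves it along a fixed, repeating route over time. It is not the authors' implementation: the room names, the `ROUTES` table, the `dwell` parameter, and the oracle-style `chase` agent are all hypothetical, and the target is allowed to jump between non-adjacent nodes for simplicity.

```python
from collections import deque

# Hypothetical static topological graph: node -> neighbouring nodes.
SCENE_GRAPH = {
    "kitchen": ["hallway"],
    "hallway": ["kitchen", "living_room", "bedroom"],
    "living_room": ["hallway"],
    "bedroom": ["hallway"],
}

# Hypothetical structured route for one portable object (repeats cyclically).
ROUTES = {"mug": ["kitchen", "living_room", "bedroom"]}


def object_node(name, t, dwell=3):
    """Node occupied by the object at time step t; it stays `dwell` steps per stop."""
    route = ROUTES[name]
    return route[(t // dwell) % len(route)]


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list from start to goal, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def chase(graph, name, start, max_steps=20, dwell=3):
    """Oracle-style agent: re-plans every step toward the object's current node.

    Returns (time step, node) when the agent and object coincide, else None.
    """
    node = start
    for t in range(max_steps):
        target = object_node(name, t, dwell)
        if node == target:
            return t, node
        # Take one edge along the shortest path to the current target.
        node = shortest_path(graph, node, target)[1]
    return None
```

Calling `chase(SCENE_GRAPH, "mug", "bedroom")` re-plans at every step because the target node depends on `t`; a static-graph planner would compute the path once and could arrive after the object has moved on.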

Related papers

SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models [10.671262416557704]
Vision Foundation Models (VFMs) offer powerful capabilities for visual understanding and reasoning. We present a zero-shot object goal navigation framework that integrates the perceptual strength of VFMs with a model-based planner. We evaluate our approach on the HM3D dataset using the Habitat simulator and demonstrate that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-06-04T03:04:54Z)
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments [44.6372390798904]
We propose a new task denominated Personalized Instance-based Navigation (PIN), in which an embodied agent is tasked with locating and reaching a specific personal object. In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions.
arXiv Detail & Related papers (2024-10-23T18:01:09Z)
OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph [10.475404599532157]
This paper captures the relationships between frequently used objects and their static carriers. We propose an instance navigation strategy that models the navigation process as a Markov Decision Process. The results demonstrate that by updating the CRSG, the robot can efficiently navigate to moved targets.
arXiv Detail & Related papers (2024-09-27T13:33:52Z)
Aligning Knowledge Graph with Visual Perception for Object-goal Navigation [16.32780793344835]
We propose the Aligning Knowledge Graph with Visual Perception (AKGVP) method for object-goal navigation. Our approach introduces continuous modeling of the hierarchical scene architecture and leverages visual-language pre-training to align natural language description with visual perception. The integration of a continuous knowledge graph architecture and multimodal feature alignment empowers the navigator with a remarkable zero-shot navigation capability.
arXiv Detail & Related papers (2024-02-29T06:31:18Z)
NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration. We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments. Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation. Our method significantly outperforms the state of the art on the challenging MP3D dataset. We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
Task-Driven Graph Attention for Hierarchical Relational Object Navigation [25.571175038938527]
Embodied AI agents in large scenes often need to navigate to find objects. We study a naturally emerging variant of the object navigation task, hierarchical object navigation (HRON). We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone.
arXiv Detail & Related papers (2023-06-23T19:50:48Z)
How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI. Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework. Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation [58.3480730643517]
We present LGX, a novel algorithm for Language-Driven Zero-Shot Object Goal Navigation (L-ZSON). Our approach makes use of Large Language Models (LLMs) for this task. We achieve state-of-the-art zero-shot object navigation results on RoboTHOR with a success rate (SR) improvement of over 27% over the current baseline.
arXiv Detail & Related papers (2023-03-06T20:19:19Z)
A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations [20.15854546504947]
We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static objects but also movable objects. Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty. We evaluate our algorithms in two simulated environments and a real-world setting, to demonstrate high sample efficiency and reliability.
arXiv Detail & Related papers (2022-11-29T15:48:54Z)
Object Memory Transformer for Object Goal Navigation [10.359616364592075]
This paper presents a reinforcement learning method for object goal navigation (Nav). An agent navigates in 3D indoor environments to reach a target object based on long-term observations of objects and scenes. To the best of our knowledge, this is the first work that uses a long-term memory of object semantics in a goal-oriented navigation task.
arXiv Detail & Related papers (2022-03-24T09:16:56Z)
Object Manipulation via Visual Target Localization [64.05939029132394]
Training agents to manipulate objects poses many challenges. We propose an approach that explores the environment in search of target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible. Our evaluations show a massive 3x improvement in success rate over a model that has access to the same sensory suite.
arXiv Detail & Related papers (2022-03-15T17:59:01Z)
Navigating to Objects in Unseen Environments by Distance Prediction [16.023495311387478]
We propose an object goal navigation framework, which could directly perform path planning based on an estimated distance map. Specifically, our model takes a birds-eye-view semantic map as input, and estimates the distance from the map cells to the target object. With the estimated distance map, the agent could explore the environment and navigate to the target objects based on either human-designed or learned navigation policy.
arXiv Detail & Related papers (2022-02-08T09:22:50Z)
SOON: Scenario Oriented Object Navigation with Graph-based Exploration [102.74649829684617]
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots. Most visual navigation benchmarks focus on navigating toward a target from a fixed starting point, guided by an elaborate set of step-by-step instructions. This approach deviates from real-world problems, in which a human only describes what the object and its surroundings look like and asks the robot to start navigation from anywhere.
arXiv Detail & Related papers (2021-03-31T15:01:04Z)
POMP: Pomcp-based Online Motion Planning for active visual search in indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup. Our POMP method uses as input the current pose of an agent and an RGB-D frame. We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z)
Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation, which involves navigating to an instance of the given object category in unseen environments. We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects [119.46959413000594]
This document summarizes the consensus recommendations of a working group on ObjectNav. We make recommendations on subtle but important details of evaluation criteria. We provide a detailed description of the instantiation of these recommendations in challenges organized at the Embodied AI workshop at CVPR 2020.
arXiv Detail & Related papers (2020-06-23T17:18:54Z)