Instance-Specific Image Goal Navigation: Training Embodied Agents to
Find Object Instances
- URL: http://arxiv.org/abs/2211.15876v1
- Date: Tue, 29 Nov 2022 02:29:35 GMT
- Title: Instance-Specific Image Goal Navigation: Training Embodied Agents to
Find Object Instances
- Authors: Jacob Krantz, Stefan Lee, Jitendra Malik, Dhruv Batra, Devendra Singh
Chaplot
- Abstract summary: We consider the problem of embodied visual navigation given an image-goal (ImageNav).
Unlike related navigation tasks, ImageNav does not have a standardized task definition, which makes comparison across methods difficult.
We present the Instance-specific ImageNav task (InstanceImageNav) to address these limitations.
- Score: 90.61897965658183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of embodied visual navigation given an image-goal
(ImageNav) where an agent is initialized in an unfamiliar environment and
tasked with navigating to a location 'described' by an image. Unlike related
navigation tasks, ImageNav does not have a standardized task definition which
makes comparison across methods difficult. Further, existing formulations have
two problematic properties: (1) image-goals are sampled from random locations
which can lead to ambiguity (e.g., looking at walls), and (2) image-goals match
the camera specification and embodiment of the agent; this rigidity is limiting
when considering user-driven downstream applications. We present the
Instance-specific ImageNav task (InstanceImageNav) to address these
limitations. Specifically, the goal image is 'focused' on some particular
object instance in the scene and is taken with camera parameters independent of
the agent. We instantiate InstanceImageNav in the Habitat Simulator using
scenes from the Habitat-Matterport3D dataset (HM3D) and release a standardized
benchmark to measure community progress.
Related papers
- Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments [44.6372390798904]
We propose a new task denominated Personalized Instance-based Navigation (PIN), in which an embodied agent is tasked with locating and reaching a specific personal object.
In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions.
arXiv Detail & Related papers (2024-10-23T18:01:09Z)
- Prioritized Semantic Learning for Zero-shot Instance Navigation [2.537056548731396]
We study zero-shot instance navigation, in which the agent navigates to a specific object without using object annotations for training.
We propose a Prioritized Semantic Learning (PSL) method to improve the semantic understanding ability of navigation agents.
Our PSL agent outperforms the previous state-of-the-art by 66% on zero-shot ObjectNav in terms of success rate and is also superior on the new InstanceNav task.
arXiv Detail & Related papers (2024-03-18T10:45:50Z)
- GaussNav: Gaussian Splatting for Visual Navigation [92.13664084464514]
Instance ImageGoal Navigation (IIN) requires an agent to locate a specific object depicted in a goal image within an unexplored environment.
Our framework constructs a novel map representation based on 3D Gaussian Splatting (3DGS).
Our framework demonstrates a significant leap in performance, evidenced by an increase in Success weighted by Path Length (SPL) from 0.252 to 0.578 on the challenging Habitat-Matterport 3D (HM3D) dataset.
arXiv Detail & Related papers (2024-03-18T09:56:48Z)
- Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation [88.84058353659107]
Instance ImageGoal Navigation (IIN) aims to navigate to a specified object depicted by a goal image in an unexplored environment.
We propose a new modular navigation framework named Instance-aware Exploration-Verification-Exploitation (IEVE) for instance-level image goal navigation.
Our method surpasses previous state-of-the-art work, using either a classical segmentation model (0.684 vs. 0.561 success) or a robust model (0.702 vs. 0.561 success).
arXiv Detail & Related papers (2024-02-25T07:59:10Z)
- Navigating to Objects Specified by Images [86.9672766351891]
We present a system that can navigate to objects specified by images in both simulation and the real world.
Our modular method solves sub-tasks of exploration, goal instance re-identification, goal localization, and local navigation.
On the HM3D InstanceImageNav benchmark, this system outperforms a baseline end-to-end RL policy by 7x and a state-of-the-art ImageNav model by 2.3x.
arXiv Detail & Related papers (2023-04-03T17:58:00Z)
- ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to
Objects [119.46959413000594]
This document summarizes the consensus recommendations of a working group on ObjectNav.
We make recommendations on subtle but important details of evaluation criteria.
We provide a detailed description of the instantiation of these recommendations in challenges organized at the Embodied AI workshop at CVPR 2020.
arXiv Detail & Related papers (2020-06-23T17:18:54Z)
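Several entries above report success rate and Success weighted by Path Length (SPL). As a minimal sketch, assuming the commonly used definition of SPL (per-episode success multiplied by the ratio of shortest-path length to the longer of the agent's path and the shortest path, averaged over episodes), the two metrics could be computed as follows. The function and argument names are illustrative and are not taken from any of the listed papers or from the Habitat codebase.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent succeeded."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    agent_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Assumed definition: mean of S_i * l_i / max(p_i, l_i), where S_i is
    the success indicator, l_i the shortest-path length from start to
    goal, and p_i the length of the path the agent actually took.
    """
    if not (len(successes) == len(shortest_lengths) == len(agent_lengths)):
        raise ValueError("all inputs must have the same length")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, agent_lengths):
        denom = max(p, l)
        # Failed episodes contribute 0; the denom check avoids division
        # by zero in the degenerate case where both lengths are 0.
        if s and denom > 0:
            total += l / denom
    return total / len(successes)


if __name__ == "__main__":
    # Three hypothetical episodes: success with a 2x detour, failure,
    # and success along the shortest path.
    s = [True, False, True]
    l = [10.0, 5.0, 4.0]
    p = [20.0, 7.0, 4.0]
    print(success_rate(s), spl(s, l, p))
```

Because SPL scales each success by path efficiency, it can never exceed the success rate, which is why the two figures are often reported together.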