Navigating to Objects Specified by Images
- URL: http://arxiv.org/abs/2304.01192v1
- Date: Mon, 3 Apr 2023 17:58:00 GMT
- Title: Navigating to Objects Specified by Images
- Authors: Jacob Krantz, Theophile Gervet, Karmesh Yadav, Austin Wang, Chris
Paxton, Roozbeh Mottaghi, Dhruv Batra, Jitendra Malik, Stefan Lee, Devendra
Singh Chaplot
- Abstract summary: We present a system that can perform the task in both simulation and the real world.
Our modular method solves sub-tasks of exploration, goal instance re-identification, goal localization, and local navigation.
On the HM3D InstanceImageNav benchmark, this system outperforms a baseline end-to-end RL policy 7x and a state-of-the-art ImageNav model 2.3x.
- Score: 86.9672766351891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Images are a convenient way to specify which particular object instance an
embodied agent should navigate to. Solving this task requires semantic visual
reasoning and exploration of unknown environments. We present a system that can
perform this task in both simulation and the real world. Our modular method
solves sub-tasks of exploration, goal instance re-identification, goal
localization, and local navigation. We re-identify the goal instance in
egocentric vision using feature-matching and localize the goal instance by
projecting matched features to a map. Each sub-task is solved using
off-the-shelf components requiring zero fine-tuning. On the HM3D
InstanceImageNav benchmark, this system outperforms a baseline end-to-end RL
policy 7x and a state-of-the-art ImageNav model 2.3x (56% vs 25% success). We
deploy this system to a mobile robot platform and demonstrate effective
real-world performance, achieving an 88% success rate across a home and an
office environment.
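The abstract describes the pipeline only at a high level. Below is a minimal sketch of the goal re-identification and localization steps, under stated assumptions: classical ORB keypoint matching via OpenCV (the actual system may use a different, learned matcher) and a simplified planar depth projection of matched pixels into a top-down grid. The thresholds, camera intrinsics, and median-vote localization are illustrative placeholders, not the authors' implementation.

```python
# Sketch: re-identify the goal instance by feature matching, then vote for a goal
# cell on a 2D map by projecting matched pixels with depth and the agent pose.
import cv2
import numpy as np

MIN_GOOD_MATCHES = 30  # assumed threshold to declare "goal re-identified"

def reidentify_goal(goal_img, obs_img):
    """Match keypoints between the goal image and the current egocentric view."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_g, des_g = orb.detectAndCompute(goal_img, None)
    kp_o, des_o = orb.detectAndCompute(obs_img, None)
    if des_g is None or des_o is None:
        return []
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_g, des_o), key=lambda m: m.distance)
    good = [m for m in matches if m.distance < 50]  # illustrative distance cutoff
    # Return pixel locations of matched features in the current observation.
    return [kp_o[m.trainIdx].pt for m in good] if len(good) >= MIN_GOOD_MATCHES else []

def localize_on_map(matched_pixels, depth, pose, fx, cx, cell_size=0.05):
    """Project matched pixels into 2D map cells using depth and the agent's planar pose."""
    if not matched_pixels:
        return None
    ax, ay, yaw = pose  # agent position (m) and heading (rad) in the world frame
    xs, ys = [], []
    for (u, v) in matched_pixels:
        z = depth[int(v), int(u)]          # metric depth at the matched pixel (forward)
        x_cam = (u - cx) * z / fx          # lateral offset in the camera frame (right)
        # Rotate the (forward, right) offset into the world frame.
        xs.append(ax + z * np.cos(yaw) + x_cam * np.sin(yaw))
        ys.append(ay + z * np.sin(yaw) - x_cam * np.cos(yaw))
    # Vote for the goal cell as the median of projected points, snapped to the grid.
    return int(round(np.median(xs) / cell_size)), int(round(np.median(ys) / cell_size))
```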
Related papers
- Real-world Instance-specific Image Goal Navigation: Bridging Domain Gaps via Contrastive Learning [5.904490311837063]
The challenge lies in the domain gap between low-quality images observed by the moving robot and high-quality query images provided by the user.
Few-shot Cross-quality Instance-aware Adaptation (CrossIA) employs instance-level contrastive learning to align features between many low-quality and a few high-quality images.
Our method improves the task success rate by up to three times compared to the baseline.
arXiv Detail & Related papers (2024-04-15T10:24:32Z)
- Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation [88.84058353659107]
Instance ImageGoal Navigation (IIN) aims to navigate to a specified object depicted by a goal image in an unexplored environment.
We propose a new modular navigation framework named Instance-aware Exploration-Verification-Exploitation (IEVE) for instance-level image goal navigation.
Our method surpasses previous state-of-the-art work, whether with a classical segmentation model (0.684 vs. 0.561 success) or a robust model (0.702 vs. 0.561 success).
arXiv Detail & Related papers (2024-02-25T07:59:10Z) - Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network [3.0820097046465285]
"Zero-shot" means that the target the agent needs to find is not trained during the training phase.
We propose the Class-Independent Relationship Network (CIRN) to address the issue of coupling navigation ability with target features during training.
Our method outperforms the current state-of-the-art approaches in the zero-shot object goal visual navigation task.
arXiv Detail & Related papers (2023-10-15T16:42:14Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances [90.61897965658183]
We consider the problem of embodied visual navigation given an image-goal (ImageNav).
Unlike related navigation tasks, ImageNav does not have a standardized task definition, which makes comparison across methods difficult.
We present the Instance-specific ImageNav task (InstanceImageNav) to address these limitations.
arXiv Detail & Related papers (2022-11-29T02:29:35Z)
- Navigating to Objects in Unseen Environments by Distance Prediction [16.023495311387478]
We propose an object goal navigation framework that performs path planning directly on an estimated distance map.
Specifically, our model takes a bird's-eye-view semantic map as input and estimates the distance from each map cell to the target object.
With the estimated distance map, the agent can explore the environment and navigate to the target object using either a human-designed or a learned navigation policy (a greedy-descent sketch appears after this list).
arXiv Detail & Related papers (2022-02-08T09:22:50Z)
- Landmark Policy Optimization for Object Navigation Task [77.34726150561087]
This work studies the object goal navigation task, which involves navigating to the closest object related to the given semantic category in unseen environments.
Recent works have shown significant achievements with both end-to-end reinforcement learning approaches and modular systems, but both still need a big step forward to become robust and optimal.
We propose a hierarchical method that incorporates standard task formulation and additional area knowledge as landmarks, with a way to extract these landmarks.
arXiv Detail & Related papers (2021-09-17T12:28:46Z)
- Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
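As referenced above, here is a minimal sketch of the distance-map idea from "Navigating to Objects in Unseen Environments by Distance Prediction": greedily step toward the neighboring grid cell with the smallest predicted distance to the goal. The distance predictor itself is out of scope; the grid, action set, and stopping rule are assumptions for illustration, not that paper's implementation.

```python
# Sketch: greedy descent over a predicted distance-to-goal map.
# In the paper, the distance map comes from a learned model conditioned on a
# bird's-eye-view semantic map; here it is just a NumPy array argument.
import numpy as np

def next_waypoint(dist_map, agent_cell):
    """Pick the neighboring cell with the smallest predicted distance to the goal."""
    rows, cols = dist_map.shape
    r, c = agent_cell
    best_cell, best_dist = agent_cell, dist_map[r, c]
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= nr < rows and 0 <= nc < cols:
                if dist_map[nr, nc] < best_dist:
                    best_cell, best_dist = (nr, nc), dist_map[nr, nc]
    return best_cell  # equals agent_cell at a local minimum (e.g. the goal)

# Usage: step until the predicted distance stops decreasing.
dist_map = np.random.rand(10, 10)  # stand-in for a model's prediction
cell = (0, 0)
while True:
    nxt = next_waypoint(dist_map, cell)
    if nxt == cell:
        break
    cell = nxt
```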