Object Goal Navigation with Recursive Implicit Maps
- URL: http://arxiv.org/abs/2308.05602v1
- Date: Thu, 10 Aug 2023 14:21:33 GMT
- Title: Object Goal Navigation with Recursive Implicit Maps
- Authors: Shizhe Chen, Thomas Chabal, Ivan Laptev and Cordelia Schmid
- Abstract summary: We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
- Score: 92.6347010295396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object goal navigation aims to navigate an agent to locations of a given
object category in unseen environments. Classical methods explicitly build maps
of environments and require extensive engineering while lacking semantic
information for object-oriented exploration. On the other hand, end-to-end
learning methods alleviate manual map design and predict actions using implicit
representations. Such methods, however, lack an explicit notion of geometry and
may have limited ability to encode navigation history. In this work, we propose
an implicit spatial map for object goal navigation. Our implicit map is
recursively updated with new observations at each step using a transformer. To
encourage spatial reasoning, we introduce auxiliary tasks and train our model
to reconstruct explicit maps as well as to predict visual features, semantic
labels and actions. Our method significantly outperforms the state of the art
on the challenging MP3D dataset and generalizes well to the HM3D dataset. We
successfully deploy our model on a real robot and achieve encouraging object
goal navigation results in real scenes using only a few real-world
demonstrations. Code, trained models and videos are available at
https://www.di.ens.fr/willow/research/onav_rim/.
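The recursive update is the heart of the method: a fixed set of latent map tokens carries the navigation history, and a transformer refreshes those tokens from each new observation while auxiliary heads push them to encode explicit geometry and semantics. Below is a minimal PyTorch sketch of this scheme; the class name, token count, layer sizes, and the use of a standard transformer decoder are illustrative assumptions, not the authors' released implementation (see the URL above for that).

```python
import torch
import torch.nn as nn

class RecursiveImplicitMap(nn.Module):
    """Minimal sketch: latent map tokens recursively updated by a
    transformer from per-step observations. All sizes are hypothetical."""

    def __init__(self, num_tokens=64, dim=512, num_layers=2,
                 num_actions=4, num_classes=21):
        super().__init__()
        # Learned initial state of the implicit map (before step 0).
        self.init_map = nn.Parameter(torch.zeros(1, num_tokens, dim))
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8,
                                           batch_first=True)
        self.updater = nn.TransformerDecoder(layer, num_layers=num_layers)
        # Main task: predict the next action from the pooled map.
        self.action_head = nn.Linear(dim, num_actions)
        # Auxiliary heads (losses omitted): reconstruct a patch of the
        # explicit top-down map and predict semantic labels per token.
        self.map_head = nn.Linear(dim, 16)
        self.sem_head = nn.Linear(dim, num_classes)

    def step(self, obs_tokens, map_state=None):
        # obs_tokens: (B, N_obs, dim) visual features of the current view.
        # map_state:  (B, num_tokens, dim) implicit map from the last step.
        if map_state is None:
            map_state = self.init_map.expand(obs_tokens.size(0), -1, -1)
        # Map tokens cross-attend to the new observation (and self-attend
        # to each other), yielding the updated implicit map.
        new_state = self.updater(tgt=map_state, memory=obs_tokens)
        action_logits = self.action_head(new_state.mean(dim=1))
        return action_logits, new_state
```

At test time the agent calls `step` once per timestep and feeds `new_state` back in, so the fixed-size token set accumulates navigation history without maintaining an explicit grid map.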
Related papers
- One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation [2.022249798290507]
We introduce a new benchmark for zero-shot multi-object navigation.
We build a reusable open-vocabulary feature map tailored for real-time object search.
We demonstrate that it outperforms existing state-of-the-art approaches on both single- and multi-object navigation tasks.
arXiv Detail & Related papers (2024-09-18T07:44:08Z)
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new representation of a scene semantic map formed during the embodied agent interaction with the indoor environment.
We have implemented this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
- PEANUT: Predicting and Navigating to Unseen Targets [18.87376347895365]
Efficient ObjectGoal navigation (ObjectNav) in novel environments requires an understanding of the spatial and semantic regularities in environment layouts.
We present a method for learning these regularities by predicting the locations of unobserved objects from incomplete semantic maps.
Our prediction model is lightweight and can be trained in a supervised manner using a relatively small amount of passively collected data (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-12-05T18:58:58Z)
- Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation [87.52136927091712]
We address a practical yet challenging problem of training robot agents to navigate in an environment following a path described by some language instructions.
To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.
We propose a multi-granularity map, which contains both fine-grained object details (e.g., color, texture) and semantic classes, to represent objects more comprehensively.
arXiv Detail & Related papers (2022-10-14T04:23:27Z)
- ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints [94.60414567852536]
Long-range navigation requires both planning and reasoning about local traversability.
We propose an approach that integrates learning and planning.
ViKiNG can leverage its image-based learned controller and goal-directed heuristic to navigate to goals up to 3 kilometers away.
arXiv Detail & Related papers (2022-02-23T02:14:23Z)
- PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning [125.22462763376993]
We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI).
PONI disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'.
arXiv Detail & Related papers (2022-01-25T01:07:32Z)
- Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
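The PEANUT entry above reduces exploration to a supervised prediction problem: guess where unseen targets are likely to be from an incomplete semantic map. A minimal version of such a predictor could look as follows; the channel counts, network depth, and training snippet are hypothetical and unrelated to the released PEANUT code.

```python
import torch
import torch.nn as nn

class UnseenTargetPredictor(nn.Module):
    """Sketch: per-category probability maps over unobserved target
    locations, given a partial top-down semantic map. Hypothetical sizes:
    input channels might be obstacle, explored-area, and per-category
    observation masks."""

    def __init__(self, in_channels=18, num_categories=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_categories, 1),  # one logit map per category
        )

    def forward(self, partial_map):        # (B, C_in, H, W)
        return self.net(partial_map)       # logits: (B, C_cat, H, W)


# Supervised training on passively collected maps: the label is the
# completed semantic map, the loss per-cell binary cross-entropy.
model = UnseenTargetPredictor()
partial = torch.rand(2, 18, 96, 96)
complete = torch.randint(0, 2, (2, 16, 96, 96)).float()
loss = nn.functional.binary_cross_entropy_with_logits(model(partial), complete)
loss.backward()
```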