How To Not Train Your Dragon: Training-free Embodied Object Goal
Navigation with Semantic Frontiers
- URL: http://arxiv.org/abs/2305.16925v1
- Date: Fri, 26 May 2023 13:38:33 GMT
- Title: How To Not Train Your Dragon: Training-free Embodied Object Goal
Navigation with Semantic Frontiers
- Authors: Junting Chen, Guohao Li, Suryansh Kumar, Bernard Ghanem, Fisher Yu
- Abstract summary: We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
- Score: 94.46825166907831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object goal navigation is an important problem in Embodied AI that involves
guiding the agent to navigate to an instance of a given object category in an
unknown environment -- typically an indoor scene. Unfortunately, current
state-of-the-art methods for this problem rely heavily on data-driven
approaches, e.g., end-to-end reinforcement learning, imitation learning, and
others. Moreover, such methods are typically costly to train and difficult to
debug, leading to a lack of transferability and explainability. Inspired by
recent successes in combining classical and learning methods, we present a
modular and training-free solution, which embraces more classic approaches, to
tackle the object goal navigation problem. Our method builds a structured scene
representation based on the classic visual simultaneous localization and
mapping (V-SLAM) framework. We then inject semantics into geometric-based
frontier exploration to reason about promising areas to search for a goal
object. Our structured scene representation comprises a 2D occupancy map,
semantic point cloud, and spatial scene graph.
Our method propagates semantics on the scene graphs based on language priors
and scene statistics to introduce semantic knowledge to the geometric
frontiers. With injected semantic priors, the agent can reason about the most
promising frontier to explore. The proposed pipeline shows strong experimental
performance for object goal navigation on the Gibson benchmark dataset,
outperforming the previous state-of-the-art. We also perform comprehensive
ablation studies to identify the current bottleneck in the object navigation
task.
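To make the frontier-selection idea concrete, here is a minimal sketch, assuming a simple grid encoding and a hand-made co-occurrence prior table (both illustrative stand-ins for the paper's scene graph and language/statistics priors, not the authors' code): geometric frontiers are extracted from the 2D occupancy map, then each frontier is scored by how close it lies to observed objects that tend to co-occur with the goal.

```python
import numpy as np

FREE, OCCUPIED, UNKNOWN = 0, 1, 2  # assumed occupancy-grid cell encoding

def extract_frontiers(grid: np.ndarray) -> list[tuple[int, int]]:
    """Return free cells that border at least one unknown cell (geometric frontiers)."""
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != FREE:
                continue
            window = grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if (window == UNKNOWN).any():
                frontiers.append((r, c))
    return frontiers

def score_frontier(frontier, goal, scene_objects, co_occurrence, sigma=10.0):
    """Score a frontier by priors propagated from nearby observed objects.

    scene_objects: object label -> (row, col) position on the map.
    co_occurrence: (goal, label) -> prior in [0, 1] from language/scene statistics.
    """
    pos = np.asarray(frontier, dtype=float)
    score = 0.0
    for label, obj_pos in scene_objects.items():
        dist = np.linalg.norm(pos - np.asarray(obj_pos, dtype=float))
        prior = co_occurrence.get((goal, label), 0.0)
        score += prior * np.exp(-dist / sigma)  # related objects nearby count more
    return score

# Toy usage: a TV is usually near a sofa, so prefer frontiers close to the sofa.
grid = np.full((20, 20), UNKNOWN)
grid[5:15, 5:15] = FREE
scene_objects = {"sofa": (9, 9)}
priors = {("tv", "sofa"): 0.9}
frontiers = extract_frontiers(grid)
best = max(frontiers, key=lambda f: score_frontier(f, "tv", scene_objects, priors))
```

The agent would then navigate toward the best-scoring frontier with a classical planner and repeat the loop as the map grows.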
Related papers
- Aligning Knowledge Graph with Visual Perception for Object-goal Navigation [16.32780793344835]
We propose the Aligning Knowledge Graph with Visual Perception (AKGVP) method for object-goal navigation.
Our approach introduces continuous modeling of the hierarchical scene architecture and leverages visual-language pre-training to align natural language descriptions with visual perception.
The integration of a continuous knowledge graph architecture and multimodal feature alignment empowers the navigator with a remarkable zero-shot navigation capability.
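The zero-shot matching recipe can be illustrated with a short sketch. This is an assumption about the general vision-language alignment idea, not AKGVP's actual implementation: candidate regions are scored by the cosine similarity of their visual embeddings to the goal description embedding in a shared space (e.g., from a CLIP-style encoder, whose use is assumed and out of scope here).

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_region(goal_text_emb: np.ndarray,
                region_embs: dict[str, np.ndarray]) -> str:
    """Choose the observed region whose visual embedding best matches the goal text."""
    return max(region_embs, key=lambda name: cosine(goal_text_emb, region_embs[name]))

# Toy usage with random stand-in embeddings.
rng = np.random.default_rng(0)
regions = {"kitchen": rng.normal(size=512), "bedroom": rng.normal(size=512)}
target = pick_region(rng.normal(size=512), regions)
```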
arXiv Detail & Related papers (2024-02-29T06:31:18Z)
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new representation of a scene semantic map formed during the embodied agent's interaction with the indoor environment.
We have implemented this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
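As a rough illustration of the "recursive implicit map" idea (a generic recurrent sketch, not the paper's architecture; all sizes are arbitrary assumptions), each encoded observation is folded into a single latent state that stands in for an explicit map:

```python
import torch
import torch.nn as nn

class RecursiveImplicitMap(nn.Module):
    """Fold a sequence of encoded observations into one latent map state."""
    def __init__(self, obs_dim: int = 512, map_dim: int = 256):
        super().__init__()
        self.cell = nn.GRUCell(obs_dim, map_dim)
        self.map_dim = map_dim

    def forward(self, obs_seq: torch.Tensor) -> torch.Tensor:
        # obs_seq: (T, obs_dim) encoded observations from one episode
        state = torch.zeros(1, self.map_dim)
        for obs in obs_seq:
            state = self.cell(obs.unsqueeze(0), state)  # recursive update per step
        return state  # implicit map summarizing everything seen so far

# Usage: aggregate ten observation encodings into one latent map vector.
latent_map = RecursiveImplicitMap()(torch.randn(10, 512))
```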
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- Learning to Map for Active Semantic Goal Navigation [40.193928212509356]
We propose a novel framework that actively learns to generate semantic maps outside the field of view of the agent.
We show how different objectives can be defined by balancing exploration with exploitation.
Our method is validated in the visually realistic environments offered by the Matterport3D dataset.
arXiv Detail & Related papers (2021-06-29T18:01:30Z)
- SOON: Scenario Oriented Object Navigation with Graph-based Exploration [102.74649829684617]
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots.
Most visual navigation benchmarks focus on navigating toward a target from a fixed starting point, guided by an elaborate set of instructions that describes the route step by step.
This setting deviates from real-world problems, in which a human only describes what the object and its surroundings look like and asks the robot to start navigating from anywhere.
arXiv Detail & Related papers (2021-03-31T15:01:04Z)
- Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
- Neural Topological SLAM for Visual Navigation [112.73876869904]
We design topological representations for space that leverage semantics and afford approximate geometric reasoning.
We describe supervised learning-based algorithms that can build, maintain and use such representations under noisy actuation.
arXiv Detail & Related papers (2020-05-25T17:56:29Z)
- Learning to Move with Affordance Maps [57.198806691838364]
The ability to autonomously explore and navigate a physical space is a fundamental requirement for virtually any mobile autonomous agent.
Traditional SLAM-based approaches for exploration and navigation largely focus on leveraging scene geometry.
We show that learned affordance maps can be used to augment traditional approaches for both exploration and navigation, providing significant improvements in performance.
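A small sketch shows how a learned affordance map might augment a classical cost map for planning; the blend, the 0.5 obstacle threshold, and the `lam` weight are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def traversal_cost(occupancy: np.ndarray, affordance: np.ndarray,
                   lam: float = 5.0) -> np.ndarray:
    """Blend geometric occupancy with learned traversability into one cost map.

    occupancy: obstacle probability per cell from classical mapping.
    affordance: learned probability that each cell is safely traversable.
    """
    geometric = np.where(occupancy > 0.5, np.inf, 1.0)  # hard-block likely obstacles
    learned = lam * (1.0 - affordance)                  # penalize low affordance
    return geometric + learned

# Usage: hand the combined cost map to any grid planner (A*, Dijkstra, etc.).
cost = traversal_cost(np.random.rand(64, 64), np.random.rand(64, 64))
```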
arXiv Detail & Related papers (2020-01-08T04:05:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.