Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
- URL: http://arxiv.org/abs/2007.09841v1
- Date: Mon, 20 Jul 2020 02:19:26 GMT
- Title: Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
- Authors: Medhini Narasimhan, Erik Wijmans, Xinlei Chen, Trevor Darrell, Dhruv
Batra, Devi Parikh, Amanpreet Singh
- Abstract summary: We introduce a learning-based approach for room navigation using semantic maps.
We train a model to generate amodal semantic top-down maps indicating beliefs of location, size, and shape of rooms.
Next, we use these maps to predict a point that lies in the target room and train a policy to navigate to the point.
- Score: 143.6144560164782
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a learning-based approach for room navigation using semantic
maps. Our proposed architecture learns to predict top-down belief maps of
regions that lie beyond the agent's field of view while modeling architectural
and stylistic regularities in houses. First, we train a model to generate
amodal semantic top-down maps indicating beliefs of location, size, and shape
of rooms by learning the underlying architectural patterns in houses. Next, we
use these maps to predict a point that lies in the target room and train a
policy to navigate to the point. We empirically demonstrate that by predicting
semantic maps, the model learns common correlations found in houses and
generalizes to novel environments. We also demonstrate that reducing the task
of room navigation to point navigation improves the performance further.
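The abstract describes a two-stage pipeline: predict an amodal top-down room-belief map, pick a point believed to lie in the goal room, and hand that point to a point-navigation policy. The following is a minimal sketch of that idea, not the authors' code; the map size, feature dimension, number of room categories, and all class and function names are illustrative assumptions.

```python
# Hedged sketch of the two-stage idea: (1) predict an amodal top-down
# room-belief map from encoded egocentric observations, (2) pick a target
# point believed to lie in the goal room for a PointNav policy to reach.
import torch
import torch.nn as nn

NUM_ROOMS = 9   # e.g. bedroom, kitchen, bathroom, ... (placeholder)
MAP_SIZE = 64   # side length of the top-down belief map in cells (placeholder)

class AmodalRoomMapPredictor(nn.Module):
    """Predicts per-cell room beliefs, including regions outside the field of view."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, NUM_ROOMS * MAP_SIZE * MAP_SIZE),
        )

    def forward(self, agent_features):
        logits = self.decoder(agent_features)
        logits = logits.view(-1, NUM_ROOMS, MAP_SIZE, MAP_SIZE)
        return logits.softmax(dim=1)   # belief over room labels per map cell

def pick_target_point(belief_map, goal_room_idx):
    """Choose the map cell with the highest belief of belonging to the goal room."""
    goal_belief = belief_map[0, goal_room_idx]        # (MAP_SIZE, MAP_SIZE)
    flat_idx = goal_belief.flatten().argmax()
    return divmod(flat_idx.item(), MAP_SIZE)          # (row, col) in map coordinates

# Usage: the chosen (row, col) would be converted to world coordinates and
# passed to an off-the-shelf PointNav policy as its goal.
predictor = AmodalRoomMapPredictor()
features = torch.randn(1, 256)                        # stand-in for encoded observations
beliefs = predictor(features)
target_cell = pick_target_point(beliefs, goal_room_idx=3)
print("navigate toward map cell:", target_cell)
```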
Related papers
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new scene semantic map representation that is built as the embodied agent interacts with the indoor environment.
We incorporate this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego²-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
- Multi-Object Navigation with dynamically learned neural implicit representations [10.182418917501064]
We propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode.
We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source.
arXiv Detail & Related papers (2022-10-11T04:06:34Z)
- Occupancy Anticipation for Efficient Exploration and Navigation [97.17517060585875]
We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions.
By exploiting context in both the egocentric views and top-down maps, our model successfully anticipates a broader map of the environment (a minimal sketch of this idea appears after this list).
Our approach is the winning entry in the 2020 Habitat PointNav Challenge.
arXiv Detail & Related papers (2020-08-21T03:16:51Z)
- Learning to Move with Affordance Maps [57.198806691838364]
The ability to autonomously explore and navigate a physical space is a fundamental requirement for virtually any mobile autonomous agent.
Traditional SLAM-based approaches for exploration and navigation largely focus on leveraging scene geometry.
We show that learned affordance maps can be used to augment traditional approaches for both exploration and navigation, providing significant improvements in performance.
arXiv Detail & Related papers (2020-01-08T04:05:11Z)
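The occupancy-anticipation entry above describes inferring the occupancy state beyond the visible region from egocentric RGB-D observations. Below is a hedged sketch of that idea as a small encoder-decoder over a partial local map; it is not the authors' implementation, and the channel counts, map size, and names are illustrative assumptions.

```python
# Hedged sketch of occupancy anticipation: take the partial local occupancy
# map projected from RGB-D and predict occupancy for a wider area, including
# cells the agent has not yet observed.
import torch
import torch.nn as nn

class OccupancyAnticipator(nn.Module):
    def __init__(self):
        super().__init__()
        # input: 2 channels (occupied, explored) of the visible local map
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            # output: 2 channels of predicted (occupied, explored) probabilities
            nn.Conv2d(32, 2, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, visible_map):
        # visible_map: (B, 2, H, W); returns an anticipated map of the same
        # size, with probabilities filled in for unseen cells as well.
        return self.net(visible_map)

# Example: anticipate a 64x64 local map from a partially observed one.
model = OccupancyAnticipator()
partial = torch.zeros(1, 2, 64, 64)   # stand-in for a depth-projected local map
anticipated = model(partial)
print(anticipated.shape)              # torch.Size([1, 2, 64, 64])
```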