Navigating to Objects in the Real World
- URL: http://arxiv.org/abs/2212.00922v1
- Date: Fri, 2 Dec 2022 01:10:47 GMT
- Title: Navigating to Objects in the Real World
- Authors: Theophile Gervet, Soumith Chintala, Dhruv Batra, Jitendra Malik,
Devendra Singh Chaplot
- Abstract summary: We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from 77% simulation to 23% real-world success rate due to a large image domain gap between simulation and reality.
- Score: 76.1517654037993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic navigation is necessary to deploy mobile robots in uncontrolled
environments like our homes, schools, and hospitals. Many learning-based
approaches have been proposed in response to the lack of semantic understanding
of the classical pipeline for spatial navigation, which builds a geometric map
using depth sensors and plans to reach point goals. Broadly, end-to-end
learning approaches reactively map sensor inputs to actions with deep neural
networks, while modular learning approaches enrich the classical pipeline with
learning-based semantic sensing and exploration. But learned visual navigation
policies have predominantly been evaluated in simulation. How well do different
classes of methods work on a robot? We present a large-scale empirical study of
semantic visual navigation methods comparing representative methods from
classical, modular, and end-to-end learning approaches across six homes with no
prior experience, maps, or instrumentation. We find that modular learning works
well in the real world, attaining a 90% success rate. In contrast, end-to-end
learning does not, dropping from 77% simulation to 23% real-world success rate
due to a large image domain gap between simulation and reality. For
practitioners, we show that modular learning is a reliable approach to navigate
to objects: modularity and abstraction in policy design enable Sim-to-Real
transfer. For researchers, we identify two key issues that prevent today's
simulators from being reliable evaluation benchmarks - (A) a large Sim-to-Real
gap in images and (B) a disconnect between simulation and real-world error
modes - and propose concrete steps forward.
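To make the contrast between the two learned approaches concrete, here is a minimal sketch assuming hypothetical class and method names (this is not the authors' code): a modular agent composes a learned semantic mapper and exploration policy with a classical point-goal planner, while an end-to-end agent maps raw observations directly to actions with a single network.
```python
# Minimal sketch of the two learned approaches compared in the paper.
# All class and method names here are hypothetical placeholders, not the
# authors' implementation.
from dataclasses import dataclass
import numpy as np

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

@dataclass
class Observation:
    rgb: np.ndarray    # HxWx3 camera image
    depth: np.ndarray  # HxW depth image
    pose: np.ndarray   # (x, y, heading) odometry estimate

class ModularAgent:
    """Learned semantic sensing + exploration on top of a classical planner."""

    def __init__(self, goal_category: str):
        self.goal_category = goal_category
        self.semantic_map = np.zeros((480, 480), dtype=np.int32)  # top-down grid

    def act(self, obs: Observation) -> str:
        # 1) Learned semantic mapping: project detected objects into the map.
        self._update_semantic_map(obs)
        # 2) If the goal category is already in the map, head for it;
        #    otherwise a learned exploration policy picks where to look next.
        goal_xy = self._locate_goal() or self._explore_goal(obs)
        # 3) Classical point-goal planning on the map yields the next action.
        return self._plan_step(obs.pose, goal_xy)

    # Placeholder stubs: a real system would hold a segmentation model,
    # a frontier/exploration policy, and a geometric planner (e.g. A*).
    def _update_semantic_map(self, obs): ...
    def _locate_goal(self): return None
    def _explore_goal(self, obs): return (240, 240)
    def _plan_step(self, pose, goal_xy): return "move_forward"

class EndToEndAgent:
    """Reactive policy: raw pixels in, discrete action out."""

    def act(self, obs: Observation) -> str:
        # A deep network trained in simulation maps (rgb, depth, goal) to an
        # action; the random logits below merely stand in for a forward pass.
        logits = np.random.randn(len(ACTIONS))
        return ACTIONS[int(np.argmax(logits))]

# Usage with dummy inputs:
obs = Observation(rgb=np.zeros((480, 640, 3)), depth=np.ones((480, 640)),
                  pose=np.zeros(3))
print(ModularAgent("chair").act(obs), EndToEndAgent().act(obs))
```
The relevance to Sim-to-Real transfer is that in the modular design only the semantic sensing component consumes raw images; planning operates on an abstract top-down map that looks similar in simulation and in reality, which is why the image domain gap hurts it far less than it hurts the end-to-end policy.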
Related papers
- Multi-Object Navigation in real environments using hybrid policies [18.52681391843433]
We introduce a hybrid navigation method, which decomposes the problem into two different skills.
We show the advantages of this approach compared to end-to-end methods both in simulation and in a real environment.
arXiv Detail & Related papers (2024-01-24T20:41:25Z) - A Study on Learning Social Robot Navigation with Multimodal Perception [6.052803245103173]
We present a study on learning social robot navigation with multimodal perception using a large-scale real-world dataset.
We compare unimodal and multimodal learning approaches against a set of classical navigation approaches in different social scenarios.
The results show that multimodal learning has a clear advantage over unimodal learning in both dataset and human studies.
arXiv Detail & Related papers (2023-09-22T01:47:47Z) - Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps (see the sketch after this list).
Ego$^2$-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z) - Practical Imitation Learning in the Real World via Task Consistency Loss [18.827979446629296]
This paper introduces a self-supervised loss that encourages sim and real alignment both at the feature and action-prediction levels.
We achieve 80% success across ten seen and unseen scenes using only 16.2 hours of teleoperated demonstrations in sim and real.
arXiv Detail & Related papers (2022-02-03T21:43:06Z) - Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z) - ViNG: Learning Open-World Navigation with Visual Goals [82.84193221280216]
We propose a learning-based navigation system for reaching visually indicated goals.
We show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning.
We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
arXiv Detail & Related papers (2020-12-17T18:22:32Z) - Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans.
Our approach is enabled by our novel data-generation tool, HumANav.
We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z) - Learning to Move with Affordance Maps [57.198806691838364]
The ability to autonomously explore and navigate a physical space is a fundamental requirement for virtually any mobile autonomous agent.
Traditional SLAM-based approaches for exploration and navigation largely focus on leveraging scene geometry.
We show that learned affordance maps can be used to augment traditional approaches for both exploration and navigation, providing significant improvements in performance.
arXiv Detail & Related papers (2020-01-08T04:05:11Z)
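As a note on the Ego$^2$-Map entry above: "contrasting the agent's egocentric views and semantic maps" refers to contrastive representation learning. The sketch below shows a generic InfoNCE-style formulation under assumed encoder outputs; it illustrates the general technique, not that paper's implementation.
```python
# Minimal contrastive-alignment sketch (illustrative only): pull each
# egocentric-view embedding toward the embedding of its matching semantic-map
# crop and push it away from non-matching crops in the batch (InfoNCE).
# The embeddings below are random stand-ins for hypothetical encoders.
import torch
import torch.nn.functional as F

def info_nce(view_emb: torch.Tensor, map_emb: torch.Tensor, tau: float = 0.07):
    """view_emb, map_emb: (B, D) embeddings; row i of each forms a positive pair."""
    v = F.normalize(view_emb, dim=-1)
    m = F.normalize(map_emb, dim=-1)
    logits = v @ m.t() / tau                        # (B, B) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)
    return F.cross_entropy(logits, targets)         # diagonal entries are positives

# Usage with random stand-in embeddings:
view_emb = torch.randn(32, 128)  # e.g. output of an egocentric-view encoder
map_emb = torch.randn(32, 128)   # e.g. output of a semantic-map encoder
loss = info_nce(view_emb, map_emb)
```
Trained this way, the view encoder absorbs map-level information such as objects, structure, and layout, which is the intuition behind supervising navigation representations with semantic maps.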
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.