Sim-to-Real Transfer for Vision-and-Language Navigation
- URL: http://arxiv.org/abs/2011.03807v1
- Date: Sat, 7 Nov 2020 16:49:04 GMT
- Title: Sim-to-Real Transfer for Vision-and-Language Navigation
- Authors: Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi
Parikh, Dhruv Batra, Stefan Lee
- Abstract summary: We study the problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.
Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation.
To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot.
- Score: 70.86250473583354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the challenging problem of releasing a robot in a previously unseen
environment, and having it follow unconstrained natural language navigation
instructions. Recent work on the task of Vision-and-Language Navigation (VLN)
has achieved significant progress in simulation. To assess the implications of
this work for robotics, we transfer a VLN agent trained in simulation to a
physical robot. To bridge the gap between the high-level discrete action space
learned by the VLN agent, and the robot's low-level continuous action space, we
propose a subgoal model to identify nearby waypoints, and use domain
randomization to mitigate visual domain differences. For accurate sim and real
comparisons in parallel environments, we annotate a 325 m² office space with
1.3 km of navigation instructions, and create a digitized replica in simulation.
We find that sim-to-real transfer to an environment not seen in training is
successful if an occupancy map and navigation graph can be collected and
annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more
challenging in the hardest setting with no prior mapping at all (success rate
of 22.5%).
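The domain randomization mentioned in the abstract is a standard technique for mitigating visual sim-to-real gaps: perturbing the appearance of simulated observations during training so the policy becomes invariant to rendering differences. The sketch below is an illustrative photometric randomizer, not the paper's actual implementation; the transform set and parameter ranges are assumptions.

```python
import numpy as np

def domain_randomize(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply simple photometric randomization to a simulated observation.

    `image` is an HxWx3 array with values in [0, 1]. Brightness, contrast,
    and noise ranges here are illustrative choices, not the paper's settings.
    """
    out = image.astype(np.float64)
    brightness = rng.uniform(-0.1, 0.1)   # random additive brightness shift
    contrast = rng.uniform(0.8, 1.2)      # random contrast scale about mid-gray
    out = (out - 0.5) * contrast + 0.5 + brightness
    out = out + rng.normal(0.0, 0.02, size=out.shape)  # per-pixel Gaussian noise
    return np.clip(out, 0.0, 1.0)        # keep values in valid image range
```

In a training loop, each simulated frame would be passed through such a randomizer before being fed to the agent, so that at deployment time real camera images fall within the distribution the policy has already seen.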
Related papers
- Sim-to-real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning [15.792914346054502]
We tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP).
We bridge the sim-to-real gap through a semi-virtual environment with a simulated sensor and obstacles, while including real robot kinematics and real-time aspects.
We find that a high model inference frequency is sufficient for reducing the sim-to-real gap, while fine-tuning degrades performance initially.
arXiv Detail & Related papers (2024-06-07T13:24:19Z)
- NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation [23.72290930234063]
NaVid is a video-based large vision language model (VLM) for vision-and-language navigation.
NaVid achieves state-of-the-art performance in simulation environments and the real world, demonstrating superior cross-dataset and Sim2Real transfer.
arXiv Detail & Related papers (2024-02-24T16:39:16Z)
- Learning to navigate efficiently and precisely in real environments [14.52507964172957]
The embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI2-THOR.
In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from a 77% success rate in simulation to 23% in the real world due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving.
This is done by learning to translate randomized simulation images into simulated segmentation and depth maps.
This allows us to train an end-to-end RL policy in simulation, and directly deploy in the real-world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
- Out of the Box: Embodied Navigation in the Real World [45.97756658635314]
We show how to transfer knowledge acquired in simulation into the real world.
We deploy our models on a LoCoBot equipped with a single Intel RealSense camera.
Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world.
arXiv Detail & Related papers (2021-05-12T18:00:14Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
- Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) provides powerful tools for solving complex robotic tasks.
However, policies trained in simulation often do not work directly in the real world, a challenge known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.