Sim-to-Real Transfer for Vision-and-Language Navigation
- URL: http://arxiv.org/abs/2011.03807v1
- Date: Sat, 7 Nov 2020 16:49:04 GMT
- Title: Sim-to-Real Transfer for Vision-and-Language Navigation
- Authors: Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi
Parikh, Dhruv Batra, Stefan Lee
- Abstract summary: We study the problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.
Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation.
To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot.
- Score: 70.86250473583354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the challenging problem of releasing a robot in a previously unseen
environment, and having it follow unconstrained natural language navigation
instructions. Recent work on the task of Vision-and-Language Navigation (VLN)
has achieved significant progress in simulation. To assess the implications of
this work for robotics, we transfer a VLN agent trained in simulation to a
physical robot. To bridge the gap between the high-level discrete action space
learned by the VLN agent, and the robot's low-level continuous action space, we
propose a subgoal model to identify nearby waypoints, and use domain
randomization to mitigate visual domain differences. For accurate sim and real
comparisons in parallel environments, we annotate a 325m2 office space with
1.3km of navigation instructions, and create a digitized replica in simulation.
We find that sim-to-real transfer to an environment not seen in training is
successful if an occupancy map and navigation graph can be collected and
annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more
challenging in the hardest setting with no prior mapping at all (success rate
of 22.5%).
Related papers
- Local Policies Enable Zero-shot Long-horizon Manipulation [80.1161776000682]
We introduce ManipGen, which leverages a new class of policies for sim2real transfer: local policies.
ManipGen outperforms SOTA approaches such as SayCan, OpenVLA, LLMTrajGen and VoxPoser across 50 real-world manipulation tasks by 36%, 76%, 62% and 60% respectively.
arXiv Detail & Related papers (2024-10-29T17:59:55Z) - CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction [19.997935470257794]
We present CANVAS, a framework that combines visual and linguistic instructions for commonsense-aware navigation.
Its success is driven by imitation learning, enabling the robot to learn from human navigation behavior.
Our experiments show that CANVAS outperforms the strong rule-based system ROS NavStack across all environments.
arXiv Detail & Related papers (2024-10-02T06:34:45Z) - Sim-to-Real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning [15.792914346054502]
We tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning ( CPP)
We bridge the sim-to-real gap through a semi-virtual environment, including a real robot and real-time aspects, while utilizing a simulated sensor and obstacles.
We find that a high inference frequency allows first-order Markovian policies to transfer directly from simulation, while higher-order policies can be fine-tuned to further reduce the sim-to-real gap.
arXiv Detail & Related papers (2024-06-07T13:24:19Z) - Learning to navigate efficiently and precisely in real environments [14.52507964172957]
Embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI-Thor.
In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z) - Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from 77% simulation to 23% real-world success rate due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z) - Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving
Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving.
This is done by learning to translate randomized simulation images into simulated segmentation and depth maps.
This allows us to train an end-to-end RL policy in simulation, and directly deploy in the real-world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z) - Out of the Box: Embodied Navigation in the Real World [45.97756658635314]
We show how to transfer knowledge acquired in simulation into the real world.
We deploy our models on a LoCoBot equipped with a single Intel RealSense camera.
Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world.
arXiv Detail & Related papers (2021-05-12T18:00:14Z) - Reactive Long Horizon Task Execution via Visual Skill and Precondition
Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% success rate in the real-world as compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z) - Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial
Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) represents powerful tools to solve complex robotic tasks.
RL does not work directly in the real-world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.