Sim-to-Real Transfer for Vision-and-Language Navigation
- URL: http://arxiv.org/abs/2011.03807v1
- Date: Sat, 7 Nov 2020 16:49:04 GMT
- Title: Sim-to-Real Transfer for Vision-and-Language Navigation
- Authors: Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi
Parikh, Dhruv Batra, Stefan Lee
- Abstract summary: We study the problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.
Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation.
To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot.
- Score: 70.86250473583354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the challenging problem of releasing a robot in a previously unseen
environment, and having it follow unconstrained natural language navigation
instructions. Recent work on the task of Vision-and-Language Navigation (VLN)
has achieved significant progress in simulation. To assess the implications of
this work for robotics, we transfer a VLN agent trained in simulation to a
physical robot. To bridge the gap between the high-level discrete action space
learned by the VLN agent, and the robot's low-level continuous action space, we
propose a subgoal model to identify nearby waypoints, and use domain
randomization to mitigate visual domain differences. For accurate sim and real
comparisons in parallel environments, we annotate a 325 m² office space with
1.3 km of navigation instructions, and create a digitized replica in simulation.
We find that sim-to-real transfer to an environment not seen in training is
successful if an occupancy map and navigation graph can be collected and
annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more
challenging in the hardest setting with no prior mapping at all (success rate
of 22.5%).
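The domain randomization mentioned in the abstract is a standard technique for mitigating visual sim-to-real gaps: perturbing the appearance of simulated observations during training so the policy becomes invariant to rendering differences. The sketch below is an illustrative photometric randomizer, not the paper's actual implementation; the transform set and parameter ranges are assumptions.

```python
import numpy as np

def domain_randomize(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply simple photometric randomization to a simulated observation.

    `image` is an HxWx3 array with values in [0, 1]. Brightness, contrast,
    and noise ranges here are illustrative choices, not the paper's settings.
    """
    out = image.astype(np.float64)
    brightness = rng.uniform(-0.1, 0.1)   # random additive brightness shift
    contrast = rng.uniform(0.8, 1.2)      # random contrast scale about mid-gray
    out = (out - 0.5) * contrast + 0.5 + brightness
    out = out + rng.normal(0.0, 0.02, size=out.shape)  # per-pixel Gaussian noise
    return np.clip(out, 0.0, 1.0)        # keep values in valid image range
```

In a training loop, each simulated frame would be passed through such a randomizer before being fed to the agent, so that at deployment time real camera images fall within the distribution the policy has already seen.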
Related papers
- Sim-to-real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning [15.792914346054502]
We tackle the challenge of sim-to-real transfer of reinforcement learning (RL) agents for coverage path planning (CPP).
We bridge the sim-to-real gap through a semi-virtual environment with a simulated sensor and obstacles, while including real robot kinematics and real-time aspects.
We find that a high model inference frequency is sufficient for reducing the sim-to-real gap, while fine-tuning degrades performance initially.
arXiv Detail & Related papers (2024-06-07T13:24:19Z)
- NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation [23.72290930234063]
NaVid is a video-based large vision language model (VLM) for vision-and-language navigation.
NaVid achieves state-of-the-art performance in simulation environments and the real world, demonstrating superior cross-dataset and Sim2Real transfer.
arXiv Detail & Related papers (2024-02-24T16:39:16Z)
- Learning to navigate efficiently and precisely in real environments [14.52507964172957]
The embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI2-THOR.
In this work we explore end-to-end training of agents in simulation in settings which minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from a 77% success rate in simulation to 23% in the real world due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving.
This is done by learning to translate randomized simulation images into simulated segmentation and depth maps.
This allows us to train an end-to-end RL policy in simulation, and directly deploy in the real-world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
- Out of the Box: Embodied Navigation in the Real World [45.97756658635314]
We show how to transfer knowledge acquired in simulation into the real world.
We deploy our models on a LoCoBot equipped with a single Intel RealSense camera.
Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world.
arXiv Detail & Related papers (2021-05-12T18:00:14Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
- Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) provides powerful tools for solving complex robotic tasks.
However, policies trained in simulation often do not work directly in the real world, a challenge known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.