Out of the Box: Embodied Navigation in the Real World
- URL: http://arxiv.org/abs/2105.05873v1
- Date: Wed, 12 May 2021 18:00:14 GMT
- Title: Out of the Box: Embodied Navigation in the Real World
- Authors: Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli,
Lorenzo Baraldi and Rita Cucchiara
- Abstract summary: We show how to transfer knowledge acquired in simulation into the real world.
We deploy our models on a LoCoBot equipped with a single Intel RealSense camera.
Our experiments indicate that it is possible to achieve satisfactory results when deploying the obtained model in the real world.
- Score: 45.97756658635314
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: The research field of Embodied AI has witnessed substantial progress in
visual navigation and exploration thanks to powerful simulation platforms and
the availability of 3D data of indoor and photorealistic environments. These
two factors have opened the doors to a new generation of intelligent agents
capable of achieving nearly perfect PointGoal Navigation. However, such
architectures are commonly trained with millions, if not billions, of frames
and tested in simulation. Alongside great enthusiasm, these results raise a
question: how many researchers will actually benefit from these advances? In
this work, we detail how to transfer the knowledge acquired in simulation into
the real world. To that end, we describe the architectural discrepancies that
damage the Sim2Real adaptation ability of models trained on the Habitat
simulator and propose a novel solution tailored towards the deployment in
real-world scenarios. We then deploy our models on a LoCoBot, a Low-Cost Robot
equipped with a single Intel RealSense camera. Unlike previous work,
our testing scene is unavailable to the agent in simulation. The environment is
also inaccessible to the agent beforehand, so it cannot count on scene-specific
semantic priors. In this way, we reproduce a setting in which a research group
(potentially from other fields) needs to employ the agent's visual navigation
capabilities as-a-Service. Our experiments indicate that it is possible to
achieve satisfactory results when deploying the obtained model in the real world.
Our code and models are available at https://github.com/aimagelab/LoCoNav.
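The deployment scenario described in the abstract (a Habitat-trained PointGoal policy driving a LoCoBot from a single Intel RealSense depth stream) can be pictured as a simple observation-to-action loop. The following is a minimal illustrative sketch, not the authors' LoCoNav code: the policy interface (`policy.pt`, assumed to be a TorchScript module taking a depth map and a goal vector) and the `move_base` base-controller stub are hypothetical placeholders; see the linked repository for the actual implementation.

```python
# Illustrative Sim2Real deployment loop (not the authors' LoCoNav code).
# Assumptions (hypothetical): "policy.pt" is a TorchScript export of a
# Habitat-trained PointGoal agent taking a 256x256 depth map and a 2-D
# (distance, heading) goal vector; move_base() stands in for the LoCoBot
# base controller.
import numpy as np
import cv2
import torch
import pyrealsense2 as rs

FORWARD_STEP_M = 0.25            # forward step (m), as in the standard Habitat PointNav setup
TURN_ANGLE_RAD = np.deg2rad(10)  # turn angle, as in the standard Habitat PointNav setup

def move_base(forward_m, turn_rad):
    """Hypothetical placeholder for the LoCoBot base controller."""
    print(f"base command: forward={forward_m:.2f} m, turn={np.rad2deg(turn_rad):.1f} deg")

# Start the RealSense depth stream (a single camera, as in the paper).
pipeline = rs.pipeline()
cfg = rs.config()
cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
profile = pipeline.start(cfg)
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

policy = torch.jit.load("policy.pt")  # hypothetical exported checkpoint
policy.eval()

goal = np.array([3.0, 0.0], dtype=np.float32)  # (distance m, heading rad) to the PointGoal

for _ in range(500):  # hard step limit
    frames = pipeline.wait_for_frames()
    depth_raw = np.asanyarray(frames.get_depth_frame().get_data())
    depth_m = depth_raw.astype(np.float32) * depth_scale
    # Match a Habitat-style depth sensor: resize, clip to [0.1, 10] m, normalize to [0, 1].
    depth = cv2.resize(depth_m, (256, 256))
    depth = np.clip(depth, 0.1, 10.0) / 10.0
    obs = torch.from_numpy(depth)[None, None]             # (1, 1, 256, 256)
    goal_t = torch.from_numpy(goal)[None]                 # (1, 2)
    with torch.no_grad():
        action = int(policy(obs, goal_t).argmax(dim=-1))  # 0=STOP, 1=FWD, 2=LEFT, 3=RIGHT
    if action == 0:
        break
    elif action == 1:
        move_base(FORWARD_STEP_M, 0.0)
    elif action == 2:
        move_base(0.0, TURN_ANGLE_RAD)
    else:
        move_base(0.0, -TURN_ANGLE_RAD)
    # A real system would also update `goal` from odometry after each step.

pipeline.stop()
```

In a real deployment the goal vector would be updated from the robot's odometry after every step; the repository linked above contains the authors' complete pipeline.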
Related papers
- EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment [38.14321677323052]
Embodied artificial intelligence emphasizes the role of an agent's body in generating human-like behaviors.
In this paper, we construct a benchmark platform for embodied intelligence evaluation in real-world city environments.
arXiv Detail & Related papers (2024-10-12T17:49:26Z)
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- Self-Supervised Object Goal Navigation with In-Situ Finetuning [110.6053241629366]
This work presents an agent that builds self-supervised models of the world via exploration.
We identify a strong source of self-supervision that can train all components of an ObjectNav agent.
We show that our agent can perform competitively in the real world and simulation.
arXiv Detail & Related papers (2022-12-09T03:41:40Z)
- Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving.
This is done by learning to translate randomized simulation images into simulated segmentation and depth maps.
This allows us to train an end-to-end RL policy in simulation, and directly deploy in the real-world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z)
- NavDreams: Towards Camera-Only RL Navigation Among Humans [35.57943738219839]
We investigate whether the world model concept, which has shown results for modeling and learning policies in Atari games, can also be applied to the camera-based navigation problem.
We create simulated environments where a robot must navigate past static and moving humans without colliding in order to reach its goal.
We find that state-of-the-art methods are able to solve the navigation problem and can generate dream-like predictions of future image sequences.
arXiv Detail & Related papers (2022-03-23T09:46:44Z)
- Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation [13.207579081178716]
In recent learning-based navigation approaches, the scene understanding and navigation abilities of the agent are achieved simultaneously.
Unfortunately, even though simulators are an efficient tool for training navigation policies, the resulting models often fail when transferred to the real world.
One possible solution is to provide the navigation model with mid-level visual representations containing important domain-invariant properties of the scene.
arXiv Detail & Related papers (2022-02-02T15:00:44Z)
- Sim-to-Real Transfer for Vision-and-Language Navigation [70.86250473583354]
We study the problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.
Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation.
To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot.
arXiv Detail & Related papers (2020-11-07T16:49:04Z)
- Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans.
Our approach is enabled by our novel data-generation tool, HumANav.
We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.