Unsupervised Domain Adaptation for Visual Navigation
- URL: http://arxiv.org/abs/2010.14543v2
- Date: Thu, 12 Nov 2020 17:41:51 GMT
- Title: Unsupervised Domain Adaptation for Visual Navigation
- Authors: Shangda Li, Devendra Singh Chaplot, Yao-Hung Hubert Tsai, Yue Wu,
Louis-Philippe Morency, Ruslan Salakhutdinov
- Abstract summary: We propose an unsupervised domain adaptation method for visual navigation.
Our method translates the images in the target domain to the source domain such that the translation is consistent with the representations learned by the navigation policy.
- Score: 115.85181329193092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advances in visual navigation methods have led to intelligent embodied
navigation agents capable of learning meaningful representations from raw RGB
images and performing a wide variety of tasks involving structural and semantic
reasoning. However, most learning-based navigation policies are trained and
tested in simulation environments. In order for these policies to be
practically useful, they need to be transferred to the real world. In this
paper, we propose an unsupervised domain adaptation method for visual
navigation. Our method translates the images in the target domain to the source
domain such that the translation is consistent with the representations learned
by the navigation policy. The proposed method outperforms several baselines
across two different navigation tasks in simulation. We further show that our
method can be used to transfer the navigation policies learned in simulation to
the real world.
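As a rough illustration of the core idea, here is a minimal PyTorch sketch of a representation-consistency term: an image translator is trained so that a frozen policy encoder produces matching features for a target-domain frame and its translation. The network shapes are toy stand-ins, not the authors' architectures, and the paper's actual translation objective (which maps target images into the source domain) is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Translator(nn.Module):
    """Toy stand-in for the target-to-source image translator."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

    def forward(self, x):
        return self.net(x)

class PolicyEncoder(nn.Module):
    """Toy stand-in for the visual encoder of the trained navigation policy."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, x):
        return self.net(x)

translator = Translator()
encoder = PolicyEncoder().eval()
for p in encoder.parameters():           # the policy representation stays fixed
    p.requires_grad_(False)

def representation_consistency_loss(target_imgs):
    # The frozen encoder should "see" the translated frame the same way
    # it sees the original target-domain frame.
    return F.mse_loss(encoder(translator(target_imgs)),
                      encoder(target_imgs))

frames = torch.rand(4, 3, 64, 64) * 2 - 1  # fake batch of target-domain frames
loss = representation_consistency_loss(frames)
loss.backward()                            # gradients flow only into the translator
```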
Related papers
- MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains [4.941781282578696]
In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction.
While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability.
Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization capabilities.
arXiv Detail & Related papers (2024-05-17T08:33:27Z)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
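Judging by the title, the unification rests on masking the goal signal during training so one policy covers both modes. The sketch below shows that general mechanism only; the diffusion policy itself is omitted, and values such as mask_prob and the embedding size are placeholders, not the paper's.

```python
import torch
import torch.nn as nn

class GoalMaskedConditioning(nn.Module):
    """During training, randomly replace the goal embedding with a learned
    'no goal' token, so the same policy learns goal-directed behavior
    (goal visible) and exploration (goal masked)."""
    def __init__(self, dim, mask_prob=0.5):
        super().__init__()
        self.mask_prob = mask_prob                      # placeholder value
        self.null_goal = nn.Parameter(torch.zeros(dim))

    def forward(self, goal_emb):
        if not self.training:
            return goal_emb                             # keep the goal at test time
        keep = (torch.rand(goal_emb.size(0), 1, device=goal_emb.device)
                > self.mask_prob).float()
        return keep * goal_emb + (1 - keep) * self.null_goal

cond = GoalMaskedConditioning(dim=256).train()
masked = cond(torch.randn(8, 256))  # some rows are now the null-goal token
```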
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
- LangNav: Language as a Perceptual Representation for Navigation [63.90602960822604]
We explore the use of language as a perceptual representation for vision-and-language navigation (VLN).
Our approach uses off-the-shelf vision systems for image captioning and object detection to convert an agent's egocentric panoramic view at each time step into natural language descriptions.
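A minimal sketch of that conversion step is below; `caption` and `detect` are hypothetical callables standing in for whatever off-the-shelf captioner and object detector are plugged in, and the heading layout and phrasing are illustrative.

```python
from typing import Callable, List

def describe_panorama(views: list,
                      caption: Callable[[object], str],
                      detect: Callable[[object], List[str]]) -> str:
    """Convert an egocentric panorama (here: four views) into a single
    natural-language observation for a language-based navigation agent."""
    headings = ["ahead", "to the right", "behind", "to the left"]
    parts = []
    for heading, view in zip(headings, views):
        objects = ", ".join(detect(view)) or "nothing notable"
        parts.append(f"Looking {heading}: {caption(view)} (objects: {objects}).")
    return " ".join(parts)

# Toy stand-ins just to show the call shape:
text_obs = describe_panorama(
    views=[None] * 4,
    caption=lambda v: "a hallway with an open door",
    detect=lambda v: ["door", "potted plant"])
```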
arXiv Detail & Related papers (2023-10-11T20:52:30Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego²-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
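A standard way to implement such view-map contrasting is a symmetric InfoNCE objective over paired (view, map) embeddings; the sketch below shows that generic form, not Ego²-Map's exact loss or encoders.

```python
import torch
import torch.nn.functional as F

def view_map_infonce(view_emb, map_emb, temperature=0.07):
    """Pull each egocentric-view embedding toward the embedding of its own
    semantic map and push it away from the other maps in the batch
    (and symmetrically for maps). Row i of each tensor is a matched pair."""
    v = F.normalize(view_emb, dim=-1)
    m = F.normalize(map_emb, dim=-1)
    logits = v @ m.t() / temperature      # (B, B) cosine similarities
    labels = torch.arange(v.size(0))      # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

loss = view_map_infonce(torch.randn(16, 128), torch.randn(16, 128))
```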
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- Virtual Guidance as a Mid-level Representation for Navigation [8.712750753534532]
"Virtual Guidance" is designed to visually represent non-visual instructional signals.
We evaluate our proposed method through experiments in both simulated and real-world settings.
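One plausible reading is that instructional signals (say, planned waypoints) get rendered directly into the agent's observation; the sketch below draws simple waypoint markers into an RGB frame. The function, marker style, and normalized-coordinate convention are illustrative assumptions, not the paper's rendering scheme.

```python
import numpy as np

def overlay_guidance(frame: np.ndarray, waypoints, color=(0, 255, 0)):
    """Rasterize waypoints into an RGB frame so a purely visual policy can
    follow them. Waypoints are hypothetical normalized (x, y) pairs in [0, 1]."""
    out = frame.copy()
    h, w, _ = out.shape
    for u, v in waypoints:
        x, y = int(u * (w - 1)), int(v * (h - 1))
        out[max(0, y - 2):y + 3, max(0, x - 2):x + 3] = color  # small square marker
    return out

frame = np.zeros((120, 160, 3), dtype=np.uint8)          # dummy observation
guided = overlay_guidance(frame, [(0.5, 0.9), (0.55, 0.6), (0.6, 0.3)])
```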
arXiv Detail & Related papers (2023-03-05T17:55:15Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation, comparing classical, modular, and end-to-end learning methods.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from a 77% success rate in simulation to 23% in the real world due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- UAS Navigation in the Real World Using Visual Observation [0.4297070083645048]
This paper presents a novel end-to-end Unmanned Aerial System (UAS) navigation approach for long-range visual navigation in the real world.
Our system combines reinforcement learning (RL) with image matching.
We demonstrate that the UAS can learn to navigate to a destination hundreds of meters away from the starting point via the shortest path in a real-world scenario.
arXiv Detail & Related papers (2022-08-25T14:40:53Z)
- ViNG: Learning Open-World Navigation with Visual Goals [82.84193221280216]
We propose a learning-based navigation system for reaching visually indicated goals.
We show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning.
We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
arXiv Detail & Related papers (2020-12-17T18:22:32Z)
- On Embodied Visual Navigation in Real Environments Through Habitat [20.630139085937586]
Visual navigation models based on deep learning can learn effective policies when trained on large amounts of visual observations, which are expensive to acquire in the real world. To deal with this limitation, several simulation platforms have been proposed in order to train visual navigation policies on virtual environments efficiently.
We show that our tool can effectively help to train and evaluate navigation policies on real-world observations without running navigation episodes in the real world.
arXiv Detail & Related papers (2020-10-26T09:19:07Z)
- Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments [20.017277077448924]
NavACL is a method of automatic curriculum learning tailored to the navigation task.
Deep reinforcement learning agents trained using NavACL significantly outperform state-of-the-art agents trained with uniform sampling.
Our agents can navigate through unknown cluttered indoor environments to semantically-specified targets using only RGB images.
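For flavor, here is a generic automatic-curriculum loop rather than NavACL's exact criterion: sample candidate goals and keep one whose estimated success probability for the current agent falls in an intermediate band. Both `estimate_success` and the band values are hypothetical.

```python
import random

def sample_curriculum_goal(candidate_goals, estimate_success,
                           low=0.3, high=0.7, max_tries=50):
    """Pick a goal that is neither near-certain success nor near-certain
    failure for the current agent; fall back to uniform sampling if none
    is found. `estimate_success` is any model of success probability."""
    for _ in range(max_tries):
        goal = random.choice(candidate_goals)
        if low <= estimate_success(goal) <= high:
            return goal
    return random.choice(candidate_goals)

# Toy usage: farther goals are assumed harder.
goal = sample_curriculum_goal(
    candidate_goals=[(0.5, 0.5), (5.0, 5.0), (9.0, 1.0)],
    estimate_success=lambda g: 1.0 / (1.0 + g[0] + g[1]))
```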
arXiv Detail & Related papers (2020-09-11T13:28:26Z)