Deep Learning for Embodied Vision Navigation: A Survey
- URL: http://arxiv.org/abs/2108.04097v4
- Date: Mon, 11 Oct 2021 08:48:18 GMT
- Title: Deep Learning for Embodied Vision Navigation: A Survey
- Authors: Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang and Xiaojun Chang
- Abstract summary: The "embodied visual navigation" problem requires an agent to navigate in a 3D environment relying mainly on its first-person observations.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
- Score: 108.13766213265069
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The "embodied visual navigation" problem requires an agent to navigate in a 3D
environment relying mainly on its first-person observations. This problem has
attracted rising attention in recent years due to its wide application in
autonomous driving, robot vacuum cleaners, and rescue robots. A navigation agent is
expected to possess various intelligent skills, such as visual perception,
mapping, planning, exploration, and reasoning. Building such an agent that
observes, thinks, and acts is key to real intelligence. The remarkable
learning ability of deep learning methods has empowered agents to accomplish
embodied visual navigation tasks. Despite this, embodied visual navigation is
still in its infancy, since many advanced skills are required, including
perceiving partially observed visual input, exploring unseen areas, memorizing
and modeling seen scenarios, understanding cross-modal instructions, and
adapting to new environments. Recently, embodied visual navigation has
attracted increasing attention from the community, and numerous works have been
proposed to learn these skills. This paper attempts to establish an outline of
the current works in the field of embodied visual navigation by providing a
comprehensive literature survey. We summarize the benchmarks and metrics,
review different methods, analyze the challenges, and highlight the
state-of-the-art methods. Finally, we discuss unresolved challenges in the
field of embodied visual navigation and give promising directions in pursuing
future research.
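Among the metrics such surveys commonly summarize, Success weighted by Path Length (SPL) is a de-facto standard for goal-directed navigation: it averages, over episodes, the success indicator weighted by the ratio of shortest-path length to the path actually taken. A minimal sketch (the `episodes` tuple format here is illustrative, not from the paper):

```python
def spl(episodes):
    """Success weighted by Path Length (SPL).

    episodes: iterable of (success, shortest_path_len, taken_path_len)
    Each successful episode contributes shortest/max(taken, shortest);
    failed episodes contribute 0. The result is the mean over episodes.
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            total += shortest / max(taken, shortest)
    return total / len(episodes)
```

For example, an agent that succeeds via the optimal path scores 1.0 for that episode, one that succeeds via a path twice as long scores 0.5, and a failure scores 0 regardless of path length.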
Related papers
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego$2$-Map learning transfers the compact and rich information from a map, such as objects, structure and transition, to the agent's egocentric representations for navigation.
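The contrastive objective described above can be illustrated with a generic InfoNCE-style loss over matched (egocentric view, semantic map) embedding pairs, where matched pairs are positives and all other pairings in the batch are negatives. This is a sketch of the general technique under that assumption, not Ego$2$-Map's exact formulation, and the function and parameter names are hypothetical:

```python
import numpy as np

def info_nce_loss(view_emb, map_emb, temperature=0.1):
    """Generic InfoNCE contrastive loss (hypothetical sketch).

    view_emb, map_emb: (N, D) arrays where row i of each is a matched
    (egocentric view, semantic map) pair. Matched pairs sit on the
    diagonal of the similarity matrix and act as positives; all other
    entries act as negatives.
    """
    # L2-normalize so the dot product is cosine similarity
    v = view_emb / np.linalg.norm(view_emb, axis=1, keepdims=True)
    m = map_emb / np.linalg.norm(map_emb, axis=1, keepdims=True)
    logits = v @ m.T / temperature               # (N, N) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # cross-entropy on diagonal
```

With perfectly aligned embeddings the diagonal dominates and the loss approaches zero; misaligned pairs drive it up, which is what pushes the view encoder toward the map's information.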
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- Towards self-attention based visual navigation in the real world [0.0]
Vision guided navigation requires processing complex visual information to inform task-orientated decisions.
Deep Reinforcement Learning agents trained in simulation often exhibit unsatisfactory results when deployed in the real-world.
This is the first demonstration of a self-attention based agent successfully trained to navigate a 3D action space using fewer than 4,000 parameters.
arXiv Detail & Related papers (2022-09-15T04:51:42Z)
- Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation [117.26891277593205]
We focus on navigation and address the problem that existing navigation algorithms lack experience and common sense.
Inspired by the human ability to think twice before moving and conceive several feasible paths to seek a goal in unfamiliar scenes, we present a route planning method named Path Estimation and Memory Recalling framework.
We show strong experimental results of PEMR on the EmbodiedQA navigation task.
arXiv Detail & Related papers (2021-10-16T13:30:55Z)
- Augmented reality navigation system for visual prosthesis [67.09251544230744]
We propose an augmented reality navigation system for visual prosthesis that incorporates software for reactive navigation and path planning.
It consists of four steps: locating the subject on a map, planning the subject's trajectory, showing it to the subject, and re-planning a path free of obstacles.
Results show how our augmented navigation system improves navigation performance by reducing the time and distance needed to reach goals, and even significantly reduces the number of obstacle collisions.
arXiv Detail & Related papers (2021-09-30T09:41:40Z)
- Building Intelligent Autonomous Navigation Agents [18.310643564200525]
The goal of this thesis is to make progress towards designing algorithms capable of 'physical intelligence'.
In the first part of the thesis, we discuss our work on short-term navigation using end-to-end reinforcement learning.
In the second part, we present a new class of navigation methods based on modular learning and structured explicit map representations.
arXiv Detail & Related papers (2021-06-25T04:10:58Z)
- Diagnosing Vision-and-Language Navigation: What Really Matters [61.72935815656582]
Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural language instructions and navigates in visual environments.
Recent studies have observed a slowdown in performance improvements on both indoor and outdoor VLN tasks.
In this work, we conduct a series of diagnostic experiments to unveil agents' focus during navigation.
arXiv Detail & Related papers (2021-03-30T17:59:07Z)
- Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-language navigation (VLN) is the task in which an agent carries out navigational instructions inside photo-realistic environments.
One of the key challenges in VLN is how to conduct a robust navigation by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment.
This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.