Building Intelligent Autonomous Navigation Agents
- URL: http://arxiv.org/abs/2106.13415v1
- Date: Fri, 25 Jun 2021 04:10:58 GMT
- Title: Building Intelligent Autonomous Navigation Agents
- Authors: Devendra Singh Chaplot
- Abstract summary: The goal of this thesis is to make progress towards designing algorithms capable of `physical intelligence'.
In the first part of the thesis, we discuss our work on short-term navigation using end-to-end reinforcement learning.
In the second part, we present a new class of navigation methods based on modular learning and structured explicit map representations.
- Score: 18.310643564200525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Breakthroughs in machine learning in the last decade have led to `digital
intelligence', i.e. machine learning models capable of learning from vast
amounts of labeled data to perform several digital tasks such as speech
recognition, face recognition, machine translation and so on. The goal of this
thesis is to make progress towards designing algorithms capable of `physical
intelligence', i.e. building intelligent autonomous navigation agents capable
of learning to perform complex navigation tasks in the physical world involving
visual perception, natural language understanding, reasoning, planning, and
sequential decision making. Despite several advances in classical navigation
methods in the last few decades, current navigation agents struggle at
long-term semantic navigation tasks. In the first part of the thesis, we
discuss our work on short-term navigation using end-to-end reinforcement
learning to tackle challenges such as obstacle avoidance, semantic perception,
language grounding, and reasoning. In the second part, we present a new class
of navigation methods based on modular learning and structured explicit map
representations, which leverage the strengths of both classical and end-to-end
learning methods, to tackle long-term navigation tasks. We show that these
methods are able to effectively tackle challenges such as localization,
mapping, long-term planning, exploration and learning semantic priors. These
modular learning methods are capable of long-term spatial and semantic
understanding and achieve state-of-the-art results on various navigation tasks.
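As a rough illustration of the modular decomposition described above, the sketch below wires a learned mapper, a global policy over an explicit map, and a local policy into a single navigation loop. Every interface shown (the Mapper class, the env object, the policy functions) is a hypothetical placeholder, not the thesis's actual code.

```python
# Minimal sketch of a modular navigation pipeline: a mapper builds an
# explicit map, a global policy picks a long-term goal on that map, and
# a local policy handles short-horizon control. All names and interfaces
# are hypothetical placeholders.
import numpy as np

class Mapper:
    """Builds a top-down occupancy/semantic map from observations (stub)."""
    def __init__(self, map_size=480, channels=2):
        self.map = np.zeros((channels, map_size, map_size), dtype=np.float32)

    def update(self, rgb, depth, pose):
        # A real mapper would project depth into an egocentric grid and
        # register it into the global map at the estimated pose.
        return self.map

def global_policy(world_map, goal_category):
    # Choose a long-term goal cell on the map; a learned policy would
    # pick the most promising region for goal_category (random stub here).
    free = np.argwhere(world_map[0] == 0)
    return tuple(free[np.random.randint(len(free))])

def local_policy(pose, long_term_goal):
    # A real local policy plans a collision-free path to the goal
    # (e.g. with fast marching) and returns a discrete action.
    return "move_forward"

def navigate(env, goal_category, max_steps=500):
    # env is a hypothetical simulator handle with reset()/step() methods.
    obs = env.reset()
    mapper = Mapper()
    for _ in range(max_steps):
        world_map = mapper.update(obs["rgb"], obs["depth"], obs["pose"])
        goal = global_policy(world_map, goal_category)
        action = local_policy(obs["pose"], goal)
        obs, done = env.step(action)
        if done:
            break
```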
Related papers
- MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains [4.941781282578696]
In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction.
While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability.
Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization capabilities.
arXiv Detail & Related papers (2024-05-17T08:33:27Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method that contrasts the agent's egocentric views with semantic maps.
Ego²-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
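The view-map contrast at the heart of Ego²-Map can be pictured as a standard symmetric InfoNCE objective between egocentric-view embeddings and semantic-map embeddings. The sketch below is an illustration under that assumption; the encoder outputs (view_emb, map_emb) and the temperature are placeholders, not the paper's exact formulation.

```python
# Hedged sketch of view-map contrastive learning: pull each egocentric
# view embedding toward the embedding of its own semantic map and away
# from the other maps in the batch (symmetric InfoNCE). This illustrates
# the general technique, not Ego^2-Map's exact loss.
import torch
import torch.nn.functional as F

def view_map_contrastive_loss(view_emb, map_emb, temperature=0.07):
    # view_emb, map_emb: (batch, dim) outputs of two separate encoders.
    view_emb = F.normalize(view_emb, dim=-1)
    map_emb = F.normalize(map_emb, dim=-1)
    logits = view_emb @ map_emb.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(view_emb.size(0))       # positives on the diagonal
    # Symmetric loss: match views to maps and maps to views.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```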
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
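One way to picture the semantic-frontier idea is to score geometric frontiers by how strongly the objects observed near them co-occur with the goal category under a language prior. In the sketch below, the cooccurrence table and both helper functions are illustrative assumptions, not the paper's actual procedure.

```python
# Hedged sketch of training-free frontier selection: each geometric
# frontier is scored by a semantic prior linking nearby observed objects
# to the goal category, and the best-scoring frontier becomes the goal.
def score_frontier(nearby_labels, goal, cooccurrence):
    # cooccurrence[(a, b)]: prior that object a is found near object b,
    # e.g. distilled from language priors or scene statistics.
    if not nearby_labels:
        return 0.0
    return max(cooccurrence.get((label, goal), 0.0) for label in nearby_labels)

def select_frontier(frontiers, goal, cooccurrence):
    # frontiers: list of (position, nearby_object_labels) pairs.
    return max(frontiers,
               key=lambda f: score_frontier(f[1], goal, cooccurrence))[0]

# Example: when looking for a "toilet", prefer the frontier near a "sink".
prior = {("sink", "toilet"): 0.8, ("sofa", "toilet"): 0.1}
frontiers = [((3, 4), ["sofa"]), ((10, 2), ["sink"])]
print(select_frontier(frontiers, "toilet", prior))  # -> (10, 2)
```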
- Recent Advancements in Deep Learning Applications and Methods for Autonomous Navigation: A Comprehensive Review [0.0]
This review article surveys recent AI-based techniques that address the major functions of autonomous navigation.
The paper aims to bridge the gap between autonomous navigation and deep learning.
arXiv Detail & Related papers (2023-02-22T01:42:49Z)
- Learning Robotic Navigation from Experience: Principles, Methods, and Recent Results [94.60414567852536]
Real-world navigation presents a complex set of physical challenges that defies simple geometric abstractions.
Machine learning offers a promising way to go beyond geometry and conventional planning.
We present a toolkit for experiential learning of robotic navigation skills that unifies several recent approaches.
arXiv Detail & Related papers (2022-12-13T17:41:58Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from a 77% success rate in simulation to 23% in the real world due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning [125.22462763376993]
We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI).
PONI disentangles the skills of `where to look?' for an object and `how to navigate to (x, y)?'
arXiv Detail & Related papers (2022-01-25T01:07:32Z)
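PONI's disentanglement can be illustrated with a small goal-selection routine: potential functions answer `where to look?', and a separate local planner then handles `how to navigate to (x, y)?'. In this sketch the potential dictionaries and the alpha weighting are assumptions; PONI itself predicts such potentials with a network trained without any environment interaction.

```python
# Hedged sketch of potential-based goal selection: each frontier point
# combines an "area" potential (unexplored space beyond it) with an
# "object" potential (likelihood the goal object is nearby), and the
# highest-potential frontier becomes the long-term navigation goal.
def select_goal(frontiers, area_potential, object_potential, alpha=0.5):
    # frontiers: iterable of (x, y); potentials: dicts mapping (x, y) -> [0, 1].
    def potential(pt):
        return (alpha * area_potential.get(pt, 0.0)
                + (1 - alpha) * object_potential.get(pt, 0.0))
    return max(frontiers, key=potential)
```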
"Embodied visual navigation" problem requires an agent to navigate in a 3D environment mainly rely on its first-person observation.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)