Emergence of Maps in the Memories of Blind Navigation Agents
- URL: http://arxiv.org/abs/2301.13261v1
- Date: Mon, 30 Jan 2023 20:09:39 GMT
- Authors: Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos,
Dhruv Batra
- Abstract summary: Animal navigation research posits that organisms build and maintain internal spatial representations, or maps, of their environment.
We ask if machines -- specifically, artificial intelligence (AI) navigation agents -- also build implicit (or 'mental') maps.
Unlike animal navigation, we can judiciously design the agent's perceptual system and control the learning paradigm to nullify alternative navigation mechanisms.
- Score: 68.41901534985575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Animal navigation research posits that organisms build and maintain internal
spatial representations, or maps, of their environment. We ask if machines --
specifically, artificial intelligence (AI) navigation agents -- also build
implicit (or 'mental') maps. A positive answer to this question would (a)
explain the surprising phenomenon in recent literature of ostensibly map-free
neural-networks achieving strong performance, and (b) strengthen the evidence
of mapping as a fundamental mechanism for navigation by intelligent embodied
agents, whether they be biological or artificial. Unlike animal navigation, we
can judiciously design the agent's perceptual system and control the learning
paradigm to nullify alternative navigation mechanisms. Specifically, we train
'blind' agents -- with sensing limited to only egomotion and no other sensing
of any kind -- to perform PointGoal navigation ('go to $\Delta x$, $\Delta y$')
via reinforcement learning. Our agents are composed of navigation-agnostic
components (fully-connected and recurrent neural networks), and our
experimental setup provides no inductive bias towards mapping. Despite these
harsh conditions, we find that blind agents are (1) surprisingly effective
navigators in new environments (~95% success); (2) they utilize memory over
long horizons (remembering ~1,000 steps of past experience in an episode); (3)
this memory enables them to exhibit intelligent behavior (following walls,
detecting collisions, taking shortcuts); (4) there is emergence of maps and
collision detection neurons in the representations of the environment built by
a blind agent as it navigates; and (5) the emergent maps are selective and task
dependent (e.g. the agent 'forgets' exploratory detours). Overall, this paper
presents no new techniques for the AI audience, but a surprising finding, an
insight, and an explanation.
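As a concrete illustration of the setup described in the abstract, the sketch below shows what such a 'blind' agent could look like in PyTorch: the only observations are the goal vector (kept current via egomotion) and the previous action, fed through fully-connected layers and a recurrent core. All names, layer sizes, and the four-action space are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of a 'blind' PointGoal agent: sensing is limited to the
# goal vector (updated by egomotion) and the previous action. Layer sizes,
# names, and the 4-action space are assumptions, not the paper's config.
import torch
import torch.nn as nn

class BlindPointGoalAgent(nn.Module):
    def __init__(self, num_actions=4, hidden_size=512):
        super().__init__()
        # Inputs: (delta_x, delta_y) goal vector + one-hot previous action.
        self.encoder = nn.Sequential(
            nn.Linear(2 + num_actions, hidden_size),
            nn.ReLU(),
        )
        # Recurrent core: the only place long-horizon memory can live.
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.policy = nn.Linear(hidden_size, num_actions)  # action logits
        self.value = nn.Linear(hidden_size, 1)             # critic head

    def forward(self, goal_vec, prev_action, state=None):
        # goal_vec: (B, T, 2), prev_action: (B, T, num_actions) one-hot
        x = self.encoder(torch.cat([goal_vec, prev_action], dim=-1))
        h, state = self.rnn(x, state)
        return self.policy(h), self.value(h), state

# One step of a hypothetical episode.
agent = BlindPointGoalAgent()
goal = torch.tensor([[[3.0, -1.5]]])   # 'go to dx, dy'
prev = torch.zeros(1, 1, 4)            # no previous action yet
logits, value, state = agent(goal, prev)
action = torch.distributions.Categorical(logits=logits).sample()
```

Because the recurrent state is the only persistent memory, any map-like structure later decoded from it must have been assembled from egomotion alone.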
Related papers
- Visuospatial navigation without distance, prediction, or maps [1.3812010983144802]
We show the sufficiency of a minimal feedforward framework in a classic visual navigation task.
While visual distance enables direct trajectories to the goal, two distinct algorithms develop to robustly navigate using visual angles alone.
Each of the three strategies confers unique contextual tradeoffs and aligns with movement behavior observed in rodents, insects, fish, and sperm cells.
arXiv Detail & Related papers (2024-07-18T14:07:44Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method that contrasts the agent's egocentric views with semantic maps.
Ego$^2$-Map learning transfers the compact and rich information in a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation (an illustrative sketch of this kind of contrastive objective follows this list).
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- Investigating Navigation Strategies in the Morris Water Maze through Deep Reinforcement Learning [4.408196554639971]
In this work, we simulate the Morris Water Maze in 2D to train deep reinforcement learning agents.
We automatically classify navigation strategies, analyze the distribution of strategies used by artificial agents, and compare them with experimental data to show learning dynamics similar to those seen in humans and rodents.
arXiv Detail & Related papers (2023-06-01T18:16:16Z)
- What do navigation agents learn about their environment? [39.74076893981299]
We introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal and Object Goal navigation agents.
We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment.
arXiv Detail & Related papers (2022-06-17T01:33:43Z)
- Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
"Embodied visual navigation" problem requires an agent to navigate in a 3D environment mainly rely on its first-person observation.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)
- Pushing it out of the Way: Interactive Visual Navigation [62.296686176988125]
We study the problem of interactive navigation where agents learn to change the environment to navigate more efficiently to their goals.
We introduce the Neural Interaction Engine (NIE) to explicitly predict the change in the environment caused by the agent's actions.
By modeling these changes while planning, agents exhibit significant improvements in their navigational capabilities.
arXiv Detail & Related papers (2021-04-28T22:46:41Z)
- Diagnosing Vision-and-Language Navigation: What Really Matters [61.72935815656582]
Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural language instructions and navigates in visual environments.
Recent studies have observed a slow-down in performance improvements on both indoor and outdoor VLN tasks.
In this work, we conduct a series of diagnostic experiments to unveil agents' focus during navigation.
arXiv Detail & Related papers (2021-03-30T17:59:07Z)
- Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-and-language navigation (VLN) is the task of having an agent carry out navigation instructions inside photo-realistic environments.
A key challenge in VLN is navigating robustly by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment.
This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
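The Ego$^2$-Map entry above describes contrasting egocentric views with semantic maps. As a rough illustration only, here is a minimal InfoNCE-style sketch of that kind of objective; the function name, temperature, and symmetric formulation are assumptions for illustration, not that paper's implementation.

```python
# Hedged sketch of a contrastive (InfoNCE-style) objective pairing
# egocentric-view embeddings with semantic-map embeddings, in the spirit
# of the Ego^2-Map entry above. Encoders and dimensions are assumed.
import torch
import torch.nn.functional as F

def contrastive_map_loss(view_emb, map_emb, temperature=0.07):
    """view_emb, map_emb: (B, D) embeddings of paired views and maps."""
    v = F.normalize(view_emb, dim=-1)
    m = F.normalize(map_emb, dim=-1)
    logits = v @ m.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(v.size(0))     # i-th view pairs with i-th map
    # Symmetric cross-entropy: match views to maps and maps to views.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Usage with random stand-in embeddings:
loss = contrastive_map_loss(torch.randn(8, 128), torch.randn(8, 128))
```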
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.