What do navigation agents learn about their environment?
- URL: http://arxiv.org/abs/2206.08500v1
- Date: Fri, 17 Jun 2022 01:33:43 GMT
- Title: What do navigation agents learn about their environment?
- Authors: Kshitij Dwivedi, Gemma Roig, Aniruddha Kembhavi, Roozbeh Mottaghi
- Abstract summary: We introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal and Object Goal navigation agents.
We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment.
- Score: 39.74076893981299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Today's state-of-the-art visual navigation agents typically consist of large deep learning models trained end to end. Such models offer little to no interpretability regarding the skills they learn or the actions they take in response to their environment. While past works have explored interpreting deep learning models, little attention has been devoted to interpreting embodied AI systems, which often involve reasoning about the structure of the environment, target characteristics, and the outcome of one's actions. In this paper, we introduce the Interpretability System for Embodied agEnts (iSEE) for Point Goal and Object Goal navigation agents. We use iSEE to probe the dynamic representations produced by these agents for the presence of information about the agent as well as the environment. Using iSEE, we demonstrate interesting insights about navigation agents, including their ability to encode reachable locations (to avoid obstacles), the visibility of the target, and progress from the initial spawn location, as well as the dramatic effect on agent behavior when we mask out critical individual neurons. The code is available at: https://github.com/allenai/iSEE
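To make the probing and masking ideas above concrete, here is a minimal sketch: fit a simple classifier on recorded hidden states to test whether a property (here, target visibility) is decodable, then zero out units one at a time to find critical neurons. The synthetic data, probe choice, and unit indices are illustrative placeholders, not the iSEE implementation (see the linked repository for that).

```python
# Minimal sketch of representation probing and neuron masking, in the
# spirit of iSEE. hidden_states would come from rolling out a trained
# navigation agent; target_visible from the simulator's ground truth.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Fake "hidden states": 2000 timesteps of a 512-d recurrent state in which
# unit 7 (by construction) encodes whether the target is visible.
hidden_states = rng.normal(size=(2000, 512))
target_visible = (hidden_states[:, 7] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, target_visible, test_size=0.25, random_state=0)

# Probe: if a simple classifier can read the property off the hidden
# state, the representation encodes it.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
base = probe.score(X_test, y_test)
print(f"probe accuracy: {base:.3f}")

# Masking: zero out one unit at a time and measure the drop in probe
# accuracy; a large drop flags a "critical" neuron for this property.
drops = []
for unit in range(hidden_states.shape[1]):
    X_masked = X_test.copy()
    X_masked[:, unit] = 0.0
    drops.append(base - probe.score(X_masked, y_test))
print("most critical unit:", int(np.argmax(drops)))
```

In practice the hidden states would be logged during evaluation episodes rather than sampled randomly, and masking would be applied inside the running agent to observe behavioral changes.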
Related papers
- Interpretable Brain-Inspired Representations Improve RL Performance on Visual Navigation Tasks [0.0]
We show how the method of slow feature analysis (SFA) overcomes both limitations by generating interpretable representations of visual data.
We employ SFA in a modern reinforcement learning context, analyse and compare representations and illustrate where hierarchical SFA can outperform other feature extractors on navigation tasks.
arXiv Detail & Related papers (2024-02-19T11:35:01Z)
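As a rough illustration of the SFA method referenced above, the sketch below implements the textbook linear variant: whiten the signal, then keep the directions along which the whitened signal changes most slowly. It is not the hierarchical SFA pipeline from the paper; the toy data and component count are assumptions.

```python
# Minimal linear slow feature analysis (SFA) in NumPy: whiten the input,
# then take the eigenvectors of the temporal-derivative covariance with
# the smallest eigenvalues (the slowest-varying directions).
import numpy as np

def linear_sfa(x, n_components=2):
    """x: (T, D) time series. Returns (T, n_components) slow features."""
    x = x - x.mean(axis=0)
    # Whitening via eigendecomposition of the covariance.
    cov = np.cov(x, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    z = x @ (eigvec / np.sqrt(eigval))
    # Slowness: smallest-eigenvalue directions of the derivative covariance.
    dz = np.diff(z, axis=0)
    dval, dvec = np.linalg.eigh(np.cov(dz, rowvar=False))  # ascending order
    return z @ dvec[:, :n_components]

# Toy signal: a slow sinusoid hidden in noisy channels; SFA recovers it.
t = np.linspace(0, 8 * np.pi, 2000)
slow = np.sin(0.1 * t)
x = np.column_stack([slow + 0.1 * np.random.randn(2000) for _ in range(5)])
features = linear_sfa(x, n_components=1)
print(features.shape)  # (2000, 1)
```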
- NavHint: Vision and Language Navigation Agent with a Hint Generator [31.322331792911598]
We provide indirect supervision to the navigation agent through a hint generator that produces detailed visual descriptions.
The hint generator assists the navigation agent in developing a global understanding of the visual environment.
We evaluate our method on the R2R and R4R datasets and achieve state-of-the-art results on several metrics.
arXiv Detail & Related papers (2024-02-04T16:23:16Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method that contrasts the agent's egocentric views with semantic maps.
Ego^2-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
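The contrastive view-to-map objective described above can be sketched with a standard InfoNCE loss between paired egocentric-view and map embeddings. The encoders are stubbed out with random tensors, and none of this reflects the paper's actual architecture.

```python
# Hedged sketch of contrasting egocentric views against semantic maps
# with an InfoNCE objective. Dimensions and data are placeholders.
import torch
import torch.nn.functional as F

def info_nce(view_emb, map_emb, temperature=0.07):
    """Paired batches: view_emb[i] should match map_emb[i] only."""
    view_emb = F.normalize(view_emb, dim=1)
    map_emb = F.normalize(map_emb, dim=1)
    logits = view_emb @ map_emb.t() / temperature   # (B, B) similarities
    labels = torch.arange(view_emb.size(0))         # positives on diagonal
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for encoder outputs.
view_emb = torch.randn(32, 128, requires_grad=True)
map_emb = torch.randn(32, 128, requires_grad=True)
loss = info_nce(view_emb, map_emb)
loss.backward()
print(float(loss))
```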
"Embodied visual navigation" problem requires an agent to navigate in a 3D environment mainly rely on its first-person observation.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)
- Pushing it out of the Way: Interactive Visual Navigation [62.296686176988125]
We study the problem of interactive navigation where agents learn to change the environment to navigate more efficiently to their goals.
We introduce the Neural Interaction Engine (NIE) to explicitly predict the change in the environment caused by the agent's actions.
By modeling the changes while planning, we find that agents exhibit significant improvements in their navigational capabilities.
arXiv Detail & Related papers (2021-04-28T22:46:41Z)
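The core idea of predicting action-conditioned environment change can be sketched as a small forward model: embed the action, concatenate it with the object state, and regress the next state. Layer sizes, the 7-d state encoding, and the action count below are assumptions for illustration, not the paper's Neural Interaction Engine.

```python
# Hedged sketch of a forward model in the spirit of the Neural Interaction
# Engine: predict an object's state after the agent's action.
import torch
import torch.nn as nn

class InteractionModel(nn.Module):
    def __init__(self, state_dim=7, num_actions=6, hidden=128):
        super().__init__()
        self.action_emb = nn.Embedding(num_actions, 32)
        self.net = nn.Sequential(
            nn.Linear(state_dim + 32, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))

    def forward(self, state, action):
        # Predict the *change* in state; a residual is typically easier
        # to learn than the absolute next state.
        delta = self.net(torch.cat([state, self.action_emb(action)], dim=1))
        return state + delta

model = InteractionModel()
state = torch.randn(8, 7)            # e.g. object position + orientation
action = torch.randint(0, 6, (8,))   # discrete navigation/push actions
next_state = model(state, action)
print(next_state.shape)              # torch.Size([8, 7])
```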
- Visual Navigation with Spatial Attention [26.888916048408895]
This work focuses on object goal visual navigation, aiming to find the location of an object of a given class.
We propose to learn the agent's policy using a reinforcement learning algorithm.
Our key contribution is a novel attention probability model for visual navigation tasks.
arXiv Detail & Related papers (2021-04-20T07:39:52Z)
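An attention probability model over spatial features can be sketched as a dot-product softmax between a goal-class embedding and each location of a CNN feature map. The shapes and scoring function here are illustrative assumptions rather than the authors' model.

```python
# Hedged sketch of target-conditioned spatial attention over CNN features.
import torch
import torch.nn.functional as F

def spatial_attention(feat_map, target_emb):
    """feat_map: (B, C, H, W) visual features; target_emb: (B, C)."""
    B, C, H, W = feat_map.shape
    feats = feat_map.flatten(2).transpose(1, 2)             # (B, H*W, C)
    scores = (feats @ target_emb.unsqueeze(2)).squeeze(2)   # (B, H*W)
    attn = F.softmax(scores / C ** 0.5, dim=1)              # probability map
    attended = (attn.unsqueeze(2) * feats).sum(dim=1)       # (B, C) summary
    return attended, attn.view(B, H, W)

feat_map = torch.randn(4, 256, 7, 7)   # e.g. CNN backbone output
target_emb = torch.randn(4, 256)       # embedding of the goal class
attended, attn = spatial_attention(feat_map, target_emb)
print(attended.shape, attn.shape)      # (4, 256) (4, 7, 7)
```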
- Diagnosing Vision-and-Language Navigation: What Really Matters [61.72935815656582]
Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural language instructions and navigates in visual environments.
Recent studies observe a slowdown in performance improvements on both indoor and outdoor VLN tasks.
In this work, we conduct a series of diagnostic experiments to unveil agents' focus during navigation.
arXiv Detail & Related papers (2021-03-30T17:59:07Z)
- Diagnosing the Environment Bias in Vision-and-Language Navigation [102.02103792590076]
Vision-and-Language Navigation (VLN) requires an agent to follow natural-language instructions, explore the given environments, and reach the desired target locations.
Recent works that study VLN observe a significant performance drop when tested on unseen environments, indicating that the neural agent models are highly biased towards training environments.
In this work, we design novel diagnosis experiments via environment re-splitting and feature replacement, looking into possible reasons for this environment bias.
arXiv Detail & Related papers (2020-05-06T19:24:33Z)