Multi-Object Navigation with dynamically learned neural implicit representations
- URL: http://arxiv.org/abs/2210.05129v2
- Date: Wed, 27 Sep 2023 11:17:18 GMT
- Title: Multi-Object Navigation with dynamically learned neural implicit representations
- Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf
- Abstract summary: We propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode.
We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source.
- Score: 10.182418917501064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and mapping a new environment are core abilities of any
autonomously navigating agent. While classical robotics usually estimates maps
in a stand-alone manner with SLAM variants, which maintain a topological or
metric representation, end-to-end learning of navigation keeps some form of
memory in a neural network. Networks are typically imbued with inductive
biases, which can range from vectorial representations to bird's-eye metric
tensors or topological structures. In this work, we propose to structure neural
networks with two neural implicit representations, which are learned
dynamically during each episode and map the content of the scene: (i) the
Semantic Finder predicts the position of a previously seen queried object; (ii)
the Occupancy and Exploration Implicit Representation encapsulates information
about explored area and obstacles, and is queried with a novel global read
mechanism which directly maps from function space to a usable embedding space.
Both representations are leveraged by an agent trained with Reinforcement
Learning (RL) and learned online during each episode. We evaluate the agent on
Multi-Object Navigation and show the high impact of using neural implicit
representations as a memory source.
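To make the idea concrete, below is a minimal sketch of the Semantic Finder: an MLP whose weights are fit online during the episode so that an object query embedding maps to a predicted 3D position. The class name, layer sizes, and the plain MSE fitting loop are illustrative assumptions for this sketch, not the authors' exact architecture or training procedure.
```python
import torch
import torch.nn as nn

class SemanticFinder(nn.Module):
    """Hypothetical sketch: maps an object query embedding to a 3D position.

    Only illustrates the idea of an implicit representation fit online per
    episode; the paper's actual architecture and losses may differ.
    """
    def __init__(self, query_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(query_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted (x, y, z) in scene coordinates
        )

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        return self.net(query)

def fit_online(model, queries, positions, steps: int = 50, lr: float = 1e-3):
    """Fit the representation to the (query, observed position) pairs seen so far."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(queries), positions)
        loss.backward()
        opt.step()
    return loss.item()

# Toy usage: 10 previously observed objects with 64-d query embeddings.
model = SemanticFinder()
queries = torch.randn(10, 64)
positions = torch.randn(10, 3)
fit_online(model, queries, positions)
predicted = model(queries[:1])  # query the position of one seen object
```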
Related papers
- A Role of Environmental Complexity on Representation Learning in Deep Reinforcement Learning Agents [3.7314353481448337]
We developed a simulated navigation environment to train deep reinforcement learning agents.
We modulated the frequency of exposure to a shortcut and navigation cue, leading to the development of artificial agents with differing abilities.
We examined the encoded representations in artificial neural networks driving these agents, revealing intricate dynamics in representation learning.
arXiv Detail & Related papers (2024-07-03T18:27:26Z)
- Augmented Commonsense Knowledge for Remote Object Grounding [67.30864498454805]
We propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a temporal knowledge graph for improving agent navigation.
ACK consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment.
We add a new pipeline for the commonsense-based decision-making process, which leads to more accurate local action prediction.
arXiv Detail & Related papers (2024-06-03T12:12:33Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable neural networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego^2-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
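The Ego^2-Map entry above learns by contrasting egocentric views with semantic maps. A minimal sketch of one plausible form of that objective, an InfoNCE-style loss over paired view and map embeddings, follows; the function name, embedding sizes, and temperature are assumptions for illustration, not the paper's exact loss.
```python
import torch
import torch.nn.functional as F

def view_map_contrastive_loss(view_emb, map_emb, temperature: float = 0.07):
    """InfoNCE-style loss: each egocentric view should match its own
    semantic-map embedding against the other maps in the batch."""
    view_emb = F.normalize(view_emb, dim=-1)
    map_emb = F.normalize(map_emb, dim=-1)
    logits = view_emb @ map_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(view_emb.size(0))       # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: a batch of 8 paired (view, map) embeddings.
loss = view_map_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```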
- Neural Network based Successor Representations of Space and Language [6.748976209131109]
We present a neural network based approach to learn multi-scale successor representations of structured knowledge.
In all scenarios, the neural network correctly learns and approximates the underlying structure by building successor representations.
We conclude that cognitive maps and neural network-based successor representations of structured knowledge provide a promising way to overcome some of the shortcomings of deep learning on the way towards artificial general intelligence.
arXiv Detail & Related papers (2022-02-22T21:52:46Z)
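For context on the entry above: for a fixed policy with state-transition matrix T and discount factor gamma, the successor representation has the closed form M = (I - gamma*T)^{-1}, which the cited networks learn to approximate. Below is a tiny NumPy sketch of that standard closed-form computation, not the paper's own code.
```python
import numpy as np

def successor_representation(T: np.ndarray, gamma: float = 0.9) -> np.ndarray:
    """Closed-form successor representation M = sum_t gamma^t T^t = (I - gamma*T)^-1
    for a fixed policy with stochastic state-transition matrix T."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Toy usage: a 3-state ring environment where the agent always steps right.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
M = successor_representation(T, gamma=0.9)
# M[s, s'] is the discounted expected future occupancy of s' starting from s.
```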
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
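The routing idea in the Neural Interpreters entry can be loosely illustrated with soft routing over a set of learned function modules, as in the sketch below. This is a heavily simplified reading: the module MLPs, signature vectors, and softmax routing here are assumptions chosen for brevity, not the published architecture.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftModuleRouting(nn.Module):
    """Loose sketch of routing tokens through learned function modules.

    Each token is processed by every module and the results are mixed by the
    softmax match between the token's inferred 'type' and each module's
    learned signature. The real Neural Interpreters design is more involved.
    """
    def __init__(self, dim: int = 64, n_modules: int = 4):
        super().__init__()
        self.funcs = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_modules)
        )
        self.type_proj = nn.Linear(dim, dim)                # infers a token 'type'
        self.signatures = nn.Parameter(torch.randn(n_modules, dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:      # (B, N, dim)
        types = self.type_proj(tokens)                            # (B, N, dim)
        weights = F.softmax(types @ self.signatures.t(), dim=-1)  # (B, N, M)
        outputs = torch.stack([f(tokens) for f in self.funcs], dim=-1)
        return (outputs * weights.unsqueeze(-2)).sum(-1)          # route and mix

# Toy usage: 8 sequences of 16 tokens.
layer = SoftModuleRouting()
out = layer(torch.randn(8, 16, 64))
```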
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.