Multi-Object Navigation with dynamically learned neural implicit representations
- URL: http://arxiv.org/abs/2210.05129v2
- Date: Wed, 27 Sep 2023 11:17:18 GMT
- Title: Multi-Object Navigation with dynamically learned neural implicit representations
- Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf
- Abstract summary: We propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode.
We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source.
- Score: 10.182418917501064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and mapping a new environment are core abilities of any
autonomously navigating agent. While classical robotics usually estimates maps
in a stand-alone manner with SLAM variants, which maintain a topological or
metric representation, end-to-end learning of navigation keeps some form of
memory in a neural network. Networks are typically imbued with inductive
biases, which can range from vectorial representations to bird's-eye metric
tensors or topological structures. In this work, we propose to structure neural
networks with two neural implicit representations, which are learned
dynamically during each episode and map the content of the scene: (i) the
Semantic Finder predicts the position of a previously seen queried object; (ii)
the Occupancy and Exploration Implicit Representation encapsulates information
about explored area and obstacles, and is queried with a novel global read
mechanism which directly maps from function space to a usable embedding space.
Both representations are leveraged by an agent trained with Reinforcement
Learning (RL) and learned online during each episode. We evaluate the agent on
Multi-Object Navigation and show the high impact of using neural implicit
representations as a memory source.
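To make the idea concrete, below is a minimal sketch of the Semantic Finder: an MLP whose weights are fit online during the episode so that an object query embedding maps to a predicted 3D position. The class name, layer sizes, and the plain MSE fitting loop are illustrative assumptions for this sketch, not the authors' exact architecture or training procedure.
```python
import torch
import torch.nn as nn

class SemanticFinder(nn.Module):
    """Hypothetical sketch: maps an object query embedding to a 3D position.

    Only illustrates the idea of an implicit representation fit online per
    episode; the paper's actual architecture and losses may differ.
    """
    def __init__(self, query_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(query_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # predicted (x, y, z) in scene coordinates
        )

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        return self.net(query)

def fit_online(model, queries, positions, steps: int = 50, lr: float = 1e-3):
    """Fit the representation to the (query, observed position) pairs seen so far."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(queries), positions)
        loss.backward()
        opt.step()
    return loss.item()

# Toy usage: 10 previously observed objects with 64-d query embeddings.
model = SemanticFinder()
queries = torch.randn(10, 64)
positions = torch.randn(10, 3)
fit_online(model, queries, positions)
predicted = model(queries[:1])  # query the position of one seen object
```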
Related papers
- A Role of Environmental Complexity on Representation Learning in Deep Reinforcement Learning Agents [3.7314353481448337]
We developed a simulated navigation environment to train deep reinforcement learning agents.
We modulated the frequency of exposure to a shortcut and navigation cue, leading to the development of artificial agents with differing abilities.
We examined the encoded representations in artificial neural networks driving these agents, revealing intricate dynamics in representation learning.
arXiv Detail & Related papers (2024-07-03T18:27:26Z)
- Augmented Commonsense Knowledge for Remote Object Grounding [67.30864498454805]
We propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a temporal knowledge graph for improving agent navigation.
ACK consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment.
We add a new pipeline for the commonsense-based decision-making process, which leads to more accurate local action prediction.
arXiv Detail & Related papers (2024-06-03T12:12:33Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable neural networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego^2-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
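The Ego^2-Map entry above learns by contrasting egocentric views with semantic maps. A minimal sketch of one plausible form of that objective, an InfoNCE-style loss over paired view and map embeddings, follows; the function name, embedding sizes, and temperature are assumptions for illustration, not the paper's exact loss.
```python
import torch
import torch.nn.functional as F

def view_map_contrastive_loss(view_emb, map_emb, temperature: float = 0.07):
    """InfoNCE-style loss: each egocentric view should match its own
    semantic-map embedding against the other maps in the batch."""
    view_emb = F.normalize(view_emb, dim=-1)
    map_emb = F.normalize(map_emb, dim=-1)
    logits = view_emb @ map_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(view_emb.size(0))       # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: a batch of 8 paired (view, map) embeddings.
loss = view_map_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```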
- Neural Network based Successor Representations of Space and Language [6.748976209131109]
We present a neural network based approach to learn multi-scale successor representations of structured knowledge.
In all scenarios, the neural network correctly learns and approximates the underlying structure by building successor representations.
We conclude that cognitive maps and neural network-based successor representations of structured knowledge provide a promising way to overcome some of the shortcomings of deep learning on the way towards artificial general intelligence.
arXiv Detail & Related papers (2022-02-22T21:52:46Z)
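For context on the entry above: for a fixed policy with state-transition matrix T and discount factor gamma, the successor representation has the closed form M = (I - gamma*T)^{-1}, which the cited networks learn to approximate. Below is a tiny NumPy sketch of that standard closed-form computation, not the paper's own code.
```python
import numpy as np

def successor_representation(T: np.ndarray, gamma: float = 0.9) -> np.ndarray:
    """Closed-form successor representation M = sum_t gamma^t T^t = (I - gamma*T)^-1
    for a fixed policy with stochastic state-transition matrix T."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

# Toy usage: a 3-state ring environment where the agent always steps right.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
M = successor_representation(T, gamma=0.9)
# M[s, s'] is the discounted expected future occupancy of s' starting from s.
```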
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
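The routing idea in the Neural Interpreters entry can be loosely illustrated with soft routing over a set of learned function modules, as in the sketch below. This is a heavily simplified reading: the module MLPs, signature vectors, and softmax routing here are assumptions chosen for brevity, not the published architecture.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftModuleRouting(nn.Module):
    """Loose sketch of routing tokens through learned function modules.

    Each token is processed by every module and the results are mixed by the
    softmax match between the token's inferred 'type' and each module's
    learned signature. The real Neural Interpreters design is more involved.
    """
    def __init__(self, dim: int = 64, n_modules: int = 4):
        super().__init__()
        self.funcs = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_modules)
        )
        self.type_proj = nn.Linear(dim, dim)                # infers a token 'type'
        self.signatures = nn.Parameter(torch.randn(n_modules, dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:      # (B, N, dim)
        types = self.type_proj(tokens)                            # (B, N, dim)
        weights = F.softmax(types @ self.signatures.t(), dim=-1)  # (B, N, M)
        outputs = torch.stack([f(tokens) for f in self.funcs], dim=-1)
        return (outputs * weights.unsqueeze(-2)).sum(-1)          # route and mix

# Toy usage: 8 sequences of 16 tokens.
layer = SoftModuleRouting()
out = layer(torch.randn(8, 16, 64))
```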
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.