Feature-Attending Recurrent Modules for Generalization in Reinforcement
Learning
- URL: http://arxiv.org/abs/2112.08369v3
- Date: Fri, 3 Nov 2023 15:12:28 GMT
- Title: Feature-Attending Recurrent Modules for Generalization in Reinforcement
Learning
- Authors: Wilka Carvalho, Andrew Lampinen, Kyriacos Nikiforou, Felix Hill,
Murray Shanahan
- Abstract summary: "Feature-Attending Recurrent Modules" (FARM) is an architecture for learning state representations that relies on simple, broadly applicable inductive biases for spatial and temporal regularities.
FARM learns a state representation that is distributed across multiple modules that each attend to spatiotemporal features with an expressive feature attention mechanism.
We show that this improves an RL agent's ability to generalize across object-centric tasks.
- Score: 27.736730414205137
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many important tasks are defined in terms of objects. To generalize across
these tasks, a reinforcement learning (RL) agent needs to exploit the structure
that the objects induce. Prior work has either hard-coded object-centric
features, used complex object-centric generative models, or updated state using
local spatial features. However, these approaches have had limited success in
enabling general RL agents. Motivated by this, we introduce "Feature-Attending
Recurrent Modules" (FARM), an architecture for learning state representations
that relies on simple, broadly applicable inductive biases for capturing
spatial and temporal regularities. FARM learns a state representation that is
distributed across multiple modules that each attend to spatiotemporal features
with an expressive feature attention mechanism. We show that this improves an
RL agent's ability to generalize across object-centric tasks. We study task
suites in both 2D and 3D environments and find that FARM better generalizes
compared to competing architectures that leverage attention or multiple
modules.
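The abstract describes the architecture only at a high level. Below is a minimal, hypothetical PyTorch sketch of what a FARM-style state representation might look like: a shared convolutional encoder, plus several recurrent modules whose hidden states each gate the encoder's feature channels ("feature attention") before updating. This is not the authors' implementation; the encoder shape, the sigmoid channel gating, the spatial average pooling, and all sizes are illustrative assumptions.
```python
# Hypothetical sketch of FARM-style feature-attending recurrent modules.
# Not the authors' code: encoder, gating form, and sizes are assumptions.
import torch
import torch.nn as nn


class FeatureAttendingRecurrentModules(nn.Module):
    def __init__(self, in_channels=3, feat_channels=32,
                 num_modules=4, hidden_size=64):
        super().__init__()
        # Shared observation encoder producing a spatial feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, feat_channels, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(feat_channels, feat_channels, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.num_modules = num_modules
        # One recurrent cell per module; each consumes its own attended summary.
        self.cells = nn.ModuleList(
            nn.LSTMCell(feat_channels, hidden_size) for _ in range(num_modules)
        )
        # Each module maps its hidden state to per-channel attention logits.
        self.to_attn = nn.ModuleList(
            nn.Linear(hidden_size, feat_channels) for _ in range(num_modules)
        )

    def forward(self, obs, state=None):
        # obs: (B, C, H, W); state: list of per-module (h, c) pairs.
        feats = self.encoder(obs)  # (B, F, H', W')
        batch = feats.shape[0]
        if state is None:
            state = [
                (feats.new_zeros(batch, cell.hidden_size),
                 feats.new_zeros(batch, cell.hidden_size))
                for cell in self.cells
            ]
        new_state, outputs = [], []
        for k in range(self.num_modules):
            h, c = state[k]
            # Feature attention: weight channels with a module-specific gate.
            gate = torch.sigmoid(self.to_attn[k](h))       # (B, F)
            attended = feats * gate[:, :, None, None]      # (B, F, H', W')
            summary = attended.mean(dim=(2, 3))            # pool over space
            h, c = self.cells[k](summary, (h, c))
            new_state.append((h, c))
            outputs.append(h)
        # Distributed state: concatenation of module hidden states.
        return torch.cat(outputs, dim=-1), new_state
```
In use, the concatenated module outputs would serve as the agent's state for downstream policy and value heads, with `state` carried across timesteps so that each module integrates its attended features recurrently.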
Related papers
- DistFormer: Enhancing Local and Global Features for Monocular Per-Object
Distance Estimation [35.6022448037063]
Per-object distance estimation is crucial in safety-critical applications such as autonomous driving, surveillance, and robotics.
Existing approaches rely on one of two scales: local information (i.e., the bounding box proportions) or global information.
Our work aims to strengthen both local and global cues.
arXiv Detail & Related papers (2024-01-06T10:56:36Z) - General-Purpose Multimodal Transformer meets Remote Sensing Semantic
Segmentation [35.100738362291416]
Multimodal AI seeks to exploit complementary data sources, particularly for complex tasks like semantic segmentation.
Recent trends in general-purpose multimodal networks have shown great potential to achieve state-of-the-art performance.
We propose a UNet-inspired module that employs 3D convolution to encode vital local information and learn cross-modal features simultaneously.
arXiv Detail & Related papers (2023-07-07T04:58:34Z) - INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z) - Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves unsupervised object discovery performance competitive with a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails, all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z) - Self-supervised Visual Reinforcement Learning with Object-centric
Representations [11.786249372283562]
We propose to use object-centric representations as a modular and structured observation space.
We show that the structure in the representations in combination with goal-conditioned attention policies helps the autonomous agent to discover and learn useful skills.
arXiv Detail & Related papers (2020-11-29T14:55:09Z) - Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a
First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, simulated 3D environments such as AI2-THOR pose significant sample-efficiency challenges for reinforcement learning agents.
We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
arXiv Detail & Related papers (2020-10-28T19:27:26Z) - Learning Robust State Abstractions for Hidden-Parameter Block MDPs [55.31018404591743]
We leverage ideas of common structure from the HiP-MDP setting to enable robust state abstractions inspired by Block MDPs.
We derive instantiations of this new framework for both multi-task reinforcement learning (MTRL) and meta-reinforcement learning (Meta-RL) settings.
arXiv Detail & Related papers (2020-07-14T17:25:27Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards models with dynamic structure that are capable of simultaneously exploiting both modular and spatiotemporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Look-into-Object: Self-supervised Structure Modeling for Object
Recognition [71.68524003173219]
We propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions.
We show the recognition backbone can be substantially enhanced for more robust representation learning.
Our approach achieves large performance gains on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft).
arXiv Detail & Related papers (2020-03-31T12:22:51Z) - Deep Sets for Generalization in RL [15.092941080981706]
This paper investigates the idea of encoding object-centered representations in the design of the reward function and policy architectures of a language-guided reinforcement learning agent.
In a 2D procedurally-generated world where agents targeting goals in natural language navigate and interact with objects, we show that these architectures demonstrate strong generalization capacities to out-of-distribution goals.
arXiv Detail & Related papers (2020-03-20T18:22:40Z)