MoCoDA: Model-based Counterfactual Data Augmentation
- URL: http://arxiv.org/abs/2210.11287v1
- Date: Thu, 20 Oct 2022 14:09:48 GMT
- Title: MoCoDA: Model-based Counterfactual Data Augmentation
- Authors: Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg
- Abstract summary: We argue that the ability to recognize and use local factorization in transition dynamics is a key element in unlocking the power of multi-object reasoning.
Knowing the local structure also allows us to predict which unseen states and actions this dynamics model will generalize to.
We show that MoCoDA enables RL agents to learn policies that generalize to unseen states and actions.
- Score: 40.878444530293635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The number of states in a dynamic process is exponential in the number of
objects, making reinforcement learning (RL) difficult in complex, multi-object
domains. For agents to scale to the real world, they will need to react to and
reason about unseen combinations of objects. We argue that the ability to
recognize and use local factorization in transition dynamics is a key element
in unlocking the power of multi-object reasoning. To this end, we show that (1)
known local structure in the environment transitions is sufficient for an
exponential reduction in the sample complexity of training a dynamics model,
and (2) a locally factored dynamics model provably generalizes
out-of-distribution to unseen states and actions. Knowing the local structure
also allows us to predict which unseen states and actions this dynamics model
will generalize to. We propose to leverage these observations in a novel
Model-based Counterfactual Data Augmentation (MoCoDA) framework. MoCoDA applies
a learned locally factored dynamics model to an augmented distribution of
states and actions to generate counterfactual transitions for RL. MoCoDA works
with a broader set of local structures than prior work and allows for direct
control over the augmented training distribution. We show that MoCoDA enables
RL agents to learn policies that generalize to unseen states and actions. We
use MoCoDA to train an offline RL agent to solve an out-of-distribution
robotics manipulation task on which standard offline RL algorithms fail.
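To make the framework concrete, below is a minimal, hypothetical sketch of the MoCoDA idea in Python. It is not the authors' released implementation: the two-object factorization, the component names, and the linear stand-in for the learned dynamics model are illustrative assumptions. In the non-interacting regime each next-state component depends only on its known parent set, so per-component models are fit on their parents only; an augmented state-action distribution is then formed by recombining observed parent-set values, and the factored model labels those unseen combinations with counterfactual next states.

```python
import numpy as np

# Hypothetical sketch of MoCoDA-style augmentation (illustrative, not the paper's code).
# Two-object state s = (s_A, s_B) with known local factorization away from contact:
#   s_A' depends only on (s_A, a)   and   s_B' depends only on s_B.

def fit_component_model(inputs, targets):
    """Least-squares linear map as a stand-in for a learned per-component dynamics model."""
    X = np.hstack([inputs, np.ones((len(inputs), 1))])          # append bias column
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return lambda x: np.hstack([x, np.ones((len(x), 1))]) @ W

def mocoda_augment(data, model_a, model_b, n_aug, rng):
    """Recombine observed parent-set values into unseen (s, a) pairs, then label
    them with the factored model to obtain counterfactual transitions."""
    ia = rng.integers(len(data["s_a"]), size=n_aug)             # sample parent set (s_A, a)
    ib = rng.integers(len(data["s_b"]), size=n_aug)             # sample parent set (s_B,) independently
    s_a, act, s_b = data["s_a"][ia], data["a"][ia], data["s_b"][ib]
    next_a = model_a(np.hstack([s_a, act]))                     # s_A' = f_A(s_A, a)
    next_b = model_b(s_b)                                       # s_B' = f_B(s_B)
    return {"s": np.hstack([s_a, s_b]), "a": act,
            "s_next": np.hstack([next_a, next_b])}

# Usage: fit each component model from (toy) real transitions, then augment.
rng = np.random.default_rng(0)
data = {"s_a": rng.normal(size=(500, 2)), "s_b": rng.normal(size=(500, 2)),
        "a": rng.normal(size=(500, 2))}
true_next_a = data["s_a"] + 0.1 * data["a"]                     # toy ground-truth dynamics
true_next_b = 0.9 * data["s_b"]
f_a = fit_component_model(np.hstack([data["s_a"], data["a"]]), true_next_a)
f_b = fit_component_model(data["s_b"], true_next_b)
aug = mocoda_augment(data, f_a, f_b, n_aug=1000, rng=rng)       # counterfactual transitions for RL
```

Because each component model sees only its parent set rather than the full state, the number of input configurations it must cover shrinks exponentially, which is the intuition behind claim (1) above; the generated transitions (after reward relabeling) would then be appended to the dataset used by the downstream offline RL algorithm.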
Related papers
- Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.94827590977337]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability.
We also introduce a Perceiver Transformer to enable effective centralized representation aggregation.
Results on the StarCraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z)
- Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning [16.23977055134524]
We propose a novel sequence-model-based action predictor named Mamba Decision Maker (MambaDM).
MambaDM is expected to be a promising alternative to existing sequence modeling paradigms, owing to its efficient modeling of multi-scale dependencies.
This paper delves into the sequence modeling capabilities of MambaDM in the RL domain, paving the way for future advancements.
arXiv Detail & Related papers (2024-06-04T06:49:18Z)
- A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations [0.34410212782758043]
Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments.
This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations.
arXiv Detail & Related papers (2023-07-06T12:33:34Z)
- Causal Dynamics Learning for Task-Independent State Abstraction [61.707048209272884]
We introduce Causal Dynamics Learning for Task-Independent State Abstraction (CDL).
CDL learns a causal dynamics model, with theoretical guarantees, that removes unnecessary dependencies between state variables and the action.
A state abstraction can then be derived from the learned dynamics.
arXiv Detail & Related papers (2022-06-27T17:02:53Z)
- Model-Invariant State Abstractions for Model-Based Reinforcement Learning [54.616645151708994]
We introduce a new type of state abstraction called model-invariance.
This allows for generalization to novel combinations of unseen values of state variables.
We prove that an optimal policy can be learned over this model-invariant state abstraction.
arXiv Detail & Related papers (2021-02-19T10:37:54Z)
- Counterfactual Data Augmentation using Locally Factored Dynamics [44.37487079747397]
Local causal structures can be leveraged to improve the sample efficiency of sequence prediction and off-policy reinforcement learning.
We propose an approach to inferring these structures given an object-oriented state representation, as well as a novel algorithm for Counterfactual Data Augmentation.
arXiv Detail & Related papers (2020-07-06T16:29:00Z)
- Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization across various simulated robotics and control tasks compared to existing RL schemes; a generic sketch of the two-stage scheme appears after this list.
arXiv Detail & Related papers (2020-05-14T08:10:54Z)
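As a companion illustration, here is a generic, self-contained sketch of the two-stage context-aware scheme summarized in the last entry above. It is an assumed PyTorch rendering, not that paper's implementation: a context encoder compresses a short history of transitions into a latent vector, and forward and backward dynamics heads condition on that latent, so that optimizing both prediction losses pushes dynamics-specific information into the context.

```python
import torch
import torch.nn as nn

# Generic sketch of a two-stage context-aware dynamics model (not the authors' code).
# Stage (a): encode a short history of transitions into a context latent c.
# Stage (b): predict the next state (and, for the auxiliary loss, the previous state)
#            conditioned on c.

def mlp(inp, out, hidden=128):
    return nn.Sequential(nn.Linear(inp, hidden), nn.ReLU(), nn.Linear(hidden, out))

class ContextAwareDynamics(nn.Module):
    def __init__(self, state_dim, action_dim, context_dim=16, history_len=5):
        super().__init__()
        step_dim = state_dim + action_dim + state_dim            # (s, a, s') per past step
        self.encoder = mlp(history_len * step_dim, context_dim)
        self.forward_model = mlp(state_dim + action_dim + context_dim, state_dim)
        self.backward_model = mlp(state_dim + action_dim + context_dim, state_dim)

    def loss(self, history, s, a, s_next):
        c = self.encoder(history.flatten(start_dim=1))           # (B, context_dim)
        pred_next = self.forward_model(torch.cat([s, a, c], dim=-1))
        pred_prev = self.backward_model(torch.cat([s_next, a, c], dim=-1))
        # Joint forward + backward prediction loss, encouraging c to capture local dynamics.
        return ((pred_next - s_next) ** 2).mean() + ((pred_prev - s) ** 2).mean()
```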