CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and
Transfer Learning
- URL: http://arxiv.org/abs/2010.04296v2
- Date: Tue, 24 Nov 2020 16:05:50 GMT
- Title: CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and
Transfer Learning
- Authors: Ossama Ahmed and Frederik Träuble and Anirudh Goyal and Alexander
Neitz and Yoshua Bengio and Bernhard Schölkopf and Manuel Wüthrich and
Stefan Bauer
- Abstract summary: CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
- Score: 138.40338621974954
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent successes of reinforcement learning (RL), it remains a
challenge for agents to transfer learned skills to related environments. To
facilitate research addressing this problem, we propose CausalWorld, a
benchmark for causal structure and transfer learning in a robotic manipulation
environment. The environment is a simulation of an open-source robotic
platform, hence offering the possibility of sim-to-real transfer. Tasks consist
of constructing 3D shapes from a given set of blocks - inspired by how children
learn to build complex structures. The key strength of CausalWorld is that it
provides a combinatorial family of such tasks with common causal structure and
underlying factors (including, e.g., robot and object masses, colors, sizes).
The user (or the agent) may intervene on all causal variables, which allows for
fine-grained control over how similar different tasks (or task distributions)
are. One can thus easily define training and evaluation distributions of a
desired difficulty level, targeting a specific form of generalization (e.g.,
only changes in appearance or object mass). Further, this common
parametrization facilitates defining curricula by interpolating between an
initial and a target task. While users may define their own task distributions,
we present eight meaningful distributions as concrete benchmarks, ranging from
simple to very challenging, all of which require long-horizon planning as well
as precise low-level motor control. Finally, we provide baseline results for a
subset of these tasks on distinct training curricula and corresponding
evaluation protocols, verifying the feasibility of the tasks in this benchmark.
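The curriculum idea in the abstract, interpolating between an initial and a target task over a shared parametrization, can be illustrated with a small sketch. This is not the CausalWorld API: the variable names (`block_mass`, `block_size`, `goal_height`), their values, and the helper functions below are all hypothetical, and linear interpolation is only one plausible reading of "interpolating".

```python
import random

# Hypothetical causal variables for a block-building task. Names and values
# are illustrative assumptions, not taken from the CausalWorld code base.
INITIAL_TASK = {"block_mass": 0.02, "block_size": 0.065, "goal_height": 0.065}
TARGET_TASK = {"block_mass": 0.10, "block_size": 0.040, "goal_height": 0.200}


def interpolate_task(initial, target, alpha):
    """Linearly blend each variable: alpha=0 gives initial, alpha=1 gives target."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: (1.0 - alpha) * initial[name] + alpha * target[name]
        for name in initial
    }


def curriculum(initial, target, num_stages):
    """Return num_stages tasks spaced evenly from initial to target."""
    if num_stages < 2:
        raise ValueError("a curriculum needs at least two stages")
    return [
        interpolate_task(initial, target, stage / (num_stages - 1))
        for stage in range(num_stages)
    ]


def sample_intervention(task, variable, low, high, rng):
    """Copy the task and resample one variable uniformly from [low, high]."""
    intervened = dict(task)  # leave the original task untouched
    intervened[variable] = rng.uniform(low, high)
    return intervened


if __name__ == "__main__":
    for stage, task in enumerate(curriculum(INITIAL_TASK, TARGET_TASK, 5)):
        print(stage, task)
    # An evaluation distribution that varies only the object mass.
    rng = random.Random(0)
    print(sample_intervention(TARGET_TASK, "block_mass", 0.05, 0.15, rng))
```

Resampling a single variable while holding the others fixed mirrors the abstract's point that an evaluation distribution can target one specific form of generalization, such as changes in object mass only.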
Related papers
- A Unified Causal View of Instruction Tuning [76.1000380429553]
We develop a meta Structural Causal Model (meta-SCM) to integrate different NLP tasks under a single causal structure of the data.
The key idea is to learn the causal factors each task requires and to use only those to make predictions for that task.
arXiv Detail & Related papers (2024-02-09T07:12:56Z)
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones.
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
- Learning Top-k Subtask Planning Tree based on Discriminative Representation Pre-training for Decision Making [9.302910360945042]
Planning with prior knowledge extracted from complicated real-world tasks is crucial for humans to make accurate decisions.
We introduce a multiple-encoder and individual-predictor regime to learn task-essential representations from sufficient data for simple subtasks.
We also use the attention mechanism to generate a top-k subtask planning tree, which customizes subtask execution plans in guiding complex decisions on unseen tasks.
arXiv Detail & Related papers (2023-12-18T09:00:31Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provide the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- Unveiling Transformers with LEGO: a synthetic reasoning task [23.535488809197787]
We study how the transformer architecture learns to follow a chain of reasoning.
In some data regimes, the trained transformer finds "shortcut" solutions to follow the chain of reasoning.
We find that one can prevent such shortcuts with appropriate architecture modifications or careful data preparation.
arXiv Detail & Related papers (2022-06-09T06:30:17Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
- Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery [34.25379651790627]
We tackle the problem of building arbitrary, predefined target structures entirely from scratch using a set of Tetris-like building blocks and a robotic manipulator.
Our novel hierarchical approach aims at efficiently decomposing the overall task into three feasible levels that benefit mutually from each other.
arXiv Detail & Related papers (2022-03-08T14:44:51Z)