Provably Efficient Causal Model-Based Reinforcement Learning for
Systematic Generalization
- URL: http://arxiv.org/abs/2202.06545v3
- Date: Thu, 30 Mar 2023 12:25:04 GMT
- Title: Provably Efficient Causal Model-Based Reinforcement Learning for
Systematic Generalization
- Authors: Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon,
Michael Bronstein, Marcello Restelli
- Abstract summary: In the sequential decision making setting, an agent aims to achieve systematic generalization over a large, possibly infinite, set of environments.
In this paper, we provide a tractable formulation of systematic generalization by employing a causal viewpoint.
Under specific structural assumptions, we provide a simple learning algorithm that guarantees any desired planning error up to an unavoidable sub-optimality term.
- Score: 30.456180468318305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the sequential decision making setting, an agent aims to achieve
systematic generalization over a large, possibly infinite, set of environments.
Such environments are modeled as discrete Markov decision processes with both
states and actions represented through a feature vector. The underlying
structure of the environments allows the transition dynamics to be factored
into two components: one that is environment-specific and another that is
shared. Consider a set of environments that share the laws of motion as an
example. In this setting, the agent can collect a finite number of reward-free
interactions from a subset of these environments. The agent must then be able
to approximately solve any planning task defined over any environment in the
original set, relying on the above interactions only. Can we design a provably
efficient algorithm that achieves this ambitious goal of systematic
generalization? In this paper, we give a partially positive answer to this
question. First, we provide a tractable formulation of systematic
generalization by employing a causal viewpoint. Then, under specific structural
assumptions, we provide a simple learning algorithm that guarantees any desired
planning error up to an unavoidable sub-optimality term, while showcasing a
polynomial sample complexity.
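
To make the factored-dynamics setup above concrete, here is a minimal, illustrative sketch in Python. It is not the paper's formulation: the paper treats discrete MDPs with feature-based states and actions, while this toy uses continuous feature vectors, and the names (`shared_dynamics`, `Environment`, `specific_params`) are assumptions made for illustration only. The point is simply that transitions compose a shared mechanism, common to every environment, with an environment-specific one, and that the agent collects only reward-free transitions from a few training environments.

```python
# Illustrative sketch (not the paper's exact formulation): next-state features
# are produced by composing a shared mechanism (e.g., laws of motion common to
# all environments) with an environment-specific mechanism. All names and
# shapes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def shared_dynamics(state, action):
    """Mechanism shared by every environment (e.g., kinematics)."""
    return state + 0.1 * action

class Environment:
    """One environment from the (possibly infinite) family."""

    def __init__(self, specific_params):
        # Environment-specific component of the factored dynamics.
        self.specific_params = specific_params

    def step(self, state, action):
        # Factored transition: shared part + environment-specific part + noise.
        shared_part = shared_dynamics(state, action)
        specific_part = self.specific_params * state
        return shared_part + specific_part + 0.01 * rng.standard_normal(state.shape)

# Reward-free interactions are collected only from a subset of environments...
train_envs = [Environment(rng.uniform(-0.1, 0.1, size=4)) for _ in range(3)]
dataset = []
for env in train_envs:
    state = rng.standard_normal(4)
    for _ in range(100):
        action = rng.standard_normal(4)
        next_state = env.step(state, action)
        dataset.append((state, action, next_state))  # no rewards recorded
        state = next_state

# ...and a model of the shared component estimated from `dataset` would then be
# reused to plan in any environment of the family, including unseen ones.
print(f"collected {len(dataset)} reward-free transitions")
```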
Related papers
- Intrinsically Motivated Hierarchical Policy Learning in Multi-objective
Markov Decision Processes [15.50007257943931]
We propose a novel dual-phase intrinsically motivated reinforcement learning method to address a key limitation of existing methods.
We show experimentally that the proposed method significantly outperforms state-of-the-art multi-objective reinforcement methods in a dynamic robotics environment.
arXiv Detail & Related papers (2023-08-18T02:10:45Z)
- Constrained Environment Optimization for Prioritized Multi-Agent Navigation [11.473177123332281]
This paper aims to consider the environment as a decision variable in a system-level optimization problem.
We propose novel problems of unprioritized and prioritized environment optimization.
We show, through formal proofs, under which conditions the environment can change while guaranteeing completeness.
arXiv Detail & Related papers (2023-05-18T18:55:06Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z)
- CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is the introduction of agency, which makes it simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
- The Advantage of Conditional Meta-Learning for Biased Regularization and Fine-Tuning [50.21341246243422]
Biased regularization and fine-tuning are two recent meta-learning approaches.
We propose conditional meta-learning, which infers a conditioning function mapping a task's side information into a meta-parameter vector.
We then propose a convex meta-algorithm that provides a comparable advantage in practice as well (see the sketch after this list).
arXiv Detail & Related papers (2020-08-25T07:32:16Z)
- Invariant Causal Prediction for Block MDPs [106.63346115341862]
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges.
We propose a method of invariant prediction to learn model-irrelevance state abstractions (MISA) that generalize to novel observations in the multi-environment setting.
arXiv Detail & Related papers (2020-03-12T21:03:01Z)
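
As a side note on the conditional meta-learning entry above, the following sketch illustrates the general idea of a conditioning function that maps a task's side information to a meta-parameter vector, used here as the bias point of a biased ridge regression. The linear conditioning map and the ridge solver are illustrative assumptions, not the algorithm from that paper.

```python
# Minimal sketch of conditional meta-learning for biased regularization:
# a conditioning function maps task side information to a meta-parameter
# (bias) vector, and the per-task solver is regularized toward that bias.
# The linear map and ridge solver below are illustrative assumptions.
import numpy as np

def conditioning_function(side_info, W):
    """Map task side information to a meta-parameter (bias) vector."""
    return W @ side_info

def biased_ridge(X, y, bias, lam=1.0):
    """Ridge regression regularized toward `bias` instead of toward zero."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * bias
    return np.linalg.solve(A, b)

rng = np.random.default_rng(1)
d, k = 5, 3
W = rng.standard_normal((d, k))        # conditioning map (learned by the meta-algorithm)
side_info = rng.standard_normal(k)     # task-specific side information
X = rng.standard_normal((20, d))       # task training inputs
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(20)

bias = conditioning_function(side_info, W)
w_task = biased_ridge(X, y, bias)
print(w_task.shape)  # per-task weights biased toward the conditioned meta-parameter
```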