Investigating the role of model-based learning in exploration and
transfer
- URL: http://arxiv.org/abs/2302.04009v1
- Date: Wed, 8 Feb 2023 11:49:58 GMT
- Title: Investigating the role of model-based learning in exploration and
transfer
- Authors: Jacob Walker, Eszter Vértes, Yazhe Li, Gabriel Dulac-Arnold, Ankesh Anand, Théophane Weber, Jessica B. Hamrick
- Abstract summary: In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models presents a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
- Score: 11.652741003589027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art reinforcement learning has enabled training agents
on tasks of ever-increasing complexity. However, the current paradigm tends to favor
training agents from scratch on every new task or on collections of tasks with
a view towards generalizing to novel task configurations. The former suffers
from poor data efficiency while the latter is difficult when test tasks are
out-of-distribution. Agents that can effectively transfer their knowledge about
the world pose a potential solution to these issues. In this paper, we
investigate transfer learning in the context of model-based agents.
Specifically, we aim to understand when exactly environment models have an
advantage and why. We find that a model-based approach outperforms controlled
model-free baselines for transfer learning. Through ablations, we show that
both the policy and dynamics model learnt through exploration matter for
successful transfer. We demonstrate our results across three domains which vary
in their requirements for transfer: in-distribution procedural (Crafter),
in-distribution identical (RoboDesk), and out-of-distribution (Meta-World). Our
results show that intrinsic exploration combined with environment models
presents a viable direction towards agents that are self-supervised and able to
generalize to novel reward functions.
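
The abstract does not spell out the exploration bonus here; as a concrete illustration, one common model-based intrinsic signal rewards states and actions where an ensemble of learned dynamics models disagrees. The linear models, shapes, and ensemble size below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, ensemble_size = 4, 2, 5

def make_linear_model():
    """A toy 'dynamics model': next_state = W @ [state; action]."""
    W = rng.normal(scale=0.1, size=(state_dim, state_dim + action_dim))
    return lambda s, a: W @ np.concatenate([s, a])

ensemble = [make_linear_model() for _ in range(ensemble_size)]

def intrinsic_reward(state, action):
    """High where the ensemble's next-state predictions disagree."""
    preds = np.stack([m(state, action) for m in ensemble])  # (E, state_dim)
    return preds.var(axis=0).mean()

s, a = rng.normal(size=state_dim), rng.normal(size=action_dim)
print("intrinsic reward:", intrinsic_reward(s, a))
```

An agent trained on this bonus alone is self-supervised; the paper's point is that the policy and dynamics model acquired this way both carry over to novel reward functions.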
Related papers
- MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.68829963458408]
We present MergeNet, which learns to bridge the gap between the parameter spaces of heterogeneous models.
The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters.
MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
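
The summary leaves the adapter mechanism abstract; one plausible reading, sketched below, factorizes a source weight matrix into a low-rank code and maps that code into the target's parameter space with a linear adapter. The rank, shapes, and linear form are assumptions, not MergeNet's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source and target layers of different shapes (heterogeneous models).
W_src = rng.normal(size=(64, 32))
W_tgt = rng.normal(size=(16, 8))

# Rank-r summary of the source parameters (the rank is an assumption).
r = 4
U, S, Vt = np.linalg.svd(W_src, full_matrices=False)
code = ((U[:, :r] * S[:r]) @ Vt[:r]).reshape(-1)  # flattened low-rank query

# A linear adapter projecting the source code into the target's shape;
# in MergeNet this adapter would be learned alongside both models.
A = rng.normal(scale=0.01, size=(W_tgt.size, code.size))
W_tgt_adapted = W_tgt + (A @ code).reshape(W_tgt.shape)
print("adapted target shape:", W_tgt_adapted.shape)
```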
arXiv Detail & Related papers (2024-04-20T08:34:39Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use world models not only as simulators for behavior learning but also as tools to measure task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
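
As a hedged illustration of "world models as tools to measure task relevance," one could score offline trajectories by the current world model's prediction error and keep the best-explained ones; the linear model and scoring rule below are assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# A world model fit to the novel task (random here, for illustration).
W = rng.normal(scale=0.1, size=(4, 6))
world_model = lambda s, a: W @ np.concatenate([s, a])

def relevance(trajectory):
    """Read low world-model prediction error as high task relevance."""
    err = sum(np.sum((world_model(s, a) - s_next) ** 2)
              for s, a, s_next in trajectory)
    return -err / len(trajectory)

# Rank offline trajectories and keep the best-explained ones.
offline = [[(rng.normal(size=4), rng.normal(size=2), rng.normal(size=4))
            for _ in range(10)] for _ in range(20)]
ranked = sorted(offline, key=relevance, reverse=True)
print("best relevance score:", relevance(ranked[0]))
```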
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Self-Supervised Reinforcement Learning that Transfers using Random Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
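
One way such transfer can work, sketched under assumptions below, is to learn value estimates for random features of the state offline, then at test time regress a new reward onto those features and combine the value estimates linearly. All shapes and the stubbed feature values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, n_features = 4, 32

# Random features of the state; value functions for them can be learned
# offline without reward labels.
W_phi = rng.normal(size=(n_features, state_dim))
phi = lambda s: np.tanh(W_phi @ s)

# Pretend per-feature values under some policy were learned offline;
# stubbed with random numbers here.
feature_values = rng.normal(size=n_features)

# At test time, regress the new reward onto the random features...
states = rng.normal(size=(256, state_dim))
new_rewards = states[:, 0]                 # hypothetical new-task reward
Phi = np.stack([phi(s) for s in states])
w, *_ = np.linalg.lstsq(Phi, new_rewards, rcond=None)

# ...then estimate the new task's value by linearity of expectation.
print("estimated value under new reward:", w @ feature_values)
```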
arXiv Detail & Related papers (2023-05-26T20:37:06Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides the MBRL agent with training samples drawn from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solved prior tasks from the storage.
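
A minimal sketch of an addressing model, assuming trajectory descriptors and a task embedding of matching dimension: score stored trajectories against the task query and select the top-k. This illustrates the retrieval idea, not RAMa's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

storage = rng.normal(size=(100, 8))   # 100 stored trajectory descriptors
task_query = rng.normal(size=8)       # descriptor of the current task

def address(storage, query, k=5):
    """Softmax similarity between the task query and stored descriptors,
    followed by top-k selection of promising trajectories."""
    scores = storage @ query
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return np.argsort(probs)[-k:]

print("selected trajectory indices:", address(storage, task_query))
```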
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes a novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
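
As a rough, hedged illustration of scoring actions by an expected-future free energy, the toy below combines an extrinsic term (expected surprise under reward-derived prior preferences, echoing the reward-to-prior approximation above) with a crude entropy-based epistemic proxy; the real epistemic term is an information gain, and the tabular model is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3

# Toy generative model P(s' | s, a) plus prior preferences over states
# derived from reward (a rough stand-in for the paper's prior model).
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
reward = np.array([0., 0., 0., 0., 1.])
prior_pref = np.exp(reward) / np.exp(reward).sum()

def expected_free_energy(s, a):
    q = P[s, a]                                    # predicted state dist.
    extrinsic = -(q @ np.log(prior_pref + 1e-12))  # surprise vs. preferences
    epistemic = -(q @ np.log(q + 1e-12))           # entropy as a crude proxy
    return extrinsic - epistemic

s = 0
best = min(range(n_actions), key=lambda a: expected_free_energy(s, a))
print("action minimizing expected free energy:", best)
```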
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Feature-Based Interpretable Reinforcement Learning based on State-Transition Models [3.883460584034766]
Growing concerns regarding the operational usage of AI models in the real world have caused a surge of interest in explaining AI models' decisions to humans.
We propose a method for offering local explanations on risk in reinforcement learning.
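
The summary does not define "risk" precisely; one hedged reading, sketched below, estimates risk as the probability of hitting a failure condition under rollouts of a learned transition model, and explains it locally by perturbing individual state features. The failure condition and transition model are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def step(s):
    """Toy learned transition model: a small random drift."""
    return s + rng.normal(scale=0.1, size=s.shape)

def risk(s, horizon=10, samples=200):
    """Estimated probability of hitting a (hypothetical) failure
    condition within the horizon, via model rollouts."""
    crashes = 0
    for _ in range(samples):
        x = s.copy()
        for _ in range(horizon):
            x = step(x)
            if x[0] > 1.0:        # hypothetical failure condition
                crashes += 1
                break
    return crashes / samples

# Local explanation: how much each feature perturbation changes the risk.
s0 = np.zeros(3)
base = risk(s0)
for i in range(len(s0)):
    s = s0.copy()
    s[i] += 0.5
    print(f"feature {i}: risk {base:.2f} -> {risk(s):.2f}")
```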
arXiv Detail & Related papers (2021-05-14T23:43:11Z)
- Which Model to Transfer? Finding the Needle in the Growing Haystack [27.660318887140203]
We provide a formalization of this problem through a familiar notion of regret.
We show that both task-agnostic and task-aware methods can yield high regret.
We then propose a simple and efficient hybrid search strategy which outperforms the existing approaches.
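
The summary does not state the regret definition; a plausible form, with $\sigma$ a model-search strategy, $\mathcal{M}$ the pool of source models, and $\mathrm{acc}$ downstream accuracy after transfer, is:

```latex
% Hedged sketch; the paper's exact formalization may differ.
\mathrm{regret}(\sigma) = \max_{m \in \mathcal{M}} \mathrm{acc}(m)
  - \mathrm{acc}\bigl(\sigma(\mathcal{M})\bigr)
```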
arXiv Detail & Related papers (2020-10-13T14:00:22Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
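
One hedged way to "direct prediction towards task-relevant information" is to condition the dynamics model on the goal and train it to predict the residual to the goal rather than the full next state; the linear model and shapes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sdim, adim = 6, 2  # the goal lives in the same space as the state here

# Goal-conditioned linear dynamics model (shapes are assumptions).
W = rng.normal(scale=0.1, size=(sdim, sdim + adim + sdim))

def predict_goal_residual(s, a, g):
    return W @ np.concatenate([s, a, g])

def loss(s, a, g, s_next):
    """Train against the residual to the goal, not the full next state,
    so model capacity is spent on task-relevant structure."""
    target = g - s_next
    pred = predict_goal_residual(s, a, g)
    return float(np.mean((pred - target) ** 2))

s, a, g, s_next = (rng.normal(size=d) for d in (sdim, adim, sdim, sdim))
print("goal-aware prediction loss:", loss(s, a, g, s_next))
```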
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
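
A minimal sketch of the "adapt the dynamics model with off-policy data, then reuse old experience" insight, assuming a linear model and plain SGD; the paper's meta-learned initialization and relabeling procedure are richer than this.

```python
import numpy as np

rng = np.random.default_rng(0)
sdim, adim = 4, 2
W = rng.normal(scale=0.1, size=(sdim, sdim + adim))  # model initialization

def adapt(W, transitions, lr=0.05, steps=10):
    """A few SGD steps on squared prediction error, using off-policy
    transitions from the new task."""
    for _ in range(steps):
        for s, a, s_next in transitions:
            x = np.concatenate([s, a])
            err = W @ x - s_next
            W = W - lr * np.outer(err, x)
    return W

new_task_data = [(rng.normal(size=sdim), rng.normal(size=adim),
                  rng.normal(size=sdim)) for _ in range(32)]
W_adapted = adapt(W.copy(), new_task_data)

# Relabel an old transition: replace its next state with the adapted
# model's prediction so past experience reflects the new dynamics.
s, a, _ = new_task_data[0]
print("relabeled next state:", W_adapted @ np.concatenate([s, a]))
```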
arXiv Detail & Related papers (2020-06-12T13:34:46Z)
- Meta Adaptation using Importance Weighted Demonstrations [19.37671674146514]
In some cases, the distribution shifts so much that it is difficult for an agent to infer the new task.
We propose a novel algorithm to generalize to any related task by leveraging prior knowledge from a set of specific tasks.
We show experiments where the robot is trained on a diverse set of environment tasks and is able to adapt to an unseen environment.
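
A hedged sketch of importance-weighting demonstrations: weight each prior-task demonstration by a density ratio between the new and old task distributions. The Gaussian task models below are stand-ins, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_logpdf(x, mu, sigma=1.0):
    return -0.5 * np.sum(((x - mu) / sigma) ** 2)

# Hypothetical task descriptors for the old and new task distributions.
mu_old, mu_new = np.zeros(4), np.full(4, 0.5)
demos = [rng.normal(loc=mu_old, size=4) for _ in range(5)]

# Weight each demonstration by the new/old density ratio, normalized.
weights = np.array([np.exp(gaussian_logpdf(d, mu_new) -
                           gaussian_logpdf(d, mu_old)) for d in demos])
weights /= weights.sum()
print("importance weights over demonstrations:", weights)
```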
arXiv Detail & Related papers (2019-11-23T07:22:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.