A model-based approach to meta-Reinforcement Learning: Transformers and
tree search
- URL: http://arxiv.org/abs/2208.11535v1
- Date: Wed, 24 Aug 2022 13:30:26 GMT
- Title: A model-based approach to meta-Reinforcement Learning: Transformers and
tree search
- Authors: Brieuc Pinon, Jean-Charles Delvenne, and Raphaël Jungers
- Abstract summary: We show the relevance of model-based approaches with online planning to perform exploration and exploitation successfully in meta-RL.
We show the efficiency of the Transformer architecture at learning complex dynamics that arise from latent spaces present in meta-RL problems.
- Score: 1.1602089225841632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meta-learning is a line of research that develops the ability to leverage
past experiences to efficiently solve new learning problems. Meta-Reinforcement
Learning (meta-RL) methods demonstrate a capability to learn behaviors that
efficiently acquire and exploit information in several meta-RL problems.
In this context, the Alchemy benchmark was proposed by Wang et al.
[2021]. Alchemy features a rich structured latent space that is challenging for
state-of-the-art model-free RL methods. These methods fail to learn to properly
explore and then exploit.
We develop a model-based algorithm. We train a model whose principal block is
a Transformer Encoder to fit the symbolic Alchemy environment dynamics. Then we
define an online planner with the learned model using a tree search method.
This algorithm significantly outperforms previously applied model-free RL
methods on the symbolic Alchemy problem.
Our results reveal the relevance of model-based approaches with online
planning for performing exploration and exploitation successfully in meta-RL.
Moreover, we show the efficiency of the Transformer architecture at learning
the complex dynamics that arise from latent spaces present in meta-RL problems.
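To make the approach concrete, below is a minimal, hypothetical sketch of the two components the abstract describes: a Transformer-encoder dynamics model fit to symbolic transitions, and a simple depth-limited tree search that plans online with the learned model. The module names, token encoding, dimensions, and the particular search procedure are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two components described in the abstract: a
# Transformer-encoder dynamics model and an online tree-search planner.
# Names, sizes, the token encoding, and the search itself are illustrative
# assumptions, not the authors' implementation.
import math
import torch
import torch.nn as nn

class TransformerDynamicsModel(nn.Module):
    """Predicts the next symbolic state and reward from a history of
    interleaved (state, action) tokens drawn from a shared symbolic
    vocabulary (a simplifying assumption); conditioning on the whole
    history is what lets the model infer the episode's latent dynamics."""
    def __init__(self, vocab_size, d_model=128, n_heads=4, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.state_head = nn.Linear(d_model, vocab_size)  # next-state logits
        self.reward_head = nn.Linear(d_model, 1)

    def forward(self, tokens):                 # tokens: (batch, time)
        h = self.encoder(self.embed(tokens))   # (batch, time, d_model)
        last = h[:, -1]                        # summary of the full history
        return self.state_head(last), self.reward_head(last)

@torch.no_grad()
def plan(model, history, actions, depth=3):
    """Toy depth-limited search over imagined trajectories: expand every
    action at every node and return the first action of the trajectory
    with the highest predicted return under the learned model."""
    best = {"ret": -math.inf, "action": None}

    def expand(hist, ret, first, d):
        if d == 0:
            if ret > best["ret"]:
                best["ret"], best["action"] = ret, first
            return
        for a in actions:
            logits, reward = model(torch.tensor([hist + [a]]))
            s = logits.argmax(-1).item()       # most likely imagined state
            expand(hist + [a, s], ret + reward.item(),
                   a if first is None else first, d - 1)

    expand(list(history), 0.0, None, depth)
    return best["action"]
```

A practical planner would batch these model calls and prune the tree, but the division of labor is the point of the sketch: the Transformer infers the episode's latent dynamics from the interaction history, and the search uses its predictions to trade off exploration and exploitation at decision time.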
Related papers
- MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning [18.82398325614491]
We propose a new model-based approach to meta-RL, based on elements from existing state-of-the-art model-based and meta-RL methods.
We demonstrate the effectiveness of our approach on common meta-RL benchmark domains, attaining greater return with better sample efficiency.
In addition, we validate our approach on a slate of more challenging, higher-dimensional domains, taking a step towards real-world generalizing agents.
arXiv Detail & Related papers (2024-03-14T20:40:36Z)
- On Task-Relevant Loss Functions in Meta-Reinforcement Learning and Online LQR [9.355903533901023]
We propose a sample-efficient meta-RL algorithm that learns a model of the system or environment at hand in a task-directed manner.
As opposed to standard model-based approaches to meta-RL, our method exploits value information in order to rapidly capture the decision-critical part of the environment.
arXiv Detail & Related papers (2023-12-09T04:52:28Z)
- Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning [58.575939354953526]
PACOH-RL is a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics.
Existing Meta-RL methods require abundant meta-learning data, limiting their applicability in settings such as robotics.
Our experiment results demonstrate that PACOH-RL outperforms model-based RL and model-based Meta-RL baselines in adapting to new dynamic conditions.
arXiv Detail & Related papers (2023-11-13T18:51:57Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- Implicit Offline Reinforcement Learning via Supervised Learning [83.8241505499762]
Offline Reinforcement Learning (RL) via Supervised Learning is a simple and effective way to learn robotic skills from a dataset collected by policies of different expertise levels.
We show how implicit models can leverage return information and match or outperform explicit algorithms to acquire robotic skills from fixed datasets.
arXiv Detail & Related papers (2022-10-21T21:59:42Z)
- Meta Reinforcement Learning for Adaptive Control: An Offline Approach [3.131740922192114]
We formulate a meta reinforcement learning (meta-RL) control strategy that takes advantage of known, offline information for training.
Our meta-RL agent has a recurrent structure that accumulates "context" for its current dynamics through a hidden state variable (see the sketch after this list).
In tests reported here, the meta-RL agent was trained entirely offline, yet produced excellent results in novel settings.
arXiv Detail & Related papers (2022-03-17T23:58:52Z)
- REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using Reinforcement Learning Agents [0.0]
In this paper, we introduce a meta-learning scheme that shifts the objective of learning to solve a task into the objective of learning to learn to solve a task (or a set of tasks).
Our model, named REIN-2, is a meta-learning scheme formulated within the RL framework, the goal of which is to develop a meta-RL agent that learns how to produce other RL agents.
Compared to traditional state-of-the-art deep RL algorithms, experimental results show the remarkable performance of our model in popular OpenAI Gym environments.
arXiv Detail & Related papers (2021-10-11T10:13:49Z)
- Alchemy: A structured task distribution for meta-reinforcement learning [52.75769317355963]
We introduce a new benchmark for meta-RL research, which combines structural richness with structural transparency.
Alchemy is a 3D video game, which involves a latent causal structure that is resampled procedurally from episode to episode.
We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents.
arXiv Detail & Related papers (2021-02-04T23:40:44Z)
- MELD: Meta-Reinforcement Learning from Images via Latent State Models [109.1664295663325]
We develop an algorithm for meta-RL from images that performs inference in a latent state model to quickly acquire new skills.
MELD is the first meta-RL algorithm trained in a real-world robotic control setting from images.
arXiv Detail & Related papers (2020-10-26T23:50:30Z)
- FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization [10.243908145832394]
We study the offline meta-reinforcement learning (OMRL) problem, a paradigm which enables reinforcement learning (RL) algorithms to quickly adapt to unseen tasks.
This problem is still not fully understood, and two major challenges need to be addressed.
We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches.
arXiv Detail & Related papers (2020-10-02T17:13:39Z)
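As referenced in the "Meta Reinforcement Learning for Adaptive Control" entry above, here is a minimal sketch of the recurrent-context pattern that entry describes: a GRU hidden state accumulates the stream of (observation, previous action, previous reward), so the policy can adapt online to the dynamics of the task at hand. The class name, dimensions, and input encoding are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of the recurrent-context pattern: a GRU hidden state
# accumulates (observation, previous action, previous reward) history so
# the policy can infer and adapt to the current task's dynamics. Sizes
# and names are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn

class RecurrentContextPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden_dim=64):
        super().__init__()
        # One transition = (obs, previous action, previous scalar reward).
        self.gru = nn.GRUCell(obs_dim + act_dim + 1, hidden_dim)
        self.policy_head = nn.Linear(hidden_dim, act_dim)

    def initial_state(self, batch_size=1):
        return torch.zeros(batch_size, self.gru.hidden_size)

    def step(self, obs, prev_action, prev_reward, h):
        x = torch.cat([obs, prev_action, prev_reward], dim=-1)
        h = self.gru(x, h)                 # update the accumulated context
        return self.policy_head(h), h      # action logits + new context

# Usage: the hidden state h carries task information across time steps.
policy = RecurrentContextPolicy(obs_dim=8, act_dim=3)
h = policy.initial_state()
obs = torch.zeros(1, 8)
prev_a, prev_r = torch.zeros(1, 3), torch.zeros(1, 1)
logits, h = policy.step(obs, prev_a, prev_r, h)
```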