Procedural generation of meta-reinforcement learning tasks
- URL: http://arxiv.org/abs/2302.05583v2
- Date: Sat, 9 Dec 2023 04:52:58 GMT
- Title: Procedural generation of meta-reinforcement learning tasks
- Authors: Thomas Miconi
- Abstract summary: We describe a parametrized space for simple meta-reinforcement learning (meta-RL) tasks with arbitrary stimuli.
The parametrization is expressive enough to include many well-known meta-RL tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task and others.
We describe a number of randomly generated meta-RL domains of varying complexity and discuss potential issues arising from random generation.
- Score: 1.2328446298523066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open-endedness stands to benefit from the ability to generate an infinite
variety of diverse, challenging environments. One particularly interesting type
of challenge is meta-learning ("learning-to-learn"), a hallmark of intelligent
behavior. However, the number of meta-learning environments in the literature
is limited. Here we describe a parametrized space for simple meta-reinforcement
learning (meta-RL) tasks with arbitrary stimuli. The parametrization allows us
to randomly generate an arbitrary number of novel simple meta-learning tasks.
The parametrization is expressive enough to include many well-known meta-RL
tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task
and others. Simple extensions allow it to capture tasks based on
two-dimensional topological spaces, such as full mazes or find-the-spot
domains. We describe a number of randomly generated meta-RL domains of varying
complexity and discuss potential issues arising from random generation.
Related papers
- AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers [28.927809804613215]
We build upon recent advancements in Transformer-based (in-context) meta-RL.
We evaluate a simple yet scalable solution where both an agent's actor and critic objectives are converted to classification terms.
This design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.
arXiv Detail & Related papers (2024-11-17T22:25:40Z)
- Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [70.96345405979179]
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction.
Variations in task content and complexity pose significant challenges in policy formulation.
We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
arXiv Detail & Related papers (2024-11-02T05:49:14Z)
- MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks [63.016244188951696]
We propose a method for few-shot learning with fewer tasks, which we call MetaModulation.
We modify parameters at various batch levels to increase the meta-training tasks.
We also introduce learning variational feature hierarchies by incorporating variational MetaModulation.
arXiv Detail & Related papers (2023-05-17T15:47:47Z)
- Fully Online Meta-Learning Without Task Boundaries [80.09124768759564]
We study how meta-learning can be applied to tackle online problems of this nature.
We propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground truth knowledge about the task boundaries.
Our experiments show that FOML was able to learn new tasks faster than the state-of-the-art online learning methods.
arXiv Detail & Related papers (2022-02-01T07:51:24Z)
- Meta Automatic Curriculum Learning [35.13646854355393]
We introduce the concept of Meta-ACL, and formalize it in the context of black-box RL learners.
We present AGAIN, a first instantiation of Meta-ACL, and showcase its benefits for curriculum generation over classical ACL.
arXiv Detail & Related papers (2020-11-16T14:56:42Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- Task-similarity Aware Meta-learning through Nonparametric Kernel Regression [8.801367758434335]
This paper investigates the use of nonparametric kernel regression to obtain a task-similarity-aware meta-learning algorithm.
Our hypothesis is that the use of task similarity helps meta-learning when the available tasks are limited and may contain outlier or dissimilar tasks.
arXiv Detail & Related papers (2020-06-12T14:15:11Z)
- Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation [26.296412053816233]
We propose a meta-learning framework for few-shot word sense disambiguation.
The goal is to learn to disambiguate unseen words from only a few labeled instances.
We extend several popular meta-learning approaches to this scenario, and analyze their strengths and weaknesses.
arXiv Detail & Related papers (2020-04-29T17:33:31Z)
- Meta-Learning across Meta-Tasks for Few-Shot Learning [107.44950540552765]
We argue that the inter-meta-task relationships should be exploited and that those tasks should be sampled strategically to assist in meta-learning.
We consider the relationships defined over two types of meta-task pairs and propose different strategies to exploit them.
arXiv Detail & Related papers (2020-02-11T09:25:13Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)