Related papers: Procedural generation of meta-reinforcement learning tasks

Procedural generation of meta-reinforcement learning tasks

URL: http://arxiv.org/abs/2302.05583v2
Date: Sat, 9 Dec 2023 04:52:58 GMT
Title: Procedural generation of meta-reinforcement learning tasks
Authors: Thomas Miconi
Abstract summary: We describe a parametrized space for simple meta-reinforcement learning (meta-RL) tasks with arbitrary stimuli. The parametrization is expressive enough to include many well-known meta-RL tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task and others. We describe a number of randomly generated meta-RL domains of varying complexity and discuss potential issues arising from random generation.
Score: 1.2328446298523066
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Open-endedness stands to benefit from the ability to generate an infinite variety of diverse, challenging environments. One particularly interesting type of challenge is meta-learning ("learning-to-learn"), a hallmark of intelligent behavior. However, the number of meta-learning environments in the literature is limited. Here we describe a parametrized space for simple meta-reinforcement learning (meta-RL) tasks with arbitrary stimuli. The parametrization allows us to randomly generate an arbitrary number of novel simple meta-learning tasks. The parametrization is expressive enough to include many well-known meta-RL tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task and others. Simple extensions allow it to capture tasks based on two-dimensional topological spaces, such as full mazes or find-the-spot domains. We describe a number of randomly generated meta-RL domains of varying complexity and discuss potential issues arising from random generation.

Related papers

AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers [28.927809804613215]
We build upon recent advancements in Transformer-based (in-context) meta-RL. We evaluate a simple yet scalable solution where both an agent's actor and critic objectives are converted to classification terms. This design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.
arXiv Detail & Related papers (2024-11-17T22:25:40Z)
Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [70.96345405979179]
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction. variations in task content and complexity pose significant challenges in policy formulation. We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
arXiv Detail & Related papers (2024-11-02T05:49:14Z)
MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks [63.016244188951696]
We propose a method for few-shot learning with fewer tasks, which is by metaulation. We modify parameters at various batch levels to increase the meta-training tasks. We also introduce learning variational feature hierarchies by incorporating the variationalulation.
arXiv Detail & Related papers (2023-05-17T15:47:47Z)
Fully Online Meta-Learning Without Task Boundaries [80.09124768759564]
We study how meta-learning can be applied to tackle online problems of this nature. We propose a Fully Online Meta-Learning (FOML) algorithm, which does not require any ground truth knowledge about the task boundaries. Our experiments show that FOML was able to learn new tasks faster than the state-of-the-art online learning methods.
arXiv Detail & Related papers (2022-02-01T07:51:24Z)
Meta Automatic Curriculum Learning [35.13646854355393]
We introduce the concept of Meta-ACL, and formalize it in the context of black-box RL learners. We present AGAIN, a first instantiation of Meta-ACL, and showcase its benefits for curriculum generation over classical ACL.
arXiv Detail & Related papers (2020-11-16T14:56:42Z)
Adaptive Task Sampling for Meta-Learning [79.61146834134459]
Key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time. We propose an adaptive task sampling method to improve the generalization performance.
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
Task-similarity Aware Meta-learning through Nonparametric Kernel Regression [8.801367758434335]
This paper investigates the use of nonparametric kernel-regression to obtain a tasksimilarity aware meta-learning algorithm. Our hypothesis is that the use of tasksimilarity helps meta-learning when the available tasks are limited and may contain outlier/ dissimilar tasks.
arXiv Detail & Related papers (2020-06-12T14:15:11Z)
Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation [26.296412053816233]
We propose a meta-learning framework for few-shot word sense disambiguation. The goal is to learn to disambiguate unseen words from only a few labeled instances. We extend several popular meta-learning approaches to this scenario, and analyze their strengths and weaknesses.
arXiv Detail & Related papers (2020-04-29T17:33:31Z)
Meta-Learning across Meta-Tasks for Few-Shot Learning [107.44950540552765]
We argue that the inter-meta-task relationships should be exploited and those tasks are sampled strategically to assist in meta-learning. We consider the relationships defined over two types of meta-task pairs and propose different strategies to exploit them.
arXiv Detail & Related papers (2020-02-11T09:25:13Z)
Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph. Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference. Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.