Meta-Reinforcement Learning in Broad and Non-Parametric Environments
- URL: http://arxiv.org/abs/2108.03718v1
- Date: Sun, 8 Aug 2021 19:32:44 GMT
- Title: Meta-Reinforcement Learning in Broad and Non-Parametric Environments
- Authors: Zhenshan Bing, Lukas Knak, Fabrice Oliver Robin, Kai Huang, Alois Knoll
- Abstract summary: We introduce TIGR, a Task-Inference-based meta-RL algorithm for tasks in non-parametric environments.
We decouple the policy training from the task-inference learning and efficiently train the inference mechanism on the basis of an unsupervised reconstruction objective.
We provide a benchmark with qualitatively distinct tasks based on the half-cheetah environment and demonstrate the superior performance of TIGR compared to state-of-the-art meta-RL approaches.
- Score: 8.091658684517103
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent state-of-the-art artificial agents lack the ability to adapt rapidly
to new tasks, as they are trained exclusively for specific objectives and
require massive amounts of interaction to learn new skills. Meta-reinforcement
learning (meta-RL) addresses this challenge by leveraging knowledge learned
from training tasks to perform well in previously unseen tasks. However,
current meta-RL approaches limit themselves to narrow parametric task
distributions, ignoring qualitative differences between tasks that occur in the
real world. In this paper, we introduce TIGR, a Task-Inference-based meta-RL
algorithm using Gaussian mixture models (GMM) and gated recurrent units (GRU),
designed for tasks in non-parametric environments. We employ a generative model
involving a GMM to capture the multi-modality of the tasks. We decouple the
policy training from the task-inference learning and efficiently train the
inference mechanism on the basis of an unsupervised reconstruction objective.
We provide a benchmark with qualitatively distinct tasks based on the
half-cheetah environment and demonstrate the superior performance of TIGR
compared to state-of-the-art meta-RL approaches in terms of sample efficiency
(3-10 times faster), asymptotic performance, and applicability in
non-parametric environments with zero-shot adaptation.
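For readers who want a concrete picture of the task-inference mechanism sketched above, the following is a minimal illustration of the general idea: a GRU encodes a window of transitions into a Gaussian-mixture posterior over a latent task variable, and the encoder is trained purely on a reward-reconstruction objective, separate from the policy update. All module names, shapes, and hyperparameters here are assumptions for illustration; this is not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): GRU + GMM task inference
# trained with an unsupervised reward-reconstruction objective.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GMMTaskEncoder(nn.Module):
    """Encodes a sequence of transitions into a Gaussian-mixture posterior
    over a latent task variable, capturing multi-modal (non-parametric)
    task distributions."""

    def __init__(self, transition_dim, latent_dim=8, num_components=4, hidden_dim=128):
        super().__init__()
        self.gru = nn.GRU(transition_dim, hidden_dim, batch_first=True)
        # One head producing mixture logits, means, and log-variances.
        self.head = nn.Linear(hidden_dim, num_components * (1 + 2 * latent_dim))
        self.num_components, self.latent_dim = num_components, latent_dim

    def forward(self, transitions):
        # transitions: (batch, time, transition_dim), e.g. concatenated (s, a, r, s').
        _, h = self.gru(transitions)
        out = self.head(h.squeeze(0))
        k, d = self.num_components, self.latent_dim
        logits, mu, log_var = torch.split(out, [k, k * d, k * d], dim=-1)
        return logits, mu.view(-1, k, d), log_var.view(-1, k, d)

    def sample(self, transitions):
        # Pick a mixture component, then reparameterize within it
        # (the component choice itself is not differentiated in this
        # simplified sketch).
        logits, mu, log_var = self(transitions)
        comp = torch.distributions.Categorical(logits=logits).sample()
        idx = comp.view(-1, 1, 1).expand(-1, 1, self.latent_dim)
        mu_k = mu.gather(1, idx).squeeze(1)
        std_k = (0.5 * log_var.gather(1, idx).squeeze(1)).exp()
        return mu_k + std_k * torch.randn_like(std_k)


class RewardDecoder(nn.Module):
    """Predicts rewards from (state, action, latent); the prediction error
    is the unsupervised signal that trains the encoder."""

    def __init__(self, state_dim, action_dim, latent_dim=8, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state, action, z):
        return self.net(torch.cat([state, action, z], dim=-1))


def inference_loss(encoder, decoder, transitions, states, actions, rewards):
    # Reconstruction objective: only encoder/decoder parameters receive
    # gradients here, keeping task inference decoupled from policy training.
    z = encoder.sample(transitions)
    return F.mse_loss(decoder(states, actions, z), rewards)
```

In such a setup the policy would condition on a sampled z as an additional input, while its own update (e.g., an off-policy actor-critic step) never backpropagates into the encoder, mirroring the decoupling of policy training from task-inference learning described in the abstract.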
Related papers
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning [23.45043290237396]
MoSS is a context-based meta-reinforcement learning algorithm built on self-supervised task representation learning.
On MuJoCo and Meta-World benchmarks, MoSS outperforms prior methods in terms of performance, sample efficiency (3-50x faster), adaptation efficiency, and generalization.
arXiv Detail & Related papers (2023-04-29T15:46:19Z)
- Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation [17.165083095799712]
We study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning.
We develop a meta-RL algorithm that enables fast policy adaptation with preference-based feedback.
arXiv Detail & Related papers (2022-11-20T03:55:09Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also quickly adapts to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
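Read loosely (the notation below is assumed for illustration and is not quoted from the paper), an action translator H maps actions of a source-task policy into the target task so that the transferred policy approximately preserves the source policy's value, which is the closeness property referred to above:

```latex
\pi_{i \to j}(s) = H\big(s,\ \pi_i(s)\big),
\qquad
\big|\, V_j^{\pi_{i \to j}} - V_i^{\pi_i} \,\big| \le \epsilon_{ij}
```

Here π_i is the policy trained on source task i, V_j denotes returns when executing in target task j, and ε_{ij} depends on how well H aligns the two tasks.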
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Model-based Adversarial Meta-Reinforcement Learning [38.28304764312512]
We propose Model-based Adversarial Meta-Reinforcement Learning (AdMRL).
AdMRL aims to minimize the worst-case sub-optimality gap across all tasks in a family of tasks.
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in terms of worst-case performance across all tasks.
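As a rough formalization (notation assumed here; the paper's exact objective may differ), the worst-case sub-optimality gap can be written as a minimax problem over a parametrized task family Ψ:

```latex
\min_{\theta} \; \max_{\psi \in \Psi} \;
\Big[ \max_{\pi} V_{\psi}^{\pi} \;-\; V_{\psi}^{\pi_{\theta}(\psi)} \Big]
```

where V_ψ^π is the expected return of policy π on task ψ and π_θ(ψ) is the policy the meta-learner produces after adapting to task ψ.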
arXiv Detail & Related papers (2020-06-16T02:21:49Z)
- MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration [52.48362697163477]
We model an exploration policy learning problem for meta-RL that is separated from exploitation policy learning.
We develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies.
Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on sparse-reward tasks.
arXiv Detail & Related papers (2020-06-15T06:56:18Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.