Contextual Latent World Models for Offline Meta Reinforcement Learning
- URL: http://arxiv.org/abs/2603.02935v1
- Date: Tue, 03 Mar 2026 12:45:20 GMT
- Title: Contextual Latent World Models for Offline Meta Reinforcement Learning
- Authors: Mohammadreza Nakheai, Aidan Scannell, Kevin Luck, Joni Pajarinen
- Abstract summary: We introduce contextual latent world models, which condition latent world models on inferred task representations and train them jointly with the context encoder. This enforces task-conditioned temporal consistency, yielding task representations that capture task-dependent dynamics. Our method learns more expressive task representations and significantly improves generalization to unseen tasks across MuJoCo, Contextual-DeepMind Control, and Meta-World benchmarks.
- Score: 17.917947576971816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Offline meta-reinforcement learning seeks to learn policies that generalize across related tasks from fixed datasets. Context-based methods infer a task representation from transition histories, but learning effective task representations without supervision remains a challenge. In parallel, latent world models have demonstrated strong self-supervised representation learning through temporal consistency. We introduce contextual latent world models, which condition latent world models on inferred task representations and train them jointly with the context encoder. This enforces task-conditioned temporal consistency, yielding task representations that capture task-dependent dynamics rather than merely discriminating between tasks. Our method learns more expressive task representations and significantly improves generalization to unseen tasks across MuJoCo, Contextual-DeepMind Control, and Meta-World benchmarks.
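The abstract's core idea, a context encoder trained jointly with a latent dynamics model via a task-conditioned temporal-consistency objective, can be illustrated with a minimal numpy sketch. All names, dimensions, and the linear/tanh parameterization here are hypothetical stand-ins; the paper's actual architecture and losses are not specified in this listing.

```python
import numpy as np

rng = np.random.default_rng(0)

def context_encoder(transitions, W_ctx):
    # Mean-pool per-transition features into a permutation-invariant
    # task representation z, as context-based methods typically do.
    feats = transitions @ W_ctx            # (T, d_z)
    return np.tanh(feats.mean(axis=0))     # (d_z,)

def latent_dynamics(h, a, z, W_dyn):
    # Predict the next latent state, conditioned on the action a AND the
    # inferred task context z -- the "contextual" part of the world model.
    x = np.concatenate([h, a, z])
    return np.tanh(W_dyn @ x)

def consistency_loss(h_seq, a_seq, z, W_dyn):
    # Task-conditioned temporal consistency: latents predicted by the
    # context-conditioned dynamics should match the encoded next latents.
    total = 0.0
    for t in range(len(a_seq)):
        pred = latent_dynamics(h_seq[t], a_seq[t], z, W_dyn)
        total += np.sum((pred - h_seq[t + 1]) ** 2)
    return total / len(a_seq)

# Toy dimensions (hypothetical): latent, action, context, transition feature.
d_h, d_a, d_z, T = 8, 2, 4, 5
d_tr = d_h + d_a + d_h + 1                 # (s, a, s', r) feature size
W_ctx = rng.normal(size=(d_tr, d_z)) * 0.1
W_dyn = rng.normal(size=(d_h, d_h + d_a + d_z)) * 0.1

transitions = rng.normal(size=(T, d_tr))   # offline context transitions
h_seq = rng.normal(size=(T + 1, d_h))      # encoded latent states
a_seq = rng.normal(size=(T, d_a))

z = context_encoder(transitions, W_ctx)
loss = consistency_loss(h_seq, a_seq, z, W_dyn)
print(z.shape, float(loss) >= 0.0)
```

Minimizing such a loss jointly over the encoder and dynamics parameters would push z to carry whatever task information the dynamics need, rather than merely discriminating between tasks.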
Related papers
- VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning [68.98988753763666]
We propose VisualCloze, a universal image generation framework. VisualCloze supports a wide range of in-domain tasks, generalization to unseen ones, unseen unification of multiple tasks, and reverse generation. We introduce Graph200K, a graph-structured dataset that establishes various interrelated tasks, enhancing task density and transferable knowledge.
arXiv Detail & Related papers (2025-04-10T17:59:42Z) - Learning Task Representations from In-Context Learning [67.66042137487287]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL). We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. The proposed method successfully extracts task-specific information from in-context demonstrations and excels in both text and regression tasks.
arXiv Detail & Related papers (2025-02-08T00:16:44Z) - Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks [4.374837991804085]
Task-Aware Virtual Training (TAVT) is a novel algorithm that captures task characteristics for both training and out-of-distribution (OOD) scenarios. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments.
arXiv Detail & Related papers (2025-02-05T02:31:50Z) - Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning [12.443661471796595]
Offline meta-reinforcement learning aims to equip agents with the ability to rapidly adapt to new tasks by training on data from a set of different tasks. Context-based approaches utilize a history of state-action-reward transitions to infer representations of the current task, and then condition the agent, i.e., the policy and value function, on the task representations. Unfortunately, context-based approaches suffer from distribution mismatch, as the context in the offline data does not match the context at test time.
arXiv Detail & Related papers (2024-12-19T13:24:01Z) - Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies. Our findings are synthesized in Flex (Fly lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors. We demonstrate the effectiveness of this approach on a quadrotor fly-to-target task, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - Unsupervised Meta-Learning via In-Context Learning [3.4165401459803335]
We propose a novel approach to unsupervised meta-learning that leverages the generalization abilities of in-context learning. Our method reframes meta-learning as a sequence modeling problem, enabling the transformer encoder to learn task context from support images.
arXiv Detail & Related papers (2024-05-25T08:29:46Z) - One-shot Imitation in a Non-Stationary Environment via Multi-Modal Skill [6.294766893350108]
We present a skill-based imitation learning framework enabling one-shot imitation and zero-shot adaptation.
We leverage a vision-language model to learn a semantic skill set from offline video datasets.
We evaluate our framework with various one-shot imitation scenarios for extended multi-stage Meta-world tasks.
arXiv Detail & Related papers (2024-02-13T11:01:52Z) - Task Aware Dreamer for Task Generalization in Reinforcement Learning [31.364276322513447]
We show that training a general world model can utilize similar structures in tasks and help train more generalizable agents. We introduce a novel method named Task Aware Dreamer (TAD), which integrates reward-informed features to identify latent characteristics across tasks. Experiments in both image-based and state-based tasks show that TAD can significantly improve the performance of handling different tasks simultaneously.
arXiv Detail & Related papers (2023-03-09T08:04:16Z) - Unsupervised Task Graph Generation from Instructional Video Transcripts [53.54435048879365]
We consider a setting where text transcripts of instructional videos performing a real-world activity are provided.
The goal is to identify the key steps relevant to the task as well as the dependency relationship between these key steps.
We propose a novel task graph generation approach that combines the reasoning capabilities of instruction-tuned language models along with clustering and ranking components.
arXiv Detail & Related papers (2023-02-17T22:50:08Z) - Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared to all the tasks, the mid-level divided to different groups, and the top-level assigned to each of the tasks.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks.
arXiv Detail & Related papers (2022-08-19T02:46:20Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)