Procedural Generalization by Planning with Self-Supervised World Models
- URL: http://arxiv.org/abs/2111.01587v1
- Date: Tue, 2 Nov 2021 13:32:21 GMT
- Title: Procedural Generalization by Planning with Self-Supervised World Models
- Authors: Ankesh Anand, Jacob Walker, Yazhe Li, Eszter Vértes, Julian
Schrittwieser, Sherjil Ozair, Théophane Weber, Jessica B. Hamrick
- Abstract summary: We measure the generalization ability of model-based agents in comparison to their model-free counterparts.
We identify three factors of procedural generalization -- planning, self-supervised representation learning, and procedural data diversity.
We find that these factors do not always provide the same benefits for task generalization.
- Score: 10.119257232716834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the key promises of model-based reinforcement learning is the ability
to generalize using an internal model of the world to make predictions in novel
environments and tasks. However, the generalization ability of model-based
agents is not well understood because existing work has focused on model-free
agents when benchmarking generalization. Here, we explicitly measure the
generalization ability of model-based agents in comparison to their model-free
counterparts. We focus our analysis on MuZero (Schrittwieser et al., 2020), a
powerful model-based agent, and evaluate its performance on both procedural and
task generalization. We identify three factors of procedural generalization --
planning, self-supervised representation learning, and procedural data
diversity -- and show that by combining these techniques, we achieve
state-of-the-art generalization performance and data efficiency on Procgen
(Cobbe et al., 2019). However, we find that these factors do not always provide
the same benefits for the task generalization benchmarks in Meta-World (Yu et
al., 2019), indicating that transfer remains a challenge and may require
different approaches than procedural generalization. Overall, we suggest that
building generalizable agents requires moving beyond the single-task,
model-free paradigm and towards self-supervised model-based agents that are
trained in rich, procedural, multi-task environments.
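The abstract's core recipe combines a learned latent world model (as in MuZero) with self-supervised representation learning. One common instantiation of such an auxiliary objective is a latent-consistency loss: unroll the learned dynamics for several steps and penalize disagreement with the encoder's latents of the actually observed future frames. The sketch below is a minimal, hypothetical illustration of that idea using plain NumPy with random linear maps standing in for trained networks; the function names (`encode`, `dynamics`), dimensions, and the cosine-similarity form of the loss are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; a real agent would use trained neural
# networks for the encoder and dynamics functions.
OBS_DIM, LATENT_DIM, K = 8, 4, 3

W_enc = rng.normal(size=(OBS_DIM, LATENT_DIM)) * 0.1     # encoder h(o) -> z
W_dyn = rng.normal(size=(LATENT_DIM, LATENT_DIM)) * 0.1  # dynamics g(z) -> z'

def encode(obs):
    """Map an observation to a latent state (stand-in for a learned encoder)."""
    return np.tanh(obs @ W_enc)

def dynamics(z):
    """Predict the next latent state (stand-in for learned latent dynamics)."""
    return np.tanh(z @ W_dyn)

def consistency_loss(obs_seq):
    """Self-supervised auxiliary loss: unroll the latent dynamics for K steps
    and penalize disagreement (negative cosine similarity) with the encoder's
    latents of the observed future frames. Zero when predictions match."""
    z = encode(obs_seq[0])
    loss = 0.0
    for k in range(1, K + 1):
        z = dynamics(z)                # predicted latent after k steps
        target = encode(obs_seq[k])    # latent of the real future observation
        cos = z @ target / (np.linalg.norm(z) * np.linalg.norm(target) + 1e-8)
        loss += 1.0 - cos              # per-step consistency penalty in [0, 2]
    return loss / K

obs_seq = rng.normal(size=(K + 1, OBS_DIM))
print(round(consistency_loss(obs_seq), 4))
```

In a full agent this term would be added to the usual reward, value, and policy losses, giving the representation a training signal that does not depend on reward alone.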
Related papers
- Capabilities Ain't All You Need: Measuring Propensities in AI [32.960519634809145]
We introduce the first formal framework for measuring AI propensities by using a bilogistic formulation for model success. We find that we can measure how much the propensity is shifted and what effect this has on the tasks. We obtain stronger predictive power when combining propensities and capabilities than either separately.
arXiv Detail & Related papers (2026-02-20T12:40:18Z) - The Hierarchy of Agentic Capabilities: Evaluating Frontier Models on Realistic RL Environments [0.11586753333439907]
We present an empirical study evaluating frontier AI models on 150 workplace tasks within a realistic e-commerce RL environment from Surge. Our analysis reveals an empirically-derived hierarchy of agentic capabilities that models must master for real-world deployment. Weaker models struggle with fundamental tool use and planning, whereas stronger models primarily fail on tasks requiring contextual inference beyond explicit instructions.
arXiv Detail & Related papers (2026-01-13T23:49:06Z) - Robust Finetuning of Vision-Language-Action Robot Policies via Parameter Merging [53.41119829581115]
Generalist robot policies, trained on large and diverse datasets, have demonstrated the ability to generalize. They still fall short on new tasks not covered in the training data. We develop a method that preserves the generalization capabilities of the generalist policy during finetuning.
arXiv Detail & Related papers (2025-12-09T08:02:11Z) - What Do LLM Agents Do When Left Alone? Evidence of Spontaneous Meta-Cognitive Patterns [27.126691338850254]
We introduce an architecture for studying the behavior of large language model (LLM) agents in the absence of externally imposed tasks. Our continuous reason and act framework, using persistent memory and self-feedback, enables sustained autonomous operation.
arXiv Detail & Related papers (2025-09-25T14:29:49Z) - Generalizability of Large Language Model-Based Agents: A Comprehensive Survey [32.40919143404769]
Large Language Model (LLM)-based agents are increasingly deployed in diverse domains like web navigation and household robotics. Despite growing interest, the concept of generalizability in LLM-based agents remains underdefined. This survey aims to establish a foundation for principled research on building LLM-based agents that generalize reliably across diverse applications.
arXiv Detail & Related papers (2025-09-19T18:13:32Z) - OMGPT: A Sequence Modeling Framework for Data-driven Operational Decision Making [5.419799294989289]
We build a Generative Pre-trained Transformer (GPT) model to solve sequential decision making tasks. We first propose a general sequence modeling framework to cover several operational decision making tasks. We then train a transformer-based neural network model (OMGPT) as a natural and powerful architecture for sequential modeling.
arXiv Detail & Related papers (2025-05-19T15:33:03Z) - PEER pressure: Model-to-Model Regularization for Single Source Domain Generalization [12.15086255236961]
We show that the performance of such augmentation-based methods in the target domains universally fluctuates during training. We propose a novel generalization method, coined Space Ensemble with Entropy Regularization (PEER), that uses a proxy model to learn the augmented data.
arXiv Detail & Related papers (2025-05-19T06:01:11Z) - The Science of Evaluating Foundation Models [46.973855710909746]
This work focuses on three key aspects: (1) Formalizing the Evaluation Process by providing a structured framework tailored to specific use-case contexts; (2) Offering Actionable Tools and Frameworks such as checklists and templates to ensure thorough, reproducible, and practical evaluations; and (3) Surveying Recent Work with a targeted review of advancements in LLM evaluation, emphasizing real-world applications.
arXiv Detail & Related papers (2025-02-12T22:55:43Z) - On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z) - Toward Universal and Interpretable World Models for Open-ended Learning Agents [0.0]
We introduce a generic, compositional and interpretable class of generative world models that supports open-ended learning agents.
This is a sparse class of Bayesian networks capable of approximating a broad range of processes, which provide agents with the ability to learn world models in a manner that may be both interpretable and computationally scalable.
arXiv Detail & Related papers (2024-09-27T12:03:15Z) - Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks [50.75902473813379]
This work introduces a comprehensive evaluation framework that systematically examines the role of instructions and inputs in the generalisation abilities of such models.
The proposed framework uncovers the resilience of multimodal models to extreme instruction perturbations and their vulnerability to observational changes.
arXiv Detail & Related papers (2024-07-04T14:36:49Z) - Building Socially-Equitable Public Models [32.35090986784889]
Public models offer predictions to a variety of downstream tasks and have played a crucial role in various AI applications.
We advocate for integrating the objectives of downstream agents into the optimization process.
We propose a novel Equitable Objective to address performance disparities and foster fairness among heterogeneous agents in training.
arXiv Detail & Related papers (2024-06-04T21:27:43Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional
MoEs [63.936622239286685]
We find that interference among different tasks and modalities is the main factor to this phenomenon.
We introduce the Conditional Mixture-of-Experts (Conditional MoEs) to generalist models.
Code and pre-trained generalist models shall be released.
arXiv Detail & Related papers (2022-06-09T17:59:59Z) - SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark
for Semantic and Generative Capabilities [76.97949110580703]
We introduce SUPERB-SG, a new benchmark to evaluate pre-trained models across various speech tasks.
We use a lightweight methodology to test the robustness of representations learned by pre-trained models under shifts in data domain.
We also show that the task diversity of SUPERB-SG coupled with limited task supervision is an effective recipe for evaluating the generalizability of model representation.
arXiv Detail & Related papers (2022-03-14T04:26:40Z) - Leveraging Approximate Symbolic Models for Reinforcement Learning via
Skill Diversity [32.35693772984721]
We introduce Symbolic-Model Guided Reinforcement Learning, wherein we formalize the relationship between the symbolic model and the underlying MDP.
We use these models to extract high-level landmarks that decompose the task.
At the low level, we learn a set of diverse policies for each possible task sub-goal identified by the landmark.
arXiv Detail & Related papers (2022-02-06T23:20:30Z) - A Self-Supervised Framework for Function Learning and Extrapolation [1.9374999427973014]
We present a framework for how a learner may acquire representations that support generalization.
We show the resulting representations outperform those from other models for unsupervised time series learning.
arXiv Detail & Related papers (2021-06-14T12:41:03Z) - Robustness to Augmentations as a Generalization metric [0.0]
Generalization is the ability of a model to predict on unseen domains.
We propose a method to predict the generalization performance of a model, based on the observation that models robust to augmentations generalize better than those that are not.
The proposed method was the first runner up solution for the NeurIPS competition on Predicting Generalization in Deep Learning.
arXiv Detail & Related papers (2021-01-16T15:36:38Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.