One-Shot Learning from a Demonstration with Hierarchical Latent Language
- URL: http://arxiv.org/abs/2203.04806v1
- Date: Wed, 9 Mar 2022 15:36:43 GMT
- Title: One-Shot Learning from a Demonstration with Hierarchical Latent Language
- Authors: Nathaniel Weir and Xingdi Yuan and Marc-Alexandre Côté and Matthew Hausknecht and Romain Laroche and Ida Momennejad and Harm Van Seijen and Benjamin Van Durme
- Abstract summary: We introduce DescribeWorld, an environment designed to test this sort of generalization skill in grounded agents.
The agent observes a single task demonstration in a Minecraft-like grid world, and is then asked to carry out the same task in a new map.
We find that agents that perform text-based inference are better equipped for the challenge under a random split of tasks.
- Score: 43.140223608960554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans have the capability, aided by the expressive compositionality of their
language, to learn quickly by demonstration. They are able to describe unseen
task-performing procedures and generalize their execution to other contexts. In
this work, we introduce DescribeWorld, an environment designed to test this
sort of generalization skill in grounded agents, where tasks are linguistically
and procedurally composed of elementary concepts. The agent observes a single
task demonstration in a Minecraft-like grid world, and is then asked to carry
out the same task in a new map. To enable such a level of generalization, we
propose a neural agent infused with hierarchical latent language--both at the
level of task inference and subtask planning. Our agent first generates a
textual description of the demonstrated unseen task, then leverages this
description to replicate it. Through multiple evaluation scenarios and a suite
of generalization tests, we find that agents that perform text-based inference
are better equipped for the challenge under a random split of tasks.
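The abstract's two-level use of latent language can be read as a describe-then-execute loop. The following Python sketch shows only that control flow; the class names (TaskDescriber, SubtaskPlanner, LowLevelPolicy) and the stubbed outputs are illustrative assumptions, not the paper's actual model or interfaces.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins for the paper's demonstration format and grid-world API.
@dataclass
class Demonstration:
    frames: List[str]   # observations from the demonstrated episode
    actions: List[str]  # actions taken by the demonstrator

class TaskDescriber:
    """Level 1: infer a textual description of the demonstrated task."""
    def describe(self, demo: Demonstration) -> str:
        # A trained model would decode the description; this is a stub.
        return "collect wood, then craft a pickaxe"

class SubtaskPlanner:
    """Level 2a: generate the next subtask instruction in natural language."""
    def next_subtask(self, task_description: str, done: List[str]) -> Optional[str]:
        plan = ["chop tree", "pick up wood", "use crafting table"]  # stub plan
        return plan[len(done)] if len(done) < len(plan) else None

class LowLevelPolicy:
    """Level 2b: map a subtask instruction and an observation to primitive actions."""
    def act(self, subtask: str, observation: str) -> List[str]:
        return ["move_forward", "interact"]  # placeholder action sequence

def run_episode(demo: Demonstration, new_map_observation: str) -> List[str]:
    description = TaskDescriber().describe(demo)        # latent language: task level
    planner, policy = SubtaskPlanner(), LowLevelPolicy()
    executed, done = [], []
    while (subtask := planner.next_subtask(description, done)) is not None:
        executed += policy.act(subtask, new_map_observation)  # latent language: subtask level
        done.append(subtask)
    return executed

if __name__ == "__main__":
    demo = Demonstration(frames=["f0", "f1"], actions=["move_forward", "interact"])
    print(run_episode(demo, new_map_observation="o0"))
```

The design choice the abstract highlights is that the textual description acts as an explicit bottleneck: the executor conditions on the inferred description rather than on the raw demonstration when carrying out the task on the new map.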
Related papers
- tagE: Enabling an Embodied Agent to Understand Human Instructions [3.943519623674811]
We introduce a novel system known as task and argument grounding for Embodied agents (tagE)
At its core, our system employs an inventive neural network model designed to extract a series of tasks from complex task instructions expressed in natural language.
Our proposed model adopts an encoder-decoder framework enriched with nested decoding to effectively extract tasks and their corresponding arguments from these intricate instructions.
arXiv Detail & Related papers (2023-10-24T08:17:48Z)
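The nested decoding described in the tagE entry above, first recovering a sequence of tasks and then the arguments of each task, can be pictured with a small rule-based Python sketch. The lexicon and splitting heuristics below are stand-ins for tagE's trained encoder-decoder and are purely illustrative.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GroundedTask:
    task: str             # e.g. "pick", "place"
    arguments: List[str]  # e.g. ["red cup"], ["kitchen table"]

# Toy lexicon standing in for the learned model; purely illustrative.
TASK_VERBS = {"pick": "pick", "take": "pick", "place": "place", "put": "place", "go": "goto"}
STOPWORDS = {"the", "a", "an", "up", "on", "to", "it"}  # anaphora is not resolved here

def extract(instruction: str) -> List[GroundedTask]:
    """Outer decode: split the instruction into task clauses.
    Inner (nested) decode: recover each clause's argument span."""
    results = []
    for clause in instruction.lower().replace(" and then ", " and ").split(" and "):
        tokens = clause.strip().split()
        verb_idx = next((i for i, t in enumerate(tokens) if t in TASK_VERBS), None)
        if verb_idx is None:
            continue
        verb = TASK_VERBS[tokens[verb_idx]]
        argument = " ".join(w for w in tokens[verb_idx + 1:] if w not in STOPWORDS)
        results.append(GroundedTask(task=verb, arguments=[argument]))
    return results

if __name__ == "__main__":
    print(extract("Pick up the red cup and then place it on the kitchen table"))
    # -> [GroundedTask(task='pick', arguments=['red cup']),
    #     GroundedTask(task='place', arguments=['kitchen table'])]
```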
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
- Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared across all tasks, the mid-level divided into different task groups, and the top level assigned to each individual task.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks.
arXiv Detail & Related papers (2022-08-19T02:46:20Z)
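The coarse-to-fine sharing pattern described in the entry above, a bottom level shared by all tasks, a mid level shared within task groups, and a top level specific to each task, can be sketched in a few lines of PyTorch. The tasks, grouping, encoder, and layer sizes below are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Illustrative task-to-group assignment; the real grouping strategy is not
# specified in the summary above.
TASK_GROUPS = {"sentiment": "classification", "nli": "classification", "ner": "tagging"}
HID = 64

class CoarseToFineModel(nn.Module):
    def __init__(self, vocab_size: int = 1000, num_labels: int = 3):
        super().__init__()
        # Bottom level: shared by all tasks.
        self.shared = nn.Sequential(nn.Embedding(vocab_size, HID),
                                    nn.Linear(HID, HID), nn.ReLU())
        # Mid level: one module per task group.
        self.group_layers = nn.ModuleDict({g: nn.Linear(HID, HID)
                                           for g in set(TASK_GROUPS.values())})
        # Top level: one head per task.
        self.task_heads = nn.ModuleDict({t: nn.Linear(HID, num_labels)
                                         for t in TASK_GROUPS})

    def forward(self, token_ids: torch.Tensor, task: str) -> torch.Tensor:
        h = self.shared(token_ids).mean(dim=1)                  # pooled shared features
        h = torch.relu(self.group_layers[TASK_GROUPS[task]](h))  # group-specific layer
        return self.task_heads[task](h)                          # task-specific head

if __name__ == "__main__":
    model = CoarseToFineModel()
    logits = model(torch.randint(0, 1000, (2, 7)), task="nli")
    print(logits.shape)  # torch.Size([2, 3])
```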
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
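A subtask graph of the kind MTSGI infers can be represented as a precondition map and executed by repeatedly picking a subtask whose preconditions are all satisfied. The graph below is hand-written purely for illustration; in the paper the structure is inferred from the training tasks rather than specified by hand.

```python
from typing import Dict, List, Set

# Hypothetical subtask graph: each subtask maps to the subtasks that must be
# completed first (its preconditions).
SubtaskGraph = Dict[str, Set[str]]

CRAFTING_TASK: SubtaskGraph = {
    "get wood":      set(),
    "make planks":   {"get wood"},
    "make stick":    {"make planks"},
    "craft pickaxe": {"make planks", "make stick"},
}

def eligible(graph: SubtaskGraph, done: Set[str]) -> List[str]:
    """Subtasks whose preconditions are satisfied and that are not yet done."""
    return [s for s, pre in graph.items() if s not in done and pre <= done]

def execute(graph: SubtaskGraph) -> List[str]:
    """Greedy topological execution of the subtask graph."""
    done: Set[str] = set()
    order: List[str] = []
    while len(done) < len(graph):
        ready = eligible(graph, done)
        if not ready:  # cycle or unsatisfiable precondition
            raise ValueError("no executable subtask")
        nxt = ready[0]  # a smarter selection rule could choose here
        done.add(nxt)
        order.append(nxt)
    return order

if __name__ == "__main__":
    print(execute(CRAFTING_TASK))
    # ['get wood', 'make planks', 'make stick', 'craft pickaxe']
```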
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z)
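The generative model described in the Skill Induction and Planning with Latent Language entry above, in which goals generate high-level subtask descriptions and descriptions in turn generate actions, can be caricatured as two nested sampling steps. The hand-written tables below stand in for the learned conditionals and are assumptions for illustration only.

```python
import random
from typing import List

# Illustrative generative structure: goal -> subtask descriptions -> actions.
SUBTASK_MODEL = {  # stand-in for p(subtask sequence | goal)
    "make dinner": [["chop vegetables", "boil water", "cook pasta"],
                    ["preheat oven", "bake casserole"]],
}
ACTION_MODEL = {   # stand-in for p(action sequence | subtask description)
    "chop vegetables": ["take knife", "cut carrot"],
    "boil water":      ["fill pot", "turn on stove"],
    "cook pasta":      ["add pasta", "stir", "drain"],
    "preheat oven":    ["set temperature"],
    "bake casserole":  ["insert dish", "wait"],
}

def sample_episode(goal: str, rng: random.Random) -> List[str]:
    """Sample actions by first sampling latent subtask descriptions, then
    expanding each description through the skill it names."""
    subtasks = rng.choice(SUBTASK_MODEL[goal])     # latent language plan
    actions: List[str] = []
    for description in subtasks:
        actions.extend(ACTION_MODEL[description])  # skill indexed by its name
    return actions

if __name__ == "__main__":
    print(sample_episode("make dinner", random.Random(0)))
```

Because the space of natural language commands indexes the skill library, planning for a novel goal amounts to generating a new instruction sequence and handing each instruction to its skill.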
- Visual-and-Language Navigation: A Survey and Taxonomy [1.0742675209112622]
This paper provides a comprehensive survey on Visual-and-Language Navigation (VLN) tasks.
According to when the instructions are given, the tasks can be divided into single-turn and multi-turn.
This taxonomy enables researchers to better grasp the key points of a specific task and identify directions for future research.
arXiv Detail & Related papers (2021-08-26T01:51:18Z)
- Zero-shot Task Adaptation using Natural Language [43.807555235240365]
We propose a novel setting where an agent is given both a demonstration and a description.
Our approach is able to complete more than 95% of target tasks when using template-based descriptions.
arXiv Detail & Related papers (2021-06-05T21:39:04Z)
- Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning [32.82030512053361]
We propose the use of step-by-step human demonstrations in the form of natural language instructions and action trajectories.
We find that human demonstrations help solve the most complex tasks.
We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting.
arXiv Detail & Related papers (2020-11-01T14:39:46Z)
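The step-by-step supervision described in the Ask Your Humans entry, natural language instructions paired with action trajectories, suggests a simple record format. The dataclasses and field names below are hypothetical, meant only to show how such demonstrations could be flattened into (instruction, action) pairs for an instruction-conditioned policy.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical record format: each step of a human demonstration pairs a
# natural-language instruction with the action trajectory that realizes it.
@dataclass
class AnnotatedStep:
    instruction: str    # e.g. "mine the iron ore"
    actions: List[str]  # the low-level actions the human took

@dataclass
class AnnotatedDemo:
    task_name: str
    steps: List[AnnotatedStep]

def to_training_pairs(demo: AnnotatedDemo) -> List[Tuple[str, str]]:
    """Flatten a demonstration into (instruction, action) supervision pairs,
    the kind of data an instruction-conditioned policy could be cloned from."""
    return [(step.instruction, action) for step in demo.steps for action in step.actions]

if __name__ == "__main__":
    demo = AnnotatedDemo(
        task_name="make iron pickaxe",
        steps=[AnnotatedStep("mine the iron ore", ["move_forward", "mine"]),
               AnnotatedStep("smelt the ore", ["go_to_furnace", "use_furnace"])],
    )
    for pair in to_training_pairs(demo):
        print(pair)
```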
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.