Learning to Follow Language Instructions with Compositional Policies
- URL: http://arxiv.org/abs/2110.04647v1
- Date: Sat, 9 Oct 2021 21:28:26 GMT
- Title: Learning to Follow Language Instructions with Compositional Policies
- Authors: Vanya Cohen, Geraud Nangue Tasse, Nakul Gopalan, Steven James, Matthew
Gombolay, Benjamin Rosman
- Abstract summary: We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks.
We train a reinforcement learning agent to learn value functions that can be subsequently composed through a Boolean algebra.
We fine-tune a seq2seq model pretrained on web-scale corpora to map language to logical expressions.
- Score: 22.778677208048475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a framework that learns to execute natural language instructions
in an environment consisting of goal-reaching tasks that share components of
their task descriptions. Our approach leverages the compositionality of both
value functions and language, with the aim of reducing the sample complexity of
learning novel tasks. First, we train a reinforcement learning agent to learn
value functions that can be subsequently composed through a Boolean algebra to
solve novel tasks. Second, we fine-tune a seq2seq model pretrained on web-scale
corpora to map language to logical expressions that specify the required value
function compositions. Evaluating our agent in the BabyAI domain, we observe a
decrease of 86% in the number of training steps needed to learn a second task
after mastering a single task. Results from ablation studies further indicate
that it is the combination of compositional value functions and language
representations that allows the agent to quickly generalize to new tasks.
Related papers
- Did You Read the Instructions? Rethinking the Effectiveness of Task
Definitions in Instruction Learning [74.70157466822612]
We systematically study the role of task definitions in instruction learning.
We find that model performance drops substantially when removing contents describing the task output.
We propose two strategies to help models better leverage task instructions.
arXiv Detail & Related papers (2023-06-01T21:11:24Z) - Language-guided Task Adaptation for Imitation Learning [40.1007184209417]
We introduce a novel setting, wherein an agent needs to learn a task from a demonstration of a related task with the difference between the tasks communicated in natural language.
The proposed setting allows reusing demonstrations from other tasks, by providing low effort language descriptions, and can also be used to provide feedback to correct agent errors.
arXiv Detail & Related papers (2023-01-24T00:56:43Z) - Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language
Understanding [51.31622274823167]
We propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared to all the tasks, the mid-level divided to different groups, and the top-level assigned to each of the tasks.
This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks.
arXiv Detail & Related papers (2022-08-19T02:46:20Z) - Compositional Generalization in Grounded Language Learning via Induced
Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z) - Grad2Task: Improved Few-shot Text Classification Using Gradients for
Task Representation [24.488427641442694]
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches.
arXiv Detail & Related papers (2022-01-27T15:29:30Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z) - A Boolean Task Algebra for Reinforcement Learning [14.731788603429774]
We formalise the logical composition of tasks as a Boolean algebra.
We show that by learning goal-oriented value functions, an agent can solve new tasks with no further learning.
arXiv Detail & Related papers (2020-01-06T04:46:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.