Human Instruction-Following with Deep Reinforcement Learning via
Transfer-Learning from Text
- URL: http://arxiv.org/abs/2005.09382v1
- Date: Tue, 19 May 2020 12:16:58 GMT
- Title: Human Instruction-Following with Deep Reinforcement Learning via
Transfer-Learning from Text
- Authors: Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Abstract summary: Recent work has described neural-network-based agents that are trained with reinforcement learning to execute language-like commands in simulated worlds.
We propose a conceptually simple method for training instruction-following agents with deep RL that are robust to natural human instructions.
- Score: 12.88819706338837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work has described neural-network-based agents that are trained with
reinforcement learning (RL) to execute language-like commands in simulated
worlds, as a step towards an intelligent agent or robot that can be instructed
by human users. However, the optimisation of multi-goal motor policies via deep
RL from scratch requires many episodes of experience. Consequently,
instruction-following with deep RL typically involves language generated from
templates (by an environment simulator), which does not reflect the varied or
ambiguous expressions of real users. Here, we propose a conceptually simple
method for training instruction-following agents with deep RL that are robust
to natural human instructions. By applying our method with a state-of-the-art
pre-trained text-based language model (BERT), on tasks requiring agents to
identify and position everyday objects relative to other objects in a
naturalistic 3D simulated room, we demonstrate substantially-above-chance
zero-shot transfer from synthetic template commands to natural instructions
given by humans. Our approach is a general recipe for training any deep
RL-based system to interface with human users, and bridges the gap between two
research directions of notable recent success: agent-centric motor behavior and
text-based representation learning.
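The transfer idea in the abstract can be illustrated with a toy sketch: a frozen text encoder is shared between training on template commands and evaluation on unseen human phrasings. Everything below is an assumption made for illustration and is not the authors' implementation. The hand-written `PRETRAINED` table stands in for BERT, the nearest-prototype lookup in `select_goal` stands in for the RL-trained policy, and the goal names are hypothetical.

```python
import math

# Hypothetical stand-in for a pretrained text encoder such as BERT:
# a fixed word-vector table in which synonyms already lie close together.
# The vectors are invented for illustration and are never updated.
PRETRAINED = {
    "put": (1.0, 0.0, 0.0, 0.0), "place": (0.9, 0.1, 0.0, 0.0),
    "lift": (0.0, 1.0, 0.0, 0.0), "raise": (0.1, 0.9, 0.0, 0.0),
    "mug": (0.0, 0.0, 1.0, 0.0), "cup": (0.0, 0.0, 0.9, 0.1),
    "pen": (0.0, 0.0, 0.0, 1.0), "pencil": (0.0, 0.0, 0.1, 0.9),
}


def encode(instruction):
    """Frozen encoder: mean of the known word vectors (unknown words are skipped)."""
    vecs = [PRETRAINED[w] for w in instruction.lower().split() if w in PRETRAINED]
    if not vecs:
        raise ValueError("no known words in instruction")
    return tuple(sum(column) / len(vecs) for column in zip(*vecs))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# "Training" only ever sees template commands, each paired with a goal id.
TEMPLATES = {
    "put the mug": "PUT_MUG",
    "put the pen": "PUT_PEN",
    "lift the mug": "LIFT_MUG",
    "lift the pen": "LIFT_PEN",
}
prototypes = {goal: encode(text) for text, goal in TEMPLATES.items()}


def select_goal(instruction):
    """Zero-shot step: encode an unseen phrasing and pick the closest trained goal."""
    embedding = encode(instruction)
    return max(prototypes, key=lambda goal: cosine(embedding, prototypes[goal]))


print(select_goal("please place the cup"))
print(select_goal("raise the pencil"))
```

Neither test phrase appears among the templates. Any transfer here comes entirely from the similarity structure already present in the frozen encoder, which is the role the abstract assigns to the pre-trained language model.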
Related papers
- Symbolic Learning Enables Self-Evolving Agents [55.625275970720374]
We introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves on their own.
Agent symbolic learning is designed to optimize the symbolic network within language agents by mimicking two fundamental algorithms in connectionist learning.
We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks.
arXiv Detail & Related papers (2024-06-26T17:59:18Z)
- Interpretable Robotic Manipulation from Language [11.207620790833271]
We introduce an explainable behavior cloning agent, named Ex-PERACT, specifically designed for manipulation tasks.
At the top level, the model is tasked with learning a discrete skill code, while at the bottom level, the policy network translates the problem into a voxelized grid and maps the discretized actions to voxel grids.
We evaluate our method across eight challenging manipulation tasks utilizing the RLBench benchmark, demonstrating that Ex-PERACT not only achieves competitive policy performance but also effectively bridges the gap between human instructions and machine execution in complex environments.
arXiv Detail & Related papers (2024-05-27T11:02:21Z)
- Policy Learning with a Language Bottleneck [65.99843627646018]
Policy Learning with a Language Bottleneck (PLLBB) is a framework enabling AI agents to generate linguistic rules.
PLLBB alternates between a rule generation step guided by language models, and an update step where agents learn new policies guided by rules.
In a two-player communication game, a maze solving task, and two image reconstruction tasks, we show that PLLBB agents are not only able to learn more interpretable and generalizable behaviors, but can also share the learned rules with human users.
arXiv Detail & Related papers (2024-05-07T08:40:21Z)
- Vision-Language Models Provide Promptable Representations for Reinforcement Learning [67.40524195671479]
We propose a novel approach that uses the vast amounts of general and indexable world knowledge encoded in vision-language models (VLMs) pre-trained on Internet-scale data for embodied reinforcement learning (RL).
We show that our approach can use chain-of-thought prompting to produce representations of common-sense semantic reasoning, improving policy performance in novel scenes by 1.5 times.
arXiv Detail & Related papers (2024-02-05T00:48:56Z)
- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control [140.48218261864153]
We study how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control.
Our approach leads to performant robotic policies and enables RT-2 to obtain a range of emergent capabilities from Internet-scale training.
arXiv Detail & Related papers (2023-07-28T21:18:02Z)
- Towards A Unified Agent with Foundation Models [18.558328028366816]
We investigate how to embed and leverage such abilities in Reinforcement Learning (RL) agents.
We design a framework that uses language as the core reasoning tool, exploring how this enables an agent to tackle a series of fundamental RL challenges.
We demonstrate substantial performance improvements over baselines in exploration efficiency and ability to reuse data from offline datasets.
arXiv Detail & Related papers (2023-07-18T22:37:30Z)
- Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines [27.79626958016208]
We show that agents trained by meta-learning may acquire very different strategies from humans.
We show that co-training these agents on predicting representations from natural language task descriptions and from programs induced to generate such tasks guides them toward human-like inductive biases.
arXiv Detail & Related papers (2022-05-23T18:17:58Z)
- Deep Reinforcement Learning with Interactive Feedback in a Human-Robot Environment [1.2998475032187096]
We propose a deep reinforcement learning approach with interactive feedback to learn a domestic task in a human-robot scenario.
We compare three different learning methods using a simulated robotic arm for the task of organizing different objects.
The obtained results show that a learner agent, using either agent-IDeepRL or human-IDeepRL, completes the given task earlier and has fewer mistakes compared to the autonomous DeepRL approach.
arXiv Detail & Related papers (2020-07-07T11:55:27Z)
- RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real [74.45688231140689]
We introduce the RL-scene consistency loss for image translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
We obtain RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning.
arXiv Detail & Related papers (2020-06-16T08:58:07Z)
- On the interaction between supervision and self-play in emergent communication [82.290338507106]
We investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency.
We find that first training agents via supervised learning on human data followed by self-play outperforms the converse.
arXiv Detail & Related papers (2020-02-04T02:35:19Z)