Inverse Reinforcement Learning with Natural Language Goals
- URL: http://arxiv.org/abs/2008.06924v3
- Date: Wed, 16 Dec 2020 04:40:17 GMT
- Title: Inverse Reinforcement Learning with Natural Language Goals
- Authors: Li Zhou and Kevin Small
- Abstract summary: We propose a novel inverse reinforcement learning algorithm to learn a language-conditioned policy and reward function.
Our algorithm outperforms multiple baselines by a large margin on a vision-based natural language instruction following dataset.
- Score: 8.972202854038382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans generally use natural language to communicate task requirements to
each other. Ideally, natural language should also be usable for communicating
goals to autonomous machines (e.g., robots) to minimize friction in task
specification. However, understanding and mapping natural language goals to
sequences of states and actions is challenging. Specifically, existing work
along these lines has encountered difficulty in generalizing learned policies
to new natural language goals and environments. In this paper, we propose a
novel adversarial inverse reinforcement learning algorithm to learn a
language-conditioned policy and reward function. To improve generalization of
the learned policy and reward function, we use a variational goal generator to
relabel trajectories and sample diverse goals during training. Our algorithm
outperforms multiple baselines by a large margin on a vision-based natural
language instruction following dataset (Room-2-Room), demonstrating a promising
advance in enabling the use of natural language instructions in specifying
agent goals.
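The abstract describes two mechanisms: an adversarial inverse-RL setup that learns a language-conditioned policy and reward, and a variational goal generator used to relabel stored trajectories with diverse sampled goals. The sketch below is a minimal, illustrative rendering of those two pieces in PyTorch; the module names, layer sizes, and the goal-embedding interface are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed architecture, not the paper's exact code) of:
# (1) a variational goal generator q(z | trajectory) with a decoder that
#     produces a goal embedding, used to relabel trajectories with sampled goals;
# (2) a language-conditioned, AIRL-style discriminator whose logit serves as
#     the learned reward signal.
import torch
import torch.nn as nn


class VariationalGoalGenerator(nn.Module):
    """Encode a trajectory into a latent z, decode z into a goal embedding."""

    def __init__(self, state_dim: int, goal_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.GRU(state_dim, 64, batch_first=True)
        self.to_mu = nn.Linear(64, latent_dim)
        self.to_logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, goal_dim))

    def forward(self, traj_states: torch.Tensor):
        # traj_states: (batch, time, state_dim)
        _, h = self.encoder(traj_states)                 # h: (1, batch, 64)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.decoder(z), mu, logvar


class LanguageConditionedDiscriminator(nn.Module):
    """D(state, action, goal); its logit is used as a reward, AIRL-style."""

    def __init__(self, state_dim: int, action_dim: int, goal_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + goal_dim, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def reward(self, state, action, goal_embedding):
        return self.net(torch.cat([state, action, goal_embedding], dim=-1))


def relabel(goal_gen: VariationalGoalGenerator, traj_states: torch.Tensor):
    """Sample a fresh goal embedding for a stored trajectory, so the policy
    and reward are trained on more diverse (trajectory, goal) pairs."""
    with torch.no_grad():
        new_goal, _, _ = goal_gen(traj_states)
    return new_goal


if __name__ == "__main__":
    S, A, G = 16, 4, 8                                  # illustrative dims
    goal_gen = VariationalGoalGenerator(S, G)
    disc = LanguageConditionedDiscriminator(S, A, G)
    traj = torch.randn(2, 10, S)                        # 2 trajectories, 10 steps
    relabeled_goal = relabel(goal_gen, traj)            # (2, G)
    r = disc.reward(traj[:, 0], torch.randn(2, A), relabeled_goal)
    print(r.shape)                                      # torch.Size([2, 1])
```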
Related papers
- Policy Learning with a Language Bottleneck [65.99843627646018]
Policy Learning with a Language Bottleneck (PLLB) is a framework enabling AI agents to generate linguistic rules.
PLLB alternates between a rule generation step guided by language models, and an update step where agents learn new policies guided by the rules.
In a two-player communication game, a maze-solving task, and two image reconstruction tasks, we show that PLLB agents are not only able to learn more interpretable and generalizable behaviors, but can also share the learned rules with human users.
arXiv Detail & Related papers (2024-05-07T08:40:21Z) - Goal Representations for Instruction Following: A Semi-Supervised
Language Interface to Control [58.06223121654735]
We show a method that taps into joint image- and goal-conditioned policies with language using only a small amount of language data.
Our method achieves robust performance in the real world by learning an embedding from the labeled data that aligns language not with the goal image itself, but with the desired change between the start and goal images.
We show instruction following across a variety of manipulation tasks in different scenes, with generalization to language instructions outside of the labeled data.
arXiv Detail & Related papers (2023-06-30T20:09:39Z) - ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous
States in Realistic 3D Scenes [72.83187997344406]
ARNOLD is a benchmark that evaluates language-grounded task learning with continuous states in realistic 3D scenes.
ARNOLD comprises 8 language-conditioned tasks that involve understanding object states and learning policies for continuous goals.
arXiv Detail & Related papers (2023-04-09T21:42:57Z) - PADL: Language-Directed Physics-Based Character Control [66.517142635815]
We present PADL, which allows users to issue natural language commands for specifying high-level tasks and low-level skills that a character should perform.
We show that our framework can be applied to effectively direct a simulated humanoid character to perform a diverse array of complex motor skills.
arXiv Detail & Related papers (2023-01-31T18:59:22Z) - Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings.
We demonstrate that this framework enables effective generalization across different environments.
For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z) - CALVIN: A Benchmark for Language-conditioned Policy Learning for
Long-horizon Robot Manipulation Tasks [30.936692970187416]
General-purpose robots must learn to relate human language to their perceptions and actions.
We present CALVIN, an open-source simulated benchmark to learn long-horizon language-conditioned tasks.
arXiv Detail & Related papers (2021-12-06T18:37:33Z) - Learning Language-Conditioned Robot Behavior from Offline Data and
Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction.
We propose to leverage offline robot datasets with crowd-sourced natural language labels.
We find that our approach outperforms both goal-image specifications and language-conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z) - Ask Your Humans: Using Human Instructions to Improve Generalization in
Reinforcement Learning [32.82030512053361]
We propose the use of step-by-step human demonstrations in the form of natural language instructions and action trajectories.
We find that human demonstrations help solve the most complex tasks.
We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting.
arXiv Detail & Related papers (2020-11-01T14:39:46Z) - PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping
Pixels to Rewards [40.1007184209417]
We propose a model that maps pixels to rewards, given a free-form natural language description of the task.
Experiments on the Meta-World robot manipulation domain show that language-based rewards significantly improve the sample efficiency of policy learning.
arXiv Detail & Related papers (2020-07-30T15:50:38Z) - Language-Conditioned Goal Generation: a New Approach to Language
Grounding for RL [23.327749767424567]
In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world.
This paper proposes using language to condition goal generators. Given any goal-conditioned policy, one could train a language-conditioned goal generator to generate language-agnostic goals for the agent.
arXiv Detail & Related papers (2020-06-12T09:54:38Z) - Language Conditioned Imitation Learning over Unstructured Data [9.69886122332044]
We present a method for incorporating free-form natural language conditioning into imitation learning.
Our approach learns perception from pixels, natural language understanding, and multitask continuous control end-to-end as a single neural network.
We show this dramatically improves language-conditioned performance, while reducing the cost of language annotation to less than 1% of total data.
arXiv Detail & Related papers (2020-05-15T17:08:50Z)