Feudal Reinforcement Learning by Reading Manuals
- URL: http://arxiv.org/abs/2110.06477v1
- Date: Wed, 13 Oct 2021 03:50:15 GMT
- Title: Feudal Reinforcement Learning by Reading Manuals
- Authors: Kai Wang, Zhonghao Wang, Mo Yu, Humphrey Shi
- Abstract summary: We present a Feudal Reinforcement Learning model consisting of a manager agent and a worker agent.
Our model effectively alleviates the mismatch between text-level inference and low-level perceptions and actions.
- Score: 23.19226806839748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reading to act is a prevalent but challenging task which requires the ability
to reason from a concise instruction. However, previous works face a semantic
mismatch between low-level actions and high-level language descriptions,
and require a human-designed curriculum to work properly. In this paper, we
present a Feudal Reinforcement Learning (FRL) model consisting of a manager
agent and a worker agent. The manager agent is a multi-hop plan generator
dealing with high-level abstract information and generating a series of
sub-goals in a backward manner. The worker agent deals with the low-level
perceptions and actions to achieve the sub-goals one by one. Compared with
previous works, our FRL model effectively alleviates the mismatch between
text-level inference and low-level perceptions and actions, and generalizes to
various forms of environments, instructions and manuals. Our multi-hop plan
generator can significantly boost performance on challenging tasks where
multi-step reasoning from the texts is critical to resolving the instructed
goals. We show that our approach
achieves competitive performance on two challenging tasks, Read to Fight
Monsters (RTFM) and Messenger, without human-designed curriculum learning.
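The manager/worker split described in the abstract can be illustrated with a minimal sketch. This is a toy, rule-based stand-in and not the authors' model, in which both agents are learned. The `MANUAL` dictionary, its contents, and the functions `manager_plan` and `worker_execute` are hypothetical names invented for this illustration. The sketch only shows the control flow: the manager chains backward from the final goal, one manual lookup per hop, and the worker then handles the resulting sub-goals one by one.

```python
from typing import Dict, List

# Hypothetical toy manual (an assumption, not from the paper):
# each entry maps a goal to the sub-goal that must be achieved first.
MANUAL: Dict[str, str] = {
    "defeat the fire monster": "hold the cold weapon",
    "hold the cold weapon": "reach the cold weapon",
}


def manager_plan(goal: str, manual: Dict[str, str], max_hops: int = 5) -> List[str]:
    """Chain backward from the final goal and return sub-goals in execution order."""
    chain = [goal]
    for _ in range(max_hops):
        prerequisite = manual.get(chain[-1])
        # Stop when the manual has no further prerequisite or a cycle appears.
        if prerequisite is None or prerequisite in chain:
            break
        chain.append(prerequisite)
    # The chain was built goal-first, so reverse it for execution.
    return list(reversed(chain))


def worker_execute(subgoals: List[str]) -> List[str]:
    """Stand-in for a low-level policy: visits each sub-goal in order and logs it."""
    return [f"step {i}: {subgoal}" for i, subgoal in enumerate(subgoals, start=1)]


plan = manager_plan("defeat the fire monster", MANUAL)
trace = worker_execute(plan)
for line in trace:
    print(line)
```

In this sketch the number of hops is bounded by `max_hops`, a hypothetical parameter; a learned plan generator would instead have to decide how far back to reason from the text itself.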
Related papers
- Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents [16.24662355253529]
Large Language Models (LLMs) can address sequential decision-making tasks through the provision of high-level instructions.
LLMs lack specialization in tackling specific target problems, particularly in real-time dynamic environments.
We introduce a novel framework that addresses these challenges by training a smaller, specialized student RL agent using instructions from an LLM-based teacher agent.
arXiv Detail & Related papers (2023-11-22T13:15:42Z)
- Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning [49.92517970237088]
We tackle the problem of training a robot to understand multimodal prompts.
This type of task poses a major challenge to robots' capability to understand the interconnection and complementarity between vision and language signals.
We introduce an effective framework that learns a policy to perform robot manipulation with multimodal prompts.
arXiv Detail & Related papers (2023-10-14T22:24:58Z)
- LARG, Language-based Automatic Reward and Goal Generation [8.404316955848602]
We develop an approach that converts a text-based task description into its corresponding reward and goal-generation functions.
We evaluate our approach for robotic manipulation and demonstrate its ability to train and execute policies in a scalable manner.
arXiv Detail & Related papers (2023-06-19T14:52:39Z)
- Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents [99.17668730578586]
Pre-trained large language models (LLMs) capture procedural knowledge about the world.
The Plan, Eliminate, and Track (PET) framework translates a task description into a list of high-level sub-tasks.
The PET framework leads to a significant 15% improvement over SOTA for generalization to human goal specifications.
arXiv Detail & Related papers (2023-05-03T20:11:22Z)
- Collaborating with language models for embodied reasoning [30.82976922056617]
Reasoning in a complex and ambiguous environment is a key goal for Reinforcement Learning (RL) agents.
We present a set of tasks that require reasoning, test this system's ability to generalize zero-shot and investigate failure cases.
arXiv Detail & Related papers (2023-02-01T21:26:32Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
- Multitasking Inhibits Semantic Drift [46.71462510028727]
We study the dynamics of learning in latent language policies (LLPs).
LLPs can solve challenging long-horizon reinforcement learning problems.
Previous work has found that LLP training is prone to semantic drift.
arXiv Detail & Related papers (2021-04-15T03:42:17Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.