Environment Generation for Zero-Shot Compositional Reinforcement
Learning
- URL: http://arxiv.org/abs/2201.08896v1
- Date: Fri, 21 Jan 2022 21:35:01 GMT
- Title: Environment Generation for Zero-Shot Compositional Reinforcement
Learning
- Authors: Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj
Tiwari, Honglak Lee, Aleksandra Faust
- Abstract summary: Compositional Design of Environments (CoDE) trains a Generator agent to automatically build a series of compositional tasks tailored to the agent's current skill level.
We learn to generate environments composed of multiple pages or rooms, and train RL agents capable of completing a wide range of complex tasks in those environments.
CoDE yields a 4x higher success rate than the strongest baseline, and demonstrates strong performance on real websites learned from 3500 primitive tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world problems are compositional - solving them requires completing
interdependent sub-tasks, either in series or in parallel, that can be
represented as a dependency graph. Deep reinforcement learning (RL) agents
often struggle to learn such complex tasks due to the long time horizons and
sparse rewards. To address this problem, we present Compositional Design of
Environments (CoDE), which trains a Generator agent to automatically build a
series of compositional tasks tailored to the RL agent's current skill level.
This automatic curriculum not only enables the agent to learn more complex
tasks than it could have otherwise, but also selects tasks where the agent's
performance is weak, enhancing its robustness and ability to generalize
zero-shot to unseen tasks at test-time. We analyze why current environment
generation techniques are insufficient for the problem of generating
compositional tasks, and propose a new algorithm that addresses these issues.
Our results assess learning and generalization across multiple compositional
tasks, including the real-world problem of learning to navigate and interact
with web pages. We learn to generate environments composed of multiple pages or
rooms, and train RL agents capable of completing a wide range of complex tasks in
those environments. We contribute two new benchmark frameworks for generating
compositional tasks, compositional MiniGrid and gMiniWoB for web navigation.
CoDE yields a 4x higher success rate than the strongest baseline, and
demonstrates strong performance on real websites learned from 3500 primitive
tasks.
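The abstract describes compositional tasks as dependency graphs of interdependent sub-tasks completed in series or in parallel. As a minimal illustrative sketch (not the CoDE implementation; the task names are hypothetical), such a task can be modeled as a mapping from each sub-task to its prerequisites, with success defined as completing every sub-task after its dependencies:

```python
# Sketch of a compositional task as a dependency graph of sub-tasks.
# Illustrative only; this is not the paper's implementation.

def valid_completion_order(deps, order):
    """deps maps each sub-task to the set of sub-tasks it depends on.
    Return True if `order` completes every sub-task after its prerequisites."""
    done = set()
    for task in order:
        if not deps.get(task, set()) <= done:
            return False  # a prerequisite has not been completed yet
        done.add(task)
    return done == set(deps)  # every sub-task must be completed

# Toy web-navigation task: log in, then fill two independent form
# fields (in either order), then submit.
deps = {
    "login": set(),
    "fill_name": {"login"},
    "fill_email": {"login"},
    "submit": {"fill_name", "fill_email"},
}
```

Under this representation, the two form-filling sub-tasks can be interleaved freely, which is what makes the space of valid solutions (and of generatable task variants) grow combinatorially.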
Related papers
- TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and
Agent Generation [45.028795422801764]
We propose a multi-agent framework based on dynamic Task Decomposition and Agent Generation (TDAG)
This framework dynamically decomposes complex tasks into smaller subtasks and assigns each to a specifically generated subagent.
ItineraryBench is designed to assess agents' abilities in memory, planning, and tool usage across tasks of varying complexity.
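The TDAG summary describes decomposing a complex task into subtasks and generating a dedicated subagent for each. A toy sketch of that control flow (hypothetical names; not the TDAG implementation) might look like:

```python
# Sketch of dynamic task decomposition with per-subtask agents.
# Illustrative only; names and the decomposition are hypothetical.

def decompose(task):
    """Toy decomposer: split a travel-planning task into ordered subtasks."""
    return [f"{task}: book transport",
            f"{task}: reserve hotel",
            f"{task}: plan itinerary"]

def spawn_subagent(subtask):
    """Return a closure acting as a subagent specialized for one subtask."""
    def run():
        return f"done({subtask})"
    return run

def solve(task):
    # Decompose, generate one subagent per subtask, and run them in order.
    return [spawn_subagent(st)() for st in decompose(task)]
```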
arXiv Detail & Related papers (2024-02-15T18:27:37Z)
- Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
First, we propose a new benchmark suite aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- One Network Fits All? Modular versus Monolithic Task Formulations in
Neural Networks [36.07011014271394]
We show that a single neural network is capable of simultaneously learning multiple tasks from a combined data set.
We study how the complexity of learning such combined tasks grows with the complexity of the task codes.
arXiv Detail & Related papers (2021-03-29T01:16:42Z)
- Adversarial Environment Generation for Learning to Navigate the Web [107.99759923626242]
One of the bottlenecks of training web navigation agents is providing a learnable curriculum of training environments.
We propose using Adversarial Environment Generation (AEG) to generate challenging web environments in which to train reinforcement learning (RL) agents.
We show that the navigator agent trained with our proposed Flexible b-PAIRED technique significantly outperforms competitive automatic curriculum generation baselines.
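PAIRED-style adversarial environment generation, which this entry builds on, rewards the generator by the regret between a strong "antagonist" agent and the learning "protagonist" on the generated environment. A minimal sketch of that reward signal (an assumption based on the PAIRED family of methods, not the Flexible b-PAIRED implementation):

```python
# Sketch of the regret-based generator reward in PAIRED-style adversarial
# environment generation. Illustrative only.

def generator_reward(antagonist_returns, protagonist_returns):
    """Regret: gap between the best antagonist episode return and the
    protagonist's mean return on the same generated environment."""
    regret = max(antagonist_returns) - sum(protagonist_returns) / len(protagonist_returns)
    # Clip at zero: environments that are solvable (antagonist succeeds)
    # but hard for the learner score highest; unsolvable ones do not.
    return max(regret, 0.0)
```

Maximizing this signal steers the generator toward environments at the frontier of the learner's ability, which is what produces the curriculum.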
arXiv Detail & Related papers (2021-03-02T19:19:30Z)
- Meta Automatic Curriculum Learning [35.13646854355393]
We introduce the concept of Meta-ACL, and formalize it in the context of black-box RL learners.
We present AGAIN, a first instantiation of Meta-ACL, and showcase its benefits for curriculum generation over classical ACL.
arXiv Detail & Related papers (2020-11-16T14:56:42Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and
Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.