A Boolean Task Algebra for Reinforcement Learning
- URL: http://arxiv.org/abs/2001.01394v2
- Date: Thu, 15 Oct 2020 17:45:49 GMT
- Title: A Boolean Task Algebra for Reinforcement Learning
- Authors: Geraud Nangue Tasse, Steven James, Benjamin Rosman
- Abstract summary: We formalise the logical composition of tasks as a Boolean algebra.
We show that by learning goal-oriented value functions, an agent can solve new tasks with no further learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to compose learned skills to solve new tasks is an important
property of lifelong-learning agents. In this work, we formalise the logical
composition of tasks as a Boolean algebra. This allows us to formulate new
tasks in terms of the negation, disjunction and conjunction of a set of base
tasks. We then show that by learning goal-oriented value functions and
restricting the transition dynamics of the tasks, an agent can solve these new
tasks with no further learning. We prove that by composing these value
functions in specific ways, we immediately recover the optimal policies for all
tasks expressible under the Boolean algebra. We verify our approach in two
domains, including a high-dimensional video game environment requiring
function approximation, where an agent first learns a set of base skills, and
then composes them to solve a super-exponential number of new tasks.
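The composition described in the abstract can be illustrated with a minimal sketch. The abstract does not give the operators, so the following are assumptions: disjunction is taken as an elementwise max, conjunction as an elementwise min, and negation as reflecting a value function between the value functions of a "reward every goal" task and a "reward no goal" task. The table layout keyed by `(state, goal, action)`, the helper `toy_q`, and all numbers are hypothetical and serve only to show the idea on a one-state toy problem.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical indexing for a goal-oriented value table: (state, goal, action).
Key = Tuple[int, int, int]
QTable = Dict[Key, float]

GOALS = (0, 1, 2)
ACTIONS = (0, 1, 2)  # in this toy, action a moves the agent to goal a


def toy_q(rewarded: Set[int]) -> QTable:
    """Toy value table for one state (state 0); the values are made up.

    Reaching goal g with the matching action is worth 1.0 if the task
    rewards g and 0.0 otherwise; a mismatched action is worth -1.0.
    """
    return {
        (0, g, a): ((1.0 if g in rewarded else 0.0) if a == g else -1.0)
        for g in GOALS
        for a in ACTIONS
    }


def q_or(q1: QTable, q2: QTable) -> QTable:
    """Assumed disjunction: elementwise maximum."""
    return {k: max(q1[k], q2[k]) for k in q1}


def q_and(q1: QTable, q2: QTable) -> QTable:
    """Assumed conjunction: elementwise minimum."""
    return {k: min(q1[k], q2[k]) for k in q1}


def q_not(q: QTable, q_max: QTable, q_min: QTable) -> QTable:
    """Assumed negation: reflect q between the two bounding tables."""
    return {k: (q_max[k] + q_min[k]) - q[k] for k in q}


def greedy_action(q: QTable, state: int,
                  goals: Iterable[int], actions: Iterable[int]) -> int:
    """Pick the action whose value is highest over all goals."""
    return max((q[(state, g, a)], a) for g in goals for a in actions)[1]


# Bounding tasks: every goal rewarded, and no goal rewarded.
Q_MAX = toy_q({0, 1, 2})
Q_MIN = toy_q(set())

# Two "learned" base tasks (here simply constructed, not learned).
Q_A = toy_q({0, 1})
Q_B = toy_q({1, 2})

# New tasks obtained without further learning.
Q_A_AND_B = q_and(Q_A, Q_B)
Q_A_AND_NOT_B = q_and(Q_A, q_not(Q_B, Q_MAX, Q_MIN))

act_both = greedy_action(Q_A_AND_B, 0, GOALS, ACTIONS)
act_only_a = greedy_action(Q_A_AND_NOT_B, 0, GOALS, ACTIONS)
```

In this toy, the composed table for "A and B" matches the table built directly for the goals rewarded by both tasks, and the greedy action for "A and not B" heads to the goal that only task A rewards. A real agent would learn `Q_A` and `Q_B` rather than construct them.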
Related papers
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- ConTinTin: Continual Learning from Task Instructions [101.36836925135091]
This work defines a new learning paradigm, ConTinTin, in which a system learns a sequence of new tasks one by one, each explained by a piece of textual instruction.
To our knowledge, this is the first study of ConTinTin in NLP.
arXiv Detail & Related papers (2022-03-16T10:27:18Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- Learning to Follow Language Instructions with Compositional Policies [22.778677208048475]
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks.
We train a reinforcement learning agent to learn value functions that can be subsequently composed through a Boolean algebra.
We fine-tune a seq2seq model pretrained on web-scale corpora to map language to logical expressions.
arXiv Detail & Related papers (2021-10-09T21:28:26Z)
- Efficient and robust multi-task learning in the brain with modular task primitives [2.6166087473624318]
We show that a modular network endowed with task primitives allows for learning multiple tasks well while keeping parameter counts, and updates, low.
We also show that the skills acquired with our approach are more robust to a broad range of perturbations compared to those acquired with other multi-task learning strategies.
arXiv Detail & Related papers (2021-05-28T21:07:54Z)
- Latent Skill Planning for Exploration and Transfer [49.25525932162891]
In this paper, we investigate how these two approaches can be integrated into a single reinforcement learning agent.
We leverage the idea of partial amortization for fast adaptation at test time.
We demonstrate the benefits of our design decisions across a suite of challenging locomotion tasks.
arXiv Detail & Related papers (2020-11-27T18:40:03Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve the learning performance on each of the other tasks.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Transforming task representations to perform novel tasks [12.008469282323492]
An important aspect of intelligence is the ability to adapt to a novel task without any direct experience (zero-shot).
We propose a general computational framework for adapting to novel tasks based on their relationship to prior tasks.
arXiv Detail & Related papers (2020-05-08T23:41:57Z)