Modular Lifelong Reinforcement Learning via Neural Composition
- URL: http://arxiv.org/abs/2207.00429v1
- Date: Fri, 1 Jul 2022 13:48:29 GMT
- Title: Modular Lifelong Reinforcement Learning via Neural Composition
- Authors: Jorge A. Mendez and Harm van Seijen and Eric Eaton
- Abstract summary: Humans commonly solve complex problems by decomposing them into easier subproblems and then combining the subproblem solutions.
This type of compositional reasoning permits reuse of the subproblem solutions when tackling future tasks that share part of the underlying compositional structure.
In a continual or lifelong reinforcement learning (RL) setting, this ability to decompose knowledge into reusable components would enable agents to quickly learn new RL tasks.
- Score: 31.561979764372886
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Humans commonly solve complex problems by decomposing them into easier
subproblems and then combining the subproblem solutions. This type of
compositional reasoning permits reuse of the subproblem solutions when tackling
future tasks that share part of the underlying compositional structure. In a
continual or lifelong reinforcement learning (RL) setting, this ability to
decompose knowledge into reusable components would enable agents to quickly
learn new RL tasks by leveraging accumulated compositional structures. We
explore a particular form of composition based on neural modules and present a
set of RL problems that intuitively admit compositional solutions. Empirically,
we demonstrate that neural composition indeed captures the underlying structure
of this space of problems. We further propose a compositional lifelong RL
method that leverages accumulated neural components to accelerate the learning
of future tasks while retaining performance on previous tasks via off-line RL
over replayed experiences.
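The paper's method is not reproduced here, but the core idea of chaining reusable neural modules can be sketched in a few lines. Everything below (the module shapes, the `Module` and `CompositionalPolicy` classes, and the per-task `order` lists) is illustrative and hypothetical, not the authors' implementation: a shared library of frozen blocks is accumulated over a lifetime, and each task's policy is just an ordered chain over that library.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class Module:
    """A small frozen neural block: x -> relu(W x + b)."""
    def __init__(self, dim):
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.b = np.zeros(dim)
    def __call__(self, x):
        return relu(self.W @ x + self.b)

class CompositionalPolicy:
    """A task-specific policy = an ordered chain of shared modules."""
    def __init__(self, library, order):
        self.library = library
        self.order = order          # indices into the shared library
    def __call__(self, obs):
        h = obs
        for i in self.order:
            h = self.library[i](h)
        return h

# Shared library accumulated over a lifetime of tasks.
library = [Module(4) for _ in range(3)]

# Task A chains modules 0 -> 2; task B reuses module 0 and adds module 1.
policy_a = CompositionalPolicy(library, order=[0, 2])
policy_b = CompositionalPolicy(library, order=[0, 1])

obs = np.ones(4)
# Only the combination (`order`) is task-specific; in a lifelong setting
# the modules themselves would be updated slowly, e.g. via offline RL
# over replayed experiences, to retain performance on earlier tasks.
out_a, out_b = policy_a(obs), policy_b(obs)
```

The design point this sketch illustrates is the separation of concerns: a new task only has to discover a good `order`, which is a far smaller search space than retraining a monolithic network.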
Related papers
- Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning [1.8399318639816038]
We propose prioritized soft Q-decomposition (PSQD) for learning and adapting subtask solutions under lexicographic priorities.
PSQD offers the ability to reuse previously learned subtask solutions in a zero-shot composition, followed by an adaptation step.
We demonstrate the efficacy of our approach by presenting successful learning, reuse, and adaptation results for both low- and high-dimensional simulated robot control tasks.
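The zero-shot lexicographic composition described above can be sketched with a generic (not PSQD-specific) rule: restrict to actions that are near-optimal for the high-priority subtask, then maximize the low-priority subtask over that restricted set. The Q-values, the tolerance `eps`, and the function name below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_actions = 6
# Pretend these are learned Q-values at one state for two subtasks,
# with subtask 1 holding strict (lexicographic) priority over subtask 2.
q1 = rng.random(n_actions)   # high-priority subtask
q2 = rng.random(n_actions)   # low-priority subtask

def lexicographic_action(q_high, q_low, eps=0.1):
    """Zero-shot lexicographic composition: keep only actions within
    eps of optimal for the high-priority subtask, then pick the one
    that is best for the low-priority subtask."""
    allowed = np.flatnonzero(q_high >= q_high.max() - eps)
    return allowed[np.argmax(q_low[allowed])]

a = lexicographic_action(q1, q2)
```

This reuses both subtask solutions without any further training; an adaptation step, as in the paper, would then fine-tune the lower-priority solution inside the constraint set.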
arXiv Detail & Related papers (2023-10-03T18:36:21Z) - Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models [68.18370230899102]
We investigate how to elicit compositional generalization capabilities in large language models (LLMs)
We find that demonstrating both foundational skills and compositional examples grounded in these skills within the same prompt context is crucial.
We show that fine-tuning LLMs with Skills-in-Context (SKiC)-style data can elicit zero-shot weak-to-strong generalization.
arXiv Detail & Related papers (2023-08-01T05:54:12Z) - Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z) - Lifelong Reinforcement Learning with Modulating Masks [16.24639836636365]
Lifelong learning aims to create AI systems that continuously and incrementally learn during a lifetime, similar to biological learning.
Attempts so far have met problems, including catastrophic forgetting, interference among tasks, and the inability to exploit previous knowledge.
We show that lifelong reinforcement learning with modulating masks is a promising approach to lifelong learning, to composing knowledge in order to learn increasingly complex tasks, and to reusing knowledge for more efficient and faster learning.
arXiv Detail & Related papers (2022-12-21T15:49:20Z) - Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning [3.058685580689605]
We develop a general framework for reward shaping and task composition in entropy-regularized RL.
We show how the derived relation leads to a general result for reward shaping in entropy-regularized RL.
We then generalize this approach to derive an exact relation connecting optimal value functions for the composition of multiple tasks in entropy-regularized RL.
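The exact relation derived in the paper is not reproduced here, but one well-known property of entropy-regularized composition can be checked numerically on a toy MDP: with discrete actions, the optimal soft value of a summed reward is bounded above by the sum of the individual optimal soft values, because the (nonnegative) entropy bonus is counted once for the composed task but twice in the sum. The MDP, temperature, and iteration count below are arbitrary choices for illustration.

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.9, alpha=1.0, iters=500):
    """Soft (entropy-regularized) value iteration.
    P: transitions, shape (S, A, S); r: rewards, shape (S, A)."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r + gamma * P @ V                              # (S, A) backup
        V = alpha * np.log(np.exp(Q / alpha).sum(axis=1))  # log-sum-exp
    return V

rng = np.random.default_rng(1)
S, A = 5, 3
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)      # normalize into a valid kernel
r1, r2 = rng.random((S, A)), rng.random((S, A))

V1 = soft_value_iteration(P, r1)
V2 = soft_value_iteration(P, r2)
V12 = soft_value_iteration(P, r1 + r2)

# Entropy is counted once for the summed task but twice in V1 + V2:
assert np.all(V12 <= V1 + V2 + 1e-8)
```

The gap `V1 + V2 - V12` is exactly the discounted entropy term counted a second time, which is the kind of correction an exact composition relation must account for.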
arXiv Detail & Related papers (2022-12-02T13:57:53Z) - Lifelong Machine Learning of Functionally Compositional Structures [7.99536002595393]
This dissertation presents a general-purpose framework for lifelong learning of functionally compositional structures.
The framework separates the learning into two stages: learning how to combine existing components to assimilate a novel problem, and learning how to adapt the existing components to accommodate the new problem.
Supervised learning evaluations found that 1) compositional models improve lifelong learning of diverse tasks, 2) the multi-stage process permits lifelong learning of compositional knowledge, and 3) the components learned by the framework represent self-contained and reusable functions.
arXiv Detail & Related papers (2022-07-25T15:24:25Z) - Environment Generation for Zero-Shot Compositional Reinforcement Learning [105.35258025210862]
Compositional Design of Environments (CoDE) trains a Generator agent to automatically build a series of compositional tasks tailored to the agent's current skill level.
We learn to generate environments composed of multiple pages or rooms, and train RL agents capable of completing a wide range of complex tasks in those environments.
CoDE yields a 4x higher success rate than the strongest baseline and demonstrates strong performance on real websites, learned from 3500 primitive tasks.
arXiv Detail & Related papers (2022-01-21T21:35:01Z) - Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z) - Lifelong Learning of Compositional Structures [26.524289609910653]
We present a general-purpose framework for lifelong learning of compositional structures.
Our framework separates the learning process into two broad stages: learning how to best combine existing components in order to assimilate a novel problem, and learning how to adapt the set of existing components to accommodate the new problem.
arXiv Detail & Related papers (2020-07-15T14:58:48Z) - Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves strong compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.