Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward
Machines
- URL: http://arxiv.org/abs/2111.09475v1
- Date: Thu, 18 Nov 2021 02:02:08 GMT
- Title: Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward
Machines
- Authors: Xuejing Zheng, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo
- Abstract summary: We propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM).
We first introduce Sequential Linear Temporal Logic (SLTL), which is a supplement to the existing Linear Temporal Logic formal language.
We then utilize Reward Machines (RM) to exploit structural reward functions for tasks encoded with high-level events.
- Score: 30.161550541362487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuously learning new tasks using high-level ideas or knowledge is a key
capability of humans. In this paper, we propose Lifelong reinforcement learning
with Sequential linear temporal logic formulas and Reward Machines (LSRM),
which enables an agent to leverage previously learned knowledge to speed up
learning of logically specified tasks. For the sake of more flexible
specification of tasks, we first introduce Sequential Linear Temporal Logic
(SLTL), which is a supplement to the existing Linear Temporal Logic (LTL)
formal language. We then utilize Reward Machines (RM) to exploit structural
reward functions for tasks encoded with high-level events, and propose
automatic extension of RM and efficient knowledge transfer over tasks for
continuous learning over a lifetime. Experimental results show that LSRM
outperforms the methods that learn the target tasks from scratch by taking
advantage of the task decomposition using SLTL and knowledge transfer over RM
during the lifelong learning process.
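To make the Reward Machine idea in the abstract concrete, here is a minimal sketch of a generic reward machine: a finite-state machine whose transitions fire on high-level events and emit rewards. It is not the LSRM implementation from the paper; the class name `RewardMachine`, the state names `u0`, `u1`, `u_done`, and the "coffee, then office" task are all hypothetical choices for illustration, and SLTL parsing and knowledge transfer are not covered.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Illustrative finite-state machine driven by high-level events."""
    initial: str
    terminal: set
    # Maps (state, event) -> (next_state, reward).
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Assumption: unlisted (state, event) pairs keep the state, reward 0.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        """Feed a sequence of events; return (final_state, total_reward)."""
        state, total = self.initial, 0.0
        for event in events:
            if state in self.terminal:
                break  # the task is finished, ignore later events
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical task: observe "coffee" first, then "office".
rm = RewardMachine(
    initial="u0",
    terminal={"u_done"},
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_done", 1.0),
    },
)

final_state, total_reward = rm.run(["office", "coffee", "office"])
```

In this sketch the first "office" event is ignored because the machine is still in `u0`; reward is only paid once the events occur in the required order, which is the kind of structure a reward machine exposes to the learner.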
Related papers
- Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards [49.7719149179179]
This paper investigates the feasibility of using PPO for reinforcement learning (RL) from explicitly programmed reward signals.
We focus on tasks expressed through formal languages, such as programming, where explicit reward functions can be programmed to automatically assess quality of generated outputs.
Our results show that pure RL-based training for the two formal language tasks is challenging, with success being limited even for the simple arithmetic task.
arXiv Detail & Related papers (2024-10-22T15:59:58Z)
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
- Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever [48.5585921817745]
Large Language Models (LLMs) are used to automate the knowledge tagging task.
We show the strong performance of zero- and few-shot results over math questions knowledge tagging tasks.
By proposing a reinforcement learning-based demonstration retriever, we successfully exploit the great potential of different-sized LLMs.
arXiv Detail & Related papers (2024-06-19T23:30:01Z)
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z)
- PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control [55.81022882408587]
Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making.
We propose a novel view that treats inducing temporal action abstractions as a sequence compression problem.
We introduce an approach that combines continuous action quantization with byte pair encoding to learn powerful action abstractions.
arXiv Detail & Related papers (2024-02-16T04:55:09Z)
- Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors.
We propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS).
LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
arXiv Detail & Related papers (2024-02-06T04:00:21Z)
- Chain of History: Learning and Forecasting with LLMs for Temporal Knowledge Graph Completion [24.545917737620197]
Temporal Knowledge Graph Completion (TKGC) is a complex task involving the prediction of missing event links at future timestamps.
This paper aims to provide a comprehensive perspective on harnessing the advantages of Large Language Models for reasoning in temporal knowledge graphs.
arXiv Detail & Related papers (2024-01-11T17:42:47Z)
- LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning [64.55001982176226]
LIBERO is a novel benchmark of lifelong learning for robot manipulation.
We focus on how to efficiently transfer declarative knowledge, procedural knowledge, or the mixture of both.
We develop an extendible procedural generation pipeline that can in principle generate infinitely many tasks.
arXiv Detail & Related papers (2023-06-05T23:32:26Z)
- Lifelong Reinforcement Learning with Modulating Masks [16.24639836636365]
Lifelong learning aims to create AI systems that continuously and incrementally learn during a lifetime, similar to biological learning.
Attempts so far have met problems, including catastrophic forgetting, interference among tasks, and the inability to exploit previous knowledge.
We show that lifelong reinforcement learning with modulating masks is a promising approach to lifelong learning, to the composition of knowledge to learn increasingly complex tasks, and to knowledge reuse for efficient and faster learning.
arXiv Detail & Related papers (2022-12-21T15:49:20Z)
- Self-Attention Meta-Learner for Continual Learning [5.979373021392084]
Self-Attention Meta-Learner (SAM) learns prior knowledge for continual learning that permits learning a sequence of tasks.
SAM incorporates an attention mechanism that learns to select the particular relevant representation for each future task.
We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks in the task inference.
arXiv Detail & Related papers (2021-01-28T17:35:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.