DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications
- URL: http://arxiv.org/abs/2410.04631v1
- Date: Sun, 6 Oct 2024 21:30:38 GMT
- Title: DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications
- Authors: Mathias Jackermeier, Alessandro Abate
- Abstract summary: Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks in reinforcement learning (RL).
Existing approaches suffer from several shortcomings: they are often only applicable to finite-horizon fragments, are restricted to suboptimal solutions, and do not adequately handle safety constraints.
In this work, we propose a novel learning approach to address these concerns.
Our method leverages the structure of Büchi automata, which explicitly represent the semantics of LTL specifications, to learn policies conditioned on sequences of truth assignments that lead to satisfying the desired formulae.
- Score: 59.01527054553122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks in reinforcement learning (RL). However, learning policies that efficiently satisfy arbitrary specifications not observed during training remains a challenging problem. Existing approaches suffer from several shortcomings: they are often only applicable to finite-horizon fragments of LTL, are restricted to suboptimal solutions, and do not adequately handle safety constraints. In this work, we propose a novel learning approach to address these concerns. Our method leverages the structure of Büchi automata, which explicitly represent the semantics of LTL specifications, to learn policies conditioned on sequences of truth assignments that lead to satisfying the desired formulae. Experiments in a variety of discrete and continuous domains demonstrate that our approach is able to zero-shot satisfy a wide range of finite- and infinite-horizon specifications, and outperforms existing methods in terms of both satisfaction probability and efficiency.
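To make the automaton-based pipeline concrete, the sketch below translates an LTL formula into a Büchi automaton and prints the truth-assignment labels on its transitions, i.e. the kind of symbolic sequences the learned policy is conditioned on. It is a minimal illustration using the Spot library's Python bindings; both the use of Spot and the toy reach-avoid formula are assumptions for illustration, not details from the paper.

```python
# Minimal sketch of the LTL -> Buechi automaton step, assuming the Spot
# library (https://spot.lre.epita.fr/). DeepLTL's actual tooling may differ.
import spot

# Toy reach-avoid specification (not from the paper): eventually reach
# `goal` while always avoiding `hazard`.
formula = 'F goal & G !hazard'
aut = spot.translate(formula, 'Buchi')

# Each edge is labelled with a Boolean condition over atomic propositions,
# i.e. a set of truth assignments; sequences of such assignments that
# drive the automaton to acceptance are what the policy conditions on.
bdict = aut.get_dict()
for state in range(aut.num_states()):
    for edge in aut.out(state):
        label = spot.bdd_format_formula(bdict, edge.cond)
        print(f'{edge.src} --[{label}]--> {edge.dst}  acc={edge.acc}')
```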
Related papers
- Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
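To illustrate the idea of casting the automaton as a Markov reward process: value iteration over the automaton graph, with reward on accepting edges, yields state values that grow with proximity to acceptance and can therefore direct exploration. The sketch below is a hedged toy version; the dict encoding and the uniform edge distribution are simplifying assumptions, not the paper's construction.

```python
# Hedged sketch: value-iterate over an automaton graph treated as a
# Markov reward process. Encoding (state -> [(successor, accepting)])
# and uniform edge distribution are hypothetical simplifications.
def automaton_values(edges, gamma=0.9, sweeps=200):
    v = {s: 0.0 for s in edges}
    for _ in range(sweeps):
        for s, succs in edges.items():
            # Accepting edges yield reward 1, so v[s] grows the closer
            # s is to acceptance -- a natural exploration signal.
            v[s] = sum((1.0 if acc else 0.0) + gamma * v[t]
                       for t, acc in succs) / len(succs)
    return v

# Toy 3-state automaton where only state 2's self-loop is accepting.
toy = {0: [(0, False), (1, False)], 1: [(2, False)], 2: [(2, True)]}
print(automaton_values(toy))  # values increase from state 0 to state 2
```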
arXiv Detail & Related papers (2024-08-18T14:25:44Z)
- LTL-Constrained Policy Optimization with Cycle Experience Replay [19.43224037705577]
Linear Temporal Logic (LTL) offers a precise means for constraining the behavior of reinforcement learning agents.
We present Cycle Experience Replay (CyclER), a reward-shaping approach to this problem that supports continuous state and action spaces.
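The summary does not spell out CyclER's shaping term; as a grounding point, the sketch below shows the standard potential-based shaping skeleton (Ng et al., 1999) that automaton-guided approaches of this kind build on. The `automaton_progress` potential is a hypothetical stand-in, not CyclER's actual construction.

```python
# Generic potential-based reward shaping, r' = r + gamma*phi(q') - phi(q),
# which preserves optimal policies. `automaton_progress` is a hypothetical
# stand-in for a measure of progress toward an accepting cycle.
def automaton_progress(q: int, num_states: int) -> float:
    return q / max(1, num_states - 1)  # toy: later automaton states = closer

def shaped_reward(r: float, q: int, q_next: int,
                  num_states: int = 3, gamma: float = 0.99) -> float:
    phi = lambda s: automaton_progress(s, num_states)
    return r + gamma * phi(q_next) - phi(q)

print(shaped_reward(0.0, q=0, q_next=1))  # positive bonus for progress
```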
arXiv Detail & Related papers (2024-04-17T17:24:44Z)
- Neural Fields with Hard Constraints of Arbitrary Differential Order [61.49418682745144]
We develop a series of approaches for enforcing hard constraints on neural fields.
The constraints can be specified as a linear operator applied to the neural field and its derivatives.
Our approaches are demonstrated in a wide range of real-world applications.
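One classic way to realize such hard constraints is by construction: multiply an unconstrained network by a factor that vanishes wherever the constraint must hold. The PyTorch sketch below enforces a single point constraint f(x0) = c exactly; it is a generic illustration of the idea, not the paper's method for arbitrary linear differential operators.

```python
# Hedged sketch: enforce f(x0) = c by construction via
# f(x) = c + (x - x0) * g(x), with g unconstrained. Generic illustration
# only; the paper treats general linear operators on the field and its
# derivatives.
import torch
import torch.nn as nn

class ConstrainedField(nn.Module):
    def __init__(self, x0: float = 0.0, c: float = 1.0):
        super().__init__()
        self.x0, self.c = x0, c
        self.g = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The (x - x0) factor zeroes g's contribution at x0, so the
        # constraint holds exactly for any parameters of g.
        return self.c + (x - self.x0) * self.g(x)

field = ConstrainedField()
print(field(torch.zeros(1, 1)))  # exactly c = 1.0 at x0, by construction
```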
arXiv Detail & Related papers (2023-06-15T08:33:52Z)
- An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-Markov Decision Processes [47.037877670620524]
We present an option-dependent upper bound to the regret suffered by regret minimization algorithms in finite-horizon problems.
We illustrate that the performance improvement derives from the planning horizon reduction induced by the temporal abstraction enforced by the hierarchical structure.
arXiv Detail & Related papers (2023-05-10T15:00:05Z)
- Reinforcement Learning for General LTL Objectives Is Intractable [10.69663517250214]
We formalize the problem under the probably approximately correct learning in Markov decision processes (PAC-MDP) framework.
Our result implies it is impossible for a reinforcement-learning algorithm to obtain a PAC-MDP guarantee on the performance of its learned policy.
arXiv Detail & Related papers (2021-11-24T18:26:13Z)
- Learning Implicitly with Noisy Data in Linear Arithmetic [94.66549436482306]
We extend implicit learning in PAC-Semantics to handle intervals and threshold uncertainty in the language of linear arithmetic.
We show that our implicit approach to learning optimal linear programming objective constraints significantly outperforms an explicit approach in practice.
arXiv Detail & Related papers (2020-10-23T19:08:46Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
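The construction underlying such guarantees is the product of the environment MDP with the specification automaton: the agent's state is augmented with the automaton state, and reward is tied to accepting transitions. Below is a minimal toy sketch of that product construction; the classes, labelling, and reward choice are illustrative assumptions, not the paper's algorithm.

```python
# Hedged toy sketch of a product MDP: environment and specification
# automaton advance in lockstep, and reward marks accepting transitions.
class ToyAutomaton:
    """Monitors 'F goal': accepts once `goal` has been observed."""
    def delta(self, q, labels):
        if q == 1 or 'goal' in labels:
            return 1, True   # accepting sink
        return 0, False

class ToyCorridor:
    """1-D corridor of positions 0..3; position 3 is labelled `goal`."""
    def step(self, s, a):  # a in {-1, +1}
        s2 = max(0, min(3, s + a))
        return s2, ({'goal'} if s2 == 3 else set())

def product_step(env, aut, env_s, q, action):
    env_s2, labels = env.step(env_s, action)
    q2, accepting = aut.delta(q, labels)
    reward = 1.0 if accepting else 0.0  # reward only on acceptance
    return (env_s2, q2), reward

env, aut = ToyCorridor(), ToyAutomaton()
state = (0, 0)
for a in (+1, +1, +1):  # walk right toward the goal
    state, r = product_step(env, aut, state[0], state[1], a)
    print(state, r)     # reward 1.0 once `goal` is reached
```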
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.