Funnel-based Reward Shaping for Signal Temporal Logic Tasks in
Reinforcement Learning
- URL: http://arxiv.org/abs/2212.03181v3
- Date: Sun, 3 Dec 2023 05:50:22 GMT
- Title: Funnel-based Reward Shaping for Signal Temporal Logic Tasks in
Reinforcement Learning
- Authors: Naman Saxena, Gorantla Sandeep, Pushpak Jagtap
- Abstract summary: We propose a tractable reinforcement learning algorithm to learn a controller that enforces Signal Temporal Logic (STL) specifications.
We demonstrate the utility of our approach on several STL tasks using different environments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Signal Temporal Logic (STL) is a powerful framework for describing the
complex temporal and logical behaviour of dynamical systems. Numerous
studies have attempted to employ reinforcement learning to learn a controller
that enforces STL specifications; however, they have been unable to effectively
tackle the challenges of ensuring robust satisfaction in continuous state space
and maintaining tractability. In this paper, leveraging the concept of funnel
functions, we propose a tractable reinforcement learning algorithm to learn a
time-dependent policy for robust satisfaction of STL specifications in
continuous state space. We demonstrate the utility of our approach on several
STL tasks using different environments.
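As an illustration of the general funnel idea only (the funnel parameters, the tanh shaping, and the external robustness monitor below are assumptions, not the paper's definitions), a minimal reward-shaping sketch might look like:

```python
import numpy as np

# Illustrative funnel-based shaping, assuming an exponentially shrinking funnel
# gamma(t) and an external STL robustness monitor supplying rho(s_t); the
# concrete funnel parameters and the tanh shaping are assumptions, not the
# paper's definitions.

def funnel(t, gamma_0=2.0, gamma_inf=0.1, decay=0.5):
    """Prescribed performance funnel gamma(t), shrinking from gamma_0 to gamma_inf."""
    return (gamma_0 - gamma_inf) * np.exp(-decay * t) + gamma_inf

def shaped_reward(rho, t):
    """Reward a time-dependent policy for keeping robustness rho above -gamma(t)."""
    margin = rho + funnel(t)       # distance above the lower funnel boundary
    return float(np.tanh(margin))  # bounded shaping signal, positive inside the funnel

# A robustness trace that improves over time earns an increasing reward.
for t, rho in enumerate([-1.5, -0.8, -0.2, 0.3]):
    print(t, round(shaped_reward(rho, t), 3))
```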
Related papers
- DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications [59.01527054553122]
Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks in reinforcement learning (RL).
Existing approaches suffer from several shortcomings: they are often only applicable to finite-horizon fragments, are restricted to suboptimal solutions, and do not adequately handle safety constraints.
In this work, we propose a novel learning approach to address these concerns.
Our method leverages the structure of Büchi automata, which explicitly represent the semantics of LTL specifications, to learn policies conditioned on sequences of truth assignments that lead to satisfying the desired formulae.
arXiv Detail & Related papers (2024-10-06T21:30:38Z) - Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
arXiv Detail & Related papers (2024-08-18T14:25:44Z) - Signal Temporal Logic Neural Predictive Control [15.540490027770621]
We propose a method to learn a neural network controller to satisfy the requirements specified in Signal Temporal Logic (STL).
Our controller learns to roll out trajectories to maximize the STL robustness score in training.
A backup policy is designed to ensure safety when our controller fails.
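A rough sketch of the "roll out and score by STL robustness" idea, using random shooting with a toy dynamics step() and a toy robustness() in place of the paper's neural controller:

```python
import numpy as np

# Random-shooting illustration of scoring rollouts by STL robustness; the toy
# single-integrator dynamics and the toy "eventually x >= goal" score are
# placeholders, and the paper trains a neural controller rather than searching
# action sequences online.

def step(x, u):
    return x + 0.1 * u                      # toy single-integrator dynamics

def robustness(traj, goal=1.0):
    return max(x - goal for x in traj)      # toy score for "eventually x >= goal"

def best_action_sequence(x0, horizon=10, n_candidates=64, seed=0):
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    best_seq, best_rho = None, -np.inf
    for u_seq in candidates:
        x, traj = x0, []
        for u in u_seq:
            x = step(x, u)
            traj.append(x)
        rho = robustness(traj)
        if rho > best_rho:
            best_seq, best_rho = u_seq, rho
    return best_seq, best_rho

print(best_action_sequence(0.0)[1])
```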
arXiv Detail & Related papers (2023-09-10T20:31:25Z) - Diagnostic Spatio-temporal Transformer with Faithful Encoding [54.02712048973161]
This paper addresses the task of anomaly diagnosis when the underlying data generation process has a complex spatio-temporal (ST) dependency.
We formalize the problem as supervised dependency discovery, where the ST dependency is learned as a side product of time-series classification.
We show that the temporal positional encoding used in existing ST transformer works has a serious limitation in capturing higher frequencies (short time scales).
We also propose a new ST dependency discovery framework, which can provide readily consumable diagnostic information in both spatial and temporal directions.
arXiv Detail & Related papers (2023-05-26T05:31:23Z) - STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning [8.680676599607125]
Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics.
In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents.
We provide a systematic way of generating rewards in real-time by using the quantitative semantics of Signal Temporal Logic (STL).
We evaluate our STL-based reinforcement learning mechanism on several complex continuous control benchmarks and compare our STL semantics with those available in the literature in terms of their efficacy in synthesizing the controller agent.
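A minimal sketch of STL quantitative (robustness) semantics for a small fragment, which could supply such real-time rewards; the fragment and example formula are illustrative, not the specific semantics compared in the paper:

```python
import numpy as np

# Minimal quantitative (robustness) semantics for a small STL fragment over a
# finite scalar trace; illustrative only.

def rho_pred(signal, threshold):
    """Predicate x >= threshold, evaluated pointwise."""
    return np.asarray(signal, dtype=float) - threshold

def rho_always(rho, a, b):
    """G_[a,b]: worst case (min) of the sub-robustness over each window."""
    return np.array([rho[t + a:min(t + b + 1, len(rho))].min()
                     for t in range(len(rho) - a)])

def rho_eventually(rho, a, b):
    """F_[a,b]: best case (max) of the sub-robustness over each window."""
    return np.array([rho[t + a:min(t + b + 1, len(rho))].max()
                     for t in range(len(rho) - a)])

# Per-step rewards for "always in [0,3], x >= 1.0" on a short trace:
x = [0.5, 1.2, 1.4, 0.9, 1.1, 1.3, 1.5]
print(rho_always(rho_pred(x, 1.0), 0, 3))
```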
arXiv Detail & Related papers (2022-12-02T08:31:46Z) - Deep reinforcement learning under signal temporal logic constraints
using Lagrangian relaxation [0.0]
In general, constraints may be imposed on the decision making.
We consider optimal decision-making problems with constraints for completing temporal high-level tasks.
We propose a two-phase constrained DRL algorithm using the Lagrangian relaxation method.
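A minimal sketch of the primal-dual multiplier update underlying Lagrangian relaxation (the return/cost estimates and learning rate are placeholders; the paper's two-phase structure is not reproduced here):

```python
# Primal-dual logic behind Lagrangian relaxation for a constrained RL objective
#   max_theta  E[R] - lambda * (E[C] - d),
# with dual ascent on the multiplier lambda. Estimates below are placeholders.

def lagrangian_objective(expected_return, expected_cost, cost_limit, lmbda):
    """Unconstrained surrogate the policy maximizes at a fixed multiplier."""
    return expected_return - lmbda * (expected_cost - cost_limit)

def dual_ascent_step(lmbda, expected_cost, cost_limit, lr_dual=0.01):
    """Raise lambda while the constraint E[C] <= d is violated; keep lambda >= 0."""
    return max(0.0, lmbda + lr_dual * (expected_cost - cost_limit))

lmbda, cost_limit = 0.0, 0.2
for expected_cost in [0.8, 0.6, 0.3, 0.15]:   # cost estimates shrinking over training
    lmbda = dual_ascent_step(lmbda, expected_cost, cost_limit)
    print(round(lmbda, 4))
```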
arXiv Detail & Related papers (2022-01-21T00:56:25Z) - Learning from Demonstrations using Signal Temporal Logic [1.2182193687133713]
Learning-from-demonstrations is an emerging paradigm to obtain effective robot control policies.
We use Signal Temporal Logic to evaluate and rank the quality of demonstrations.
We show that our approach outperforms the state-of-the-art Maximum Causal Entropy Inverse Reinforcement Learning.
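A minimal sketch of ranking demonstrations by STL robustness, with robustness_fn standing in for an assumed external monitor:

```python
# Rank demonstrations by the robustness of the task's STL specification;
# robustness_fn is an assumed monitor returning a float (higher = spec
# satisfied more strongly), and the traces below are toy data.

def rank_demonstrations(demos, robustness_fn):
    """Order demonstrations from most to least robustly satisfying the spec."""
    return sorted(demos, key=robustness_fn, reverse=True)

# Toy usage: 1-D traces ranked against "eventually x >= 1.0".
demos = [[0.2, 0.5, 0.9], [0.1, 1.4, 1.2], [0.0, 1.05, 0.8]]
print(rank_demonstrations(demos, lambda d: max(d) - 1.0))
```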
arXiv Detail & Related papers (2021-02-15T18:28:36Z) - Multi-Agent Reinforcement Learning with Temporal Logic Specifications [65.79056365594654]
We study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment.
We develop the first multi-agent reinforcement learning technique for temporal logic specifications.
We provide correctness and convergence guarantees for our main algorithm.
arXiv Detail & Related papers (2021-02-01T01:13:03Z) - Online Reinforcement Learning Control by Direct Heuristic Dynamic
Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
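A minimal sketch of an event-trigger gate that ignores noise-level events (the norm test and threshold are assumptions, not the paper's exact triggering condition):

```python
import numpy as np

# Illustrative event-trigger gate: parameters are updated only when the state
# has changed enough since the last update, so noise-level events are ignored.

def should_update(x, x_at_last_update, threshold=0.05):
    return np.linalg.norm(x - x_at_last_update) > threshold

x_last = np.zeros(2)
for x in [np.array([0.01, 0.00]), np.array([0.02, 0.01]), np.array([0.20, 0.10])]:
    if should_update(x, x_last):
        x_last = x                 # the actor/critic (dHDP) update would run here
        print("update triggered at state", x)
```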
arXiv Detail & Related papers (2020-06-16T05:51:25Z) - Tractable Reinforcement Learning of Signal Temporal Logic Objectives [0.0]
Signal temporal logic (STL) is an expressive language to specify time-bound real-world robotic tasks and safety specifications.
Learning to satisfy STL specifications often needs a sufficient length of state history to compute reward and the next action.
We propose a compact means to capture state history in a new augmented state-space representation.
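A minimal sketch of one possible compact augmentation, assumed here rather than taken from the paper: per-subtask progress flags plus normalized elapsed time appended to the raw state:

```python
import numpy as np

# Possible compact augmentation: instead of the raw state history, keep one
# progress flag per timed sub-task plus normalized elapsed time, and feed the
# concatenation to the policy. The encoding is an assumption.

def augment_state(state, progress_flags, t, horizon):
    return np.concatenate([state, progress_flags.astype(float), [t / horizon]])

state = np.array([0.3, -1.2])            # raw environment state
flags = np.array([True, False])          # e.g. "reached region A", "visited region B"
print(augment_state(state, flags, t=12, horizon=100))
```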
arXiv Detail & Related papers (2020-01-26T15:23:54Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.