Signal Temporal Logic Neural Predictive Control
- URL: http://arxiv.org/abs/2309.05131v1
- Date: Sun, 10 Sep 2023 20:31:25 GMT
- Title: Signal Temporal Logic Neural Predictive Control
- Authors: Yue Meng and Chuchu Fan
- Abstract summary: We propose a method to learn a neural network controller to satisfy the requirements specified in Signal temporal logic (STL).
Our controller learns to roll out trajectories to maximize the STL robustness score in training.
A backup policy is designed to ensure safety when our controller fails.
- Score: 15.540490027770621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring safety and meeting temporal specifications are critical challenges
for long-term robotic tasks. Signal temporal logic (STL) has been widely used
to systematically and rigorously specify these requirements. However,
traditional methods of finding the control policy under those STL requirements
are computationally complex and not scalable to high-dimensional systems or systems
with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn
the policy to satisfy the STL specifications via hand-crafted or STL-inspired
rewards, but might encounter unexpected behaviors due to ambiguity and sparsity
in the reward. In this paper, we propose a method to directly learn a neural
network controller to satisfy the requirements specified in STL. Our controller
learns to roll out trajectories to maximize the STL robustness score in
training. In testing, similar to Model Predictive Control (MPC), the learned
controller predicts a trajectory within a planning horizon to ensure the
satisfaction of the STL requirement in deployment. A backup policy is designed
to ensure safety when our controller fails. Our approach can adapt to various
initial conditions and environmental parameters. We conduct experiments on six
tasks, where our method with the backup policy outperforms the classical
methods (MPC, STL-solver), model-free and model-based RL methods in STL
satisfaction rate, especially on tasks with complex STL specifications while
being 10X-100X faster than the classical methods.
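The STL robustness score mentioned in the abstract can be illustrated with the standard quantitative semantics of STL, where "always" takes a minimum over a time window, "eventually" takes a maximum, and a conjunction takes the minimum of its operands. The sketch below is only an illustration under assumptions: the 1-D trajectory, the thresholds, and all function names are hypothetical and are not taken from the paper, and it uses plain min/max rather than whatever differentiable formulation a gradient-based learner would need.

```python
from typing import List, Sequence


def at_least(signal: Sequence[float], threshold: float) -> List[float]:
    """Per-step robustness of the predicate `x >= threshold`."""
    return [x - threshold for x in signal]


def at_most(signal: Sequence[float], threshold: float) -> List[float]:
    """Per-step robustness of the predicate `x <= threshold`."""
    return [threshold - x for x in signal]


def always(rho: Sequence[float], start: int, end: int) -> float:
    """Robustness of G_[start,end] at time 0: the worst case in the window."""
    return min(rho[start:end + 1])


def eventually(rho: Sequence[float], start: int, end: int) -> float:
    """Robustness of F_[start,end] at time 0: the best case in the window."""
    return max(rho[start:end + 1])


def reach_and_stay_safe(traj: Sequence[float]) -> float:
    """Hypothetical spec: F_[0,4](x >= 3) AND G_[0,4](x <= 5).

    A positive score means the trajectory satisfies the spec; the
    magnitude indicates how much margin there is.
    """
    reach = eventually(at_least(traj, 3.0), 0, 4)
    safe = always(at_most(traj, 5.0), 0, 4)
    return min(reach, safe)  # conjunction = min of the two scores


good = [0.0, 1.0, 2.0, 3.5, 4.0]  # reaches x >= 3 and stays below 5
bad = [0.0, 1.0, 2.0, 2.5, 2.8]   # never reaches x >= 3

print(reach_and_stay_safe(good))  # positive: spec satisfied
print(reach_and_stay_safe(bad))   # negative: spec violated
```

A controller trained to maximize such a score is pushed toward trajectories with a positive margin, which is the sense in which the robustness score acts as a training objective.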
Related papers
- Regret-Free Reinforcement Learning for LTL Specifications [6.342676126028222]
Reinforcement learning is a promising method to learn optimal control policies for systems with unknown dynamics.
Current RL-based methods offer only asymptotic guarantees, which provide no insight into the transient performance during the learning phase.
We present the first regret-free online algorithm for learning a controller that addresses the general class of LTL specifications over Markov decision processes.
arXiv Detail & Related papers (2024-11-18T20:01:45Z) - DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications [59.01527054553122]
Linear temporal logic (LTL) has recently been adopted as a powerful formalism for specifying complex, temporally extended tasks in reinforcement learning (RL).
Existing approaches suffer from several shortcomings: they are often only applicable to finite-horizon fragments, are restricted to suboptimal solutions, and do not adequately handle safety constraints.
In this work, we propose a novel learning approach to address these concerns.
Our method leverages the structure of Büchi automata, which explicitly represent the semantics of LTL specifications, to learn policies conditioned on sequences of truth assignments that lead to satisfying the desired formulae.
arXiv Detail & Related papers (2024-10-06T21:30:38Z) - Directed Exploration in Reinforcement Learning from Linear Temporal Logic [59.707408697394534]
Linear temporal logic (LTL) is a powerful language for task specification in reinforcement learning.
We show that the synthesized reward signal remains fundamentally sparse, making exploration challenging.
We show how better exploration can be achieved by further leveraging the specification and casting its corresponding Limit Deterministic Büchi Automaton (LDBA) as a Markov reward process.
arXiv Detail & Related papers (2024-08-18T14:25:44Z) - Learning Robust and Correct Controllers from Signal Temporal Logic
Specifications Using BarrierNet [5.809331819510702]
We exploit STL quantitative semantics to define a notion of robust satisfaction.
We construct a set of trainable High Order Control Barrier Functions (HOCBFs) enforcing the satisfaction of formulas in a fragment of STL.
We train the HOCBFs together with other neural network parameters to further improve the robustness of the controller.
arXiv Detail & Related papers (2023-04-12T21:12:15Z) - Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear
Quadratic Control [85.22735611954694]
We study the problem of adaptive control of stabilizable linear-quadratic regulators (LQRs) using Thompson Sampling (TS).
We propose an efficient TS algorithm for the adaptive control of LQRs, TSAC, that attains $\tilde O(\sqrt{T})$ regret, even for multidimensional systems.
arXiv Detail & Related papers (2022-06-17T02:47:53Z) - Safe RAN control: A Symbolic Reinforcement Learning Approach [62.997667081978825]
We present a Symbolic Reinforcement Learning (SRL) based architecture for safety control of Radio Access Network (RAN) applications.
We provide a purely automated procedure in which a user can specify high-level logical safety specifications for a given cellular network topology.
We introduce a user interface (UI) developed to help a user set intent specifications to the system, and inspect the difference in agent proposed actions.
arXiv Detail & Related papers (2021-06-03T16:45:40Z) - Learning Optimal Strategies for Temporal Tasks in Stochastic Games [23.012106429532633]
We introduce a model-free reinforcement learning (RL) approach to derive controllers from given specifications.
We learn optimal control strategies that maximize the probability of satisfying the specifications against the worst-case environment behavior.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Model-based Reinforcement Learning from Signal Temporal Logic
Specifications [0.17205106391379021]
We propose expressing desired high-level robot behavior using a formal specification language known as Signal Temporal Logic (STL) as an alternative to reward/cost functions.
The proposed algorithm is empirically evaluated on simulations of robotic systems such as a pick-and-place robotic arm and adaptive cruise control for autonomous vehicles.
arXiv Detail & Related papers (2020-11-10T07:31:47Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)