Signal Temporal Logic Neural Predictive Control
- URL: http://arxiv.org/abs/2309.05131v1
- Date: Sun, 10 Sep 2023 20:31:25 GMT
- Title: Signal Temporal Logic Neural Predictive Control
- Authors: Yue Meng and Chuchu Fan
- Abstract summary: We propose a method to learn a neural network controller to satisfy the requirements specified in Signal temporal logic (STL)
Our controller learns to roll out trajectories to maximize the STL robustness score in training.
A backup policy is designed to ensure safety when our controller fails.
- Score: 15.540490027770621
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring safety and meeting temporal specifications are critical challenges
for long-term robotic tasks. Signal temporal logic (STL) has been widely used
to systematically and rigorously specify these requirements. However,
traditional methods of finding the control policy under those STL requirements
are computationally complex and do not scale to high-dimensional systems or those
with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn
the policy to satisfy the STL specifications via hand-crafted or STL-inspired
rewards, but might encounter unexpected behaviors due to ambiguity and sparsity
in the reward. In this paper, we propose a method to directly learn a neural
network controller to satisfy the requirements specified in STL. Our controller
learns to roll out trajectories to maximize the STL robustness score in
training. In testing, similar to Model Predictive Control (MPC), the learned
controller predicts a trajectory within a planning horizon to ensure the
satisfaction of the STL requirement in deployment. A backup policy is designed
to ensure safety when our controller fails. Our approach can adapt to various
initial conditions and environmental parameters. We conduct experiments on six
tasks, where our method with the backup policy outperforms the classical
methods (MPC, STL-solver), model-free and model-based RL methods in STL
satisfaction rate, especially on tasks with complex STL specifications, while
being 10X-100X faster than the classical methods.
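The STL robustness score the controller maximizes comes from STL's quantitative semantics: "always" takes the worst case (min) of a sub-formula's robustness over a time window, and "eventually" takes the best case (max). A minimal sketch of this scoring, using an illustrative 1D signal and the predicate "x > 0" (whose robustness is simply x itself); this is the general semantics, not the paper's implementation:

```python
import numpy as np

def always(rho, lo, hi):
    """Robustness of G[lo,hi] phi: worst case (min) over the window."""
    return np.min(rho[lo:hi + 1])

def eventually(rho, lo, hi):
    """Robustness of F[lo,hi] phi: best case (max) over the window."""
    return np.max(rho[lo:hi + 1])

# For the predicate "x > 0", the per-step robustness is x itself.
x = np.array([0.5, 0.8, 0.3, -0.1, 0.6])

rho_G = always(x, 0, 2)      # min over steps 0..2 -> 0.3 (satisfied)
rho_F = eventually(x, 0, 4)  # max over steps 0..4 -> 0.8 (satisfied)
rho_G_full = always(x, 0, 4) # min over steps 0..4 -> -0.1 (violated)
```

A positive score certifies satisfaction with a margin, and because min/max are subdifferentiable, rolled-out trajectories can be trained by gradient ascent on this score, which is the mechanism the abstract describes.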
Related papers
- There is HOPE to Avoid HiPPOs for Long-memory State Space Models [51.66430224089725]
State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences.
We develop a new parameterization scheme, called HOPE, for LTI systems that utilizes parameters within Hankel operators.
Our model efficiently implements these innovations by nonuniformly sampling the transfer functions of LTI systems.
arXiv Detail & Related papers (2024-05-22T20:20:14Z) - DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control [62.24301794794304]
Deep Adaptive Trajectory Tracking (DATT) is a learning-based approach that can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances in the real world.
DATT significantly outperforms competitive adaptive nonlinear and model predictive controllers for both feasible smooth and infeasible trajectories in unsteady wind fields.
It can efficiently run online with an inference time less than 3.2 ms, less than 1/4 of the adaptive nonlinear model predictive control baseline.
arXiv Detail & Related papers (2023-10-13T12:22:31Z) - Learning Robust and Correct Controllers from Signal Temporal Logic
Specifications Using BarrierNet [5.809331819510702]
We exploit STL quantitative semantics to define a notion of robust satisfaction.
We construct a set of trainable High Order Control Barrier Functions (HOCBFs) enforcing the satisfaction of formulas in a fragment of STL.
We train the HOCBFs together with other neural network parameters to further improve the robustness of the controller.
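The paper's trainable High Order CBFs are beyond a short sketch, but the basic first-order CBF safety filter they generalize can be illustrated. The sketch below assumes a toy single-integrator system x_next = x + u*dt with barrier h(x) = x (safe set x >= 0); the function name and constants are illustrative, not from the paper:

```python
def cbf_filter(x, u_nom, alpha=0.5, dt=0.1):
    """Minimally modify u_nom so the discrete-time CBF condition
    h(x_next) >= (1 - alpha) * h(x) holds for h(x) = x (safe set x >= 0)
    under single-integrator dynamics x_next = x + u * dt."""
    u_min = -alpha * x / dt  # smallest control satisfying the condition
    return max(u_nom, u_min)

# A nominal controller pushes toward the unsafe region; the filter
# intervenes so the state decays toward the boundary but never crosses it.
x = 1.0
for _ in range(50):
    u = cbf_filter(x, u_nom=-2.0)
    x = x + u * 0.1
# x remains nonnegative throughout: safety is enforced by construction.
```

In the paper's setting, the HOCBF parameters (playing the role of `alpha` here) are themselves trained, trading off conservatism against the robustness of STL satisfaction.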
arXiv Detail & Related papers (2023-04-12T21:12:15Z) - Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear
Quadratic Control [85.22735611954694]
We study the problem of adaptive control of stabilizable linear-quadratic regulators (LQRs) using Thompson Sampling (TS)
We propose an efficient TS algorithm for the adaptive control of LQRs, TSAC, that attains $\tilde{O}(\sqrt{T})$ regret, even for multidimensional systems.
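The TS-for-LQR idea can be sketched in one dimension (this is a toy illustration, not the paper's TSAC algorithm): maintain a Gaussian posterior over the unknown dynamics parameters (a, b) of x_next = a*x + b*u, sample a model each step, and play the LQR gain that is optimal for the sampled model. All constants below are illustrative:

```python
import numpy as np

def lqr_gain(a, b, q=1.0, r=1.0, iters=200):
    """Scalar discrete-time Riccati iteration for x_next = a*x + b*u."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)  # optimal control is u = -k * x

rng = np.random.default_rng(0)
a_true, b_true, sigma = 1.2, 1.0, 0.05  # open-loop unstable plant

Lam = np.eye(2)      # posterior precision over theta = (a, b)
stat = np.zeros(2)   # accumulates phi * x_next / sigma^2
x = 1.0
for t in range(300):
    mean = np.linalg.solve(Lam, stat)
    theta = rng.multivariate_normal(mean, np.linalg.inv(Lam))  # Thompson sample
    a_s = theta[0]
    b_s = theta[1] if abs(theta[1]) > 0.1 else 0.1  # crude guard for this toy
    u = -lqr_gain(a_s, b_s) * x
    x_next = a_true * x + b_true * u + sigma * rng.standard_normal()
    phi = np.array([x, u])                 # Bayesian linear regression update
    Lam += np.outer(phi, phi) / sigma**2
    stat += phi * x_next / sigma**2
    x = x_next
```

As the posterior concentrates, the sampled gains approach the true optimal gain and the closed loop stabilizes; the paper's contribution is showing that a carefully designed version of this scheme attains $\tilde{O}(\sqrt{T})$ regret.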
arXiv Detail & Related papers (2022-06-17T02:47:53Z) - Safe RAN control: A Symbolic Reinforcement Learning Approach [62.997667081978825]
We present a Symbolic Reinforcement Learning (SRL) based architecture for safety control of Radio Access Network (RAN) applications.
We provide a purely automated procedure in which a user can specify high-level logical safety specifications for a given cellular network topology.
We introduce a user interface (UI) developed to help a user set intent specifications for the system and inspect the differences in agent-proposed actions.
arXiv Detail & Related papers (2021-06-03T16:45:40Z) - Learning Optimal Strategies for Temporal Tasks in Stochastic Games [23.012106429532633]
We introduce a model-free reinforcement learning (RL) approach to derive controllers from given specifications.
We learn optimal control strategies that maximize the probability of satisfying the specifications against the worst-case environment behavior.
arXiv Detail & Related papers (2021-02-08T16:10:50Z) - Model-based Reinforcement Learning from Signal Temporal Logic
Specifications [0.17205106391379021]
We propose expressing desired high-level robot behavior using a formal specification language known as Signal Temporal Logic (STL) as an alternative to reward/cost functions.
The proposed algorithm is empirically evaluated on simulations of robotic systems such as a pick-and-place robotic arm, and adaptive cruise control for autonomous vehicles.
arXiv Detail & Related papers (2020-11-10T07:31:47Z) - Recurrent Neural Network Controllers for Signal Temporal Logic
Specifications Subject to Safety Constraints [0.2320417845168326]
We propose a framework based on Recurrent Neural Networks (RNNs) to determine an optimal control strategy for a discrete-time system.
RNNs can store information about a system over time, enabling us to determine satisfaction of the dynamic temporal requirements specified in Signal Temporal Logic formulae.
arXiv Detail & Related papers (2020-09-24T03:34:02Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO)
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z) - Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs)
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.