Verified Compositional Neuro-Symbolic Control for Stochastic Systems with Temporal Logic Tasks
- URL: http://arxiv.org/abs/2311.10863v3
- Date: Wed, 22 Nov 2023 03:01:59 GMT
- Authors: Jun Wang, Haojun Chen, Zihe Sun, Yiannis Kantaros
- Abstract summary: Several methods have been proposed recently to learn neural network (NN) controllers for autonomous agents.
A key challenge within these approaches is that they often lack safety guarantees or the provided guarantees are impractical.
This paper addresses this challenge by checking whether there exists a temporal composition of the trained NN controllers that satisfies the assigned task.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several methods have been proposed recently to learn neural network (NN)
controllers for autonomous agents, with unknown and stochastic dynamics, tasked
with complex missions captured by Linear Temporal Logic (LTL). Due to the
sample-inefficiency of the majority of these works, compositional learning
methods have been proposed that decompose the LTL specification into smaller
sub-tasks; separate controllers are then learned and composed to satisfy the
original task. A key challenge within these approaches is that they often lack
safety guarantees or the provided guarantees are impractical. This paper aims
to address this challenge. Particularly, we consider autonomous systems with
unknown and stochastic dynamics and LTL-encoded tasks. We assume that the
system is equipped with a finite set of base skills modeled by trained NN
feedback controllers. Our goal is to check if there exists a temporal
composition of the trained NN controllers - and if so, to compute it - that
will yield a composite system behavior that satisfies the assigned LTL task
with probability one. We propose a new approach that relies on a novel
integration of automata theory and data-driven reachability analysis tools for
NN-controlled stochastic systems. The resulting neuro-symbolic controller
allows the agent to generate safe behaviors for unseen complex temporal logic
tasks in a zero-shot fashion by leveraging its base skills. We show correctness
of the proposed method and we provide conditions under which it is complete. To
the best of our knowledge, this is the first work that designs verified
temporal compositions of NN controllers for unknown and stochastic systems.
Finally, we provide extensive numerical simulations and hardware experiments on
robot navigation tasks to demonstrate the proposed method.
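As a rough illustration of the composition idea in the abstract, the sketch below searches the product of a task automaton and a table of verified skill transitions for a skill sequence that reaches an accepting state. All names here (regions, skills, the DFA) are hypothetical stand-ins: in the paper the transition guarantees come from data-driven reachability analysis of the NN-controlled stochastic system, not from a hand-written table.

```python
from collections import deque

# Hypothetical table of verified guarantees: "from region A, applying
# skill s drives the system into region B with probability one".
SKILLS = {
    ("start", "skill_goto_door"): "door",
    ("door", "skill_open_door"): "room",
    ("room", "skill_goto_goal"): "goal",
}

# A toy DFA for the LTL task (here: "eventually reach goal"),
# with transitions keyed by the label of the region just entered.
DFA = {
    ("q0", "start"): "q0",
    ("q0", "door"): "q0",
    ("q0", "room"): "q0",
    ("q0", "goal"): "q_acc",
}
ACCEPTING = {"q_acc"}

def compose_skills(init_region, init_state="q0"):
    """BFS over the product of DFA states and regions; returns a skill
    sequence reaching an accepting DFA state, or None if none exists."""
    queue = deque([(init_state, init_region, [])])
    seen = {(init_state, init_region)}
    while queue:
        q, region, plan = queue.popleft()
        if q in ACCEPTING:
            return plan
        for (src, skill), dst in SKILLS.items():
            if src != region:
                continue
            q_next = DFA.get((q, dst), q)
            if (q_next, dst) not in seen:
                seen.add((q_next, dst))
                queue.append((q_next, dst, plan + [skill]))
    return None  # no verified composition exists; the method reports failure
```

Returning `None` when the search is exhausted mirrors the paper's soundness requirement: the composed controller is only emitted when every step of the plan is backed by a verified transition.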
Related papers
- Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation [67.63756749551924]
Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control.
Lyapunov stability guarantees over the region-of-attraction (ROA) for NN controllers with nonlinear dynamical systems are challenging to obtain.
We demonstrate a new framework for learning NN controllers together with Lyapunov certificates using fast empirical falsification and strategic regularizations.
arXiv Detail & Related papers (2024-04-11T17:49:15Z)
- Scaling Learning based Policy Optimization for Temporal Logic Tasks by Controller Network Dropout [4.421486904657393]
We introduce a model-based approach for training feedback controllers for an autonomous agent operating in a highly nonlinear environment.
We show how this learning problem is similar to training recurrent neural networks (RNNs), where the number of recurrent units is proportional to the temporal horizon of the agent's task objectives.
We introduce a novel gradient approximation algorithm based on the idea of dropout or gradient sampling.
arXiv Detail & Related papers (2024-03-23T12:53:51Z)
- The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning [54.56905063752427]
Neuro-Symbolic AI (NeSy) holds promise to ensure the safe deployment of AI systems.
Existing pipelines that train the neural and symbolic components sequentially require extensive labelling.
A new architecture, NeSyGPT, fine-tunes a vision-language foundation model to extract symbolic features from raw data.
arXiv Detail & Related papers (2024-02-02T20:33:14Z)
- Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications [11.812602599752294]
We consider robots with unknown dynamics operating in environments with unknown structure.
Our goal is to synthesize a control policy that maximizes the probability of satisfying an automaton-encoded task.
We propose a novel DRL algorithm that learns control policies at a notably faster rate than similar methods.
arXiv Detail & Related papers (2023-11-28T18:59:58Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Learning Robust and Correct Controllers from Signal Temporal Logic Specifications Using BarrierNet [5.809331819510702]
We exploit STL quantitative semantics to define a notion of robust satisfaction.
We construct a set of trainable High Order Control Barrier Functions (HOCBFs) enforcing the satisfaction of formulas in a fragment of STL.
We train the HOCBFs together with other neural network parameters to further improve the robustness of the controller.
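The robust-satisfaction notion mentioned above rests on STL's standard quantitative semantics: a predicate's robustness is its signed margin, "always" takes the minimum over the trace, and "eventually" the maximum. A minimal sketch (the threshold `c` and the traces are illustrative):

```python
def rho_pred(x, c):
    """Robustness of the predicate x > c: positive iff satisfied,
    with magnitude equal to the margin."""
    return x - c

def rho_always(trace, c):
    """rho(G (x > c)) = minimum margin over the whole trace."""
    return min(rho_pred(x, c) for x in trace)

def rho_eventually(trace, c):
    """rho(F (x > c)) = maximum margin over the whole trace."""
    return max(rho_pred(x, c) for x in trace)
```

A controller that maximizes such robustness values satisfies the formula with slack, which is what "robust satisfaction" buys over plain Boolean satisfaction.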
arXiv Detail & Related papers (2023-04-12T21:12:15Z)
- STL-Based Synthesis of Feedback Controllers Using Reinforcement Learning [8.680676599607125]
Deep Reinforcement Learning (DRL) has the potential to be used for synthesizing feedback controllers (agents) for various complex systems with unknown dynamics.
In RL, the reward function plays a crucial role in specifying the desired behaviour of these agents.
We provide a systematic way of generating rewards in real time by using the quantitative semantics of Signal Temporal Logic (STL).
We evaluate our STL-based reinforcement learning mechanism on several complex continuous control benchmarks and compare our STL semantics with those available in the literature in terms of their efficacy in synthesizing the controller agent.
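One simple way to turn STL robustness into a real-time reward signal, sketched below under illustrative assumptions (the formula is a toy "always x > c", and this is not necessarily the paper's exact construction), is to emit the per-step change in the running robustness, so that the return telescopes to the robustness of the whole trajectory:

```python
class STLRewardGenerator:
    """Online reward for the toy formula 'always (x > c)'.

    The running robustness is the minimum margin seen so far; each step's
    reward is the change in that minimum, so summed rewards equal the
    final robustness of the trajectory."""

    def __init__(self, c=0.0):
        self.c = c
        self.running = None  # robustness of the prefix seen so far

    def step(self, x):
        rho = x - self.c
        if self.running is None:
            self.running = rho
            return rho  # first step: reward is the initial margin
        new_running = min(self.running, rho)
        reward = new_running - self.running  # <= 0: penalize margin drops
        self.running = new_running
        return reward
```

Because the rewards telescope, an RL agent maximizing return is maximizing trajectory robustness, which is the core idea behind STL-based reward generation.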
arXiv Detail & Related papers (2022-12-02T08:31:46Z)
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- Provably Safe Model-Based Meta Reinforcement Learning: An Abstraction-Based Approach [3.569867801312134]
We consider the problem of training a provably safe Neural Network (NN) controller for uncertain nonlinear dynamical systems.
Our approach is to learn a set of NN controllers during the training phase.
When the task becomes available at runtime, our framework will carefully select a subset of these NN controllers and compose them to form the final NN controller.
arXiv Detail & Related papers (2021-09-03T00:38:05Z)
- Learning to Control PDEs with Differentiable Physics [102.36050646250871]
We present a novel hierarchical predictor-corrector scheme which enables neural networks to learn to understand and control complex nonlinear physical systems over long time frames.
We demonstrate that our method successfully develops an understanding of complex physical systems and learns to control them for tasks involving PDEs.
arXiv Detail & Related papers (2020-01-21T11:58:41Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
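The logic-guided RL setup above can be caricatured as an automaton monitor run in product with the environment, with reward issued on accepting automaton transitions. The monitor, labels, and states below are hypothetical toys; the paper's actual construction uses limit-deterministic automata and comes with maximal-probability guarantees that this sketch does not capture.

```python
# Toy DFA monitor: stay in q0 while "safe", move to accepting qf on "goal".
DFA_T = {("q0", "safe"): "q0", ("q0", "goal"): "qf"}
ACC = {"qf"}

class ProductEnv:
    """Wraps an environment's labeling function with an automaton monitor,
    so a standard RL agent is rewarded only when the monitor accepts."""

    def __init__(self, label_fn):
        self.label_fn = label_fn  # maps an env state to an atomic proposition
        self.q = "q0"

    def reward(self, env_state):
        label = self.label_fn(env_state)
        self.q = DFA_T.get((self.q, label), self.q)  # advance the monitor
        return 1.0 if self.q in ACC else 0.0
```

The key point is that the reward is defined over the product state, so maximizing expected return aligns the learned policy with satisfying the temporal specification.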
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.