Model-Based Safe Policy Search from Signal Temporal Logic Specifications Using Recurrent Neural Networks
- URL: http://arxiv.org/abs/2103.15938v1
- Date: Mon, 29 Mar 2021 20:21:55 GMT
- Title: Model-Based Safe Policy Search from Signal Temporal Logic Specifications Using Recurrent Neural Networks
- Authors: Wenliang Liu and Calin Belta
- Score: 1.005130974691351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a policy search approach to learn controllers from specifications
given as Signal Temporal Logic (STL) formulae. The system model is unknown, and
it is learned together with the control policy. The model is implemented as a
feedforward neural network (FNN). To capture the history dependency of the STL
specification, we use a recurrent neural network (RNN) to implement the control
policy. In contrast to prevalent model-free methods, the learning approach
proposed here takes advantage of the learned model and is more efficient. We
use control barrier functions (CBFs) with the learned model to improve the
safety of the system. We validate our algorithm via simulations. The results
show that our approach can satisfy the given specification within very few
system runs, and therefore it has the potential to be used for on-line control.
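To make the STL machinery concrete: STL formulae admit quantitative (robustness) semantics that map a trajectory to a real number, positive exactly when the specification is satisfied, which is what a policy search can maximize. The following is a minimal illustrative sketch for a bounded-horizon formula, not the authors' implementation:

```python
# Illustrative robustness semantics for simple bounded-horizon STL
# formulae over a discrete-time signal. Positive robustness means the
# signal satisfies the formula; the margin is differentiable in practice,
# which is what makes gradient-based policy search possible.

def pred(signal, f):
    # robustness of the predicate f(x_t) > 0 at each time step
    return [f(x) for x in signal]

def always(rho):
    # G phi: worst case over the horizon
    return min(rho)

def eventually(rho):
    # F phi: best case over the horizon
    return max(rho)

def conj(r1, r2):
    # phi1 and phi2: the weaker of the two margins
    return min(r1, r2)

# Example signal x_t and formula  G(x < 5) and F(x > 3)
signal = [0.0, 1.0, 2.5, 4.0, 3.5]
rho = conj(always(pred(signal, lambda x: 5.0 - x)),
           eventually(pred(signal, lambda x: x - 3.0)))
print(rho)  # positive: the signal satisfies the specification
```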
Related papers
- A Neurosymbolic Approach to the Verification of Temporal Logic Properties of Learning enabled Control Systems [0.0]
We present a model for the verification of Neural Network (NN) controllers for general STL specifications.
We also propose a new approach for neural network controllers with general activation functions.
arXiv Detail & Related papers (2023-03-07T04:08:33Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Recurrent neural network-based Internal Model Control of unknown nonlinear stable systems [0.30458514384586394]
Gated Recurrent Neural Networks (RNNs) have become popular tools for learning dynamical systems.
This paper aims to discuss how these networks can be adopted for the synthesis of Internal Model Control (IMC) architectures.
arXiv Detail & Related papers (2021-08-10T11:02:25Z)
- Stochastic Deep Model Reference Adaptive Control [9.594432031144715]
We present a Deep Neural Network (DNN)-based Model Reference Adaptive Control architecture.
Deep Model Reference Adaptive Control uses a Lyapunov-based method to adapt the output-layer weights of the DNN model in real-time.
A data-driven supervised learning algorithm is used to update the inner-layers parameters.
arXiv Detail & Related papers (2021-08-04T14:05:09Z)
- Reinforcement Learning with External Knowledge by using Logical Neural Networks [67.46162586940905]
A recent neuro-symbolic framework called the Logical Neural Networks (LNNs) can simultaneously provide the key properties of both neural networks and symbolic logic.
We propose an integrated method that enables model-free reinforcement learning from external knowledge sources.
arXiv Detail & Related papers (2021-03-03T12:34:59Z)
- Generating Probabilistic Safety Guarantees for Neural Network Controllers [30.34898838361206]
We use a dynamics model to determine the output properties that must hold for a neural network controller to operate safely.
We develop an adaptive verification approach to efficiently generate an overapproximation of the neural network policy.
We show that our method is able to generate meaningful probabilistic safety guarantees for aircraft collision avoidance neural networks.
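The overapproximation idea can be illustrated with simple interval arithmetic: propagate a set of inputs through a small network to obtain guaranteed bounds on its outputs. This is a hypothetical toy sketch of the general technique, not the adaptive verification method from the paper:

```python
# Illustrative interval overapproximation of a tiny ReLU network's
# output: every input in [-1, 1] is guaranteed to produce an output
# inside the final interval, the kind of bound used to check that a
# controller's output properties hold.

def relu_interval(lo, hi):
    # exact interval image of ReLU for an input interval
    return max(0.0, lo), max(0.0, hi)

def affine_interval(lo, hi, w, b):
    # exact interval image of w*x + b for scalar x in [lo, hi]
    a, c = w * lo + b, w * hi + b
    return min(a, c), max(a, c)

lo, hi = -1.0, 1.0                           # input set
lo, hi = affine_interval(lo, hi, 2.0, 0.0)   # hidden pre-activation
lo, hi = relu_interval(lo, hi)               # ReLU activation
lo, hi = affine_interval(lo, hi, -1.0, 3.0)  # output layer
print((lo, hi))  # all outputs provably lie in this interval
```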
arXiv Detail & Related papers (2021-03-01T18:48:21Z)
- Recurrent Neural Network Controllers for Signal Temporal Logic Specifications Subject to Safety Constraints [0.2320417845168326]
We propose a framework based on Recurrent Neural Networks (RNNs) to determine an optimal control strategy for a discrete-time system.
RNNs can store information about a system over time, enabling us to determine satisfaction of the dynamic temporal requirements specified in Signal Temporal Logic formulae.
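The role of the hidden state can be sketched with a hypothetical one-unit recurrent cell rolled out through a stand-in dynamics model; all names and constants here are illustrative, not taken from the paper:

```python
import math

def rnn_policy(x, h, W=0.5, U=0.3, V=1.0):
    # one-unit recurrent cell: the hidden state h summarizes the history
    # of the trajectory, which history-dependent STL specs require
    h = math.tanh(W * x + U * h)
    u = V * h          # control output computed from the hidden state
    return u, h

def model(x, u):
    # stand-in for a learned discrete-time model x_{t+1} = f(x_t, u_t)
    return 0.9 * x + 0.5 * u

x, h = 1.0, 0.0
traj = [x]
for _ in range(5):
    u, h = rnn_policy(x, h)  # control depends on state and history
    x = model(x, u)          # roll the system forward
    traj.append(x)
print(len(traj))
```

In training, the STL robustness of `traj` would serve as the objective, with gradients flowing back through both the model and the recurrent cell.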
arXiv Detail & Related papers (2020-09-24T03:34:02Z)
- Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven [80.94390916562179]
Time-driven learning refers to the machine learning method that updates parameters in a prediction model continuously as new data arrives.
It is desirable to prevent the time-driven dHDP from updating due to insignificant system events such as noise.
We show how the event-driven dHDP algorithm works in comparison to the original time-driven dHDP.
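The event-driven idea can be sketched as an update rule that fires only when the prediction error exceeds a noise threshold; this is an illustrative toy, not the dHDP algorithm itself:

```python
# Illustrative event-triggered learning: a linear predictor y_hat = theta*x
# updates its parameter only when the prediction error is significant,
# skipping updates caused by small, noise-level events.

def maybe_update(theta, x, y, lr=0.1, threshold=0.05):
    err = y - theta * x
    if abs(err) <= threshold:
        return theta, False            # event too small: no update
    return theta + lr * err * x, True  # gradient step on squared error

theta = 0.0
updated = []
for x, y in [(1.0, 0.02), (1.0, 1.0), (1.0, 1.0)]:
    theta, did = maybe_update(theta, x, y)
    updated.append(did)
print(updated)  # the first, noise-level observation triggers no update
```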
arXiv Detail & Related papers (2020-06-16T05:51:25Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.