Model-based Reinforcement Learning from Signal Temporal Logic Specifications
- URL: http://arxiv.org/abs/2011.04950v1
- Date: Tue, 10 Nov 2020 07:31:47 GMT
- Title: Model-based Reinforcement Learning from Signal Temporal Logic Specifications
- Authors: Parv Kapoor, Anand Balakrishnan, Jyotirmoy V. Deshmukh
- Abstract summary: We propose expressing desired high-level robot behavior using a formal specification language known as Signal Temporal Logic (STL) as an alternative to reward/cost functions.
The proposed algorithm is empirically evaluated on simulations of robotic systems such as a pick-and-place robotic arm and adaptive cruise control for autonomous vehicles.
- Score: 0.17205106391379021
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Techniques based on Reinforcement Learning (RL) are increasingly being used
to design control policies for robotic systems. RL fundamentally relies on
state-based reward functions to encode desired behavior of the robot, and bad
reward functions are prone to exploitation by the learning agent, leading to
behavior that is undesirable in the best case and critically dangerous in the
worst. On the other hand, designing good reward functions for complex tasks is
a challenging problem. In this paper, we propose expressing desired high-level
robot behavior using a formal specification language known as Signal Temporal
Logic (STL) as an alternative to reward/cost functions. We use STL
specifications in conjunction with model-based learning to design model
predictive controllers that try to optimize the satisfaction of the STL
specification over a finite time horizon. The proposed algorithm is empirically
evaluated on simulations of robotic systems such as a pick-and-place robotic
arm and adaptive cruise control for autonomous vehicles.
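To make the recipe in the abstract concrete, the sketch below (Python, not the authors' code) illustrates the two ingredients the paper combines: the quantitative robustness semantics of a small STL fragment, and a sampling-based model-predictive loop that scores candidate action sequences under a learned one-step dynamics model and applies the first action of the best one. The random-shooting planner and all function names here are illustrative assumptions, not details taken from the paper.

import numpy as np

# Quantitative (robustness) semantics for a small STL fragment over a
# discrete-time trajectory x_0, ..., x_T. Positive robustness means the
# formula is satisfied; its magnitude measures how robustly.
def rho_pred(traj, mu):
    # Atomic predicate mu(x_t) > 0, evaluated at every time step.
    return np.array([mu(x) for x in traj])

def rho_always(rho, a, b):
    # G_[a,b] phi: the worst robustness value inside the window.
    return np.min(rho[a:b + 1])

def rho_eventually(rho, a, b):
    # F_[a,b] phi: the best robustness value inside the window.
    return np.max(rho[a:b + 1])

# Random-shooting MPC (an assumed planner): sample action sequences, roll
# each out through the learned dynamics model, keep the sequence whose
# rollout has the highest STL robustness, and apply only its first action
# (receding horizon).
def mpc_step(model, x0, spec_robustness, horizon=10, n_samples=256,
             action_dim=1, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    best_value, best_action = -np.inf, None
    for _ in range(n_samples):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        x, traj = x0, [x0]
        for u in actions:
            x = model(x, u)            # learned one-step dynamics f(x_t, u_t)
            traj.append(x)
        value = spec_robustness(traj)  # robustness of the STL spec on the rollout
        if value > best_value:
            best_value, best_action = value, actions[0]
    return best_action, best_value

For the adaptive cruise control example, a specification such as "always keep the gap above d_safe" could, under the assumption that the gap is the first state component, be encoded as spec_robustness = lambda traj: rho_always(rho_pred(traj, lambda x: x[0] - d_safe), 0, len(traj) - 1).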
Related papers
- Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications [11.812602599752294]
We consider robots with unknown dynamics operating in environments with unknown structure.
Our goal is to synthesize a control policy that maximizes the probability of satisfying an automaton-encoded task.
We propose a novel DRL algorithm, which has the capability to learn control policies at a notably faster rate compared to similar methods.
arXiv Detail & Related papers (2023-11-28T18:59:58Z)
- Signal Temporal Logic Neural Predictive Control [15.540490027770621]
We propose a method to learn a neural network controller that satisfies the requirements specified in Signal Temporal Logic (STL).
Our controller learns to roll out trajectories to maximize the STL robustness score in training.
A backup policy is designed to ensure safety when our controller fails.
arXiv Detail & Related papers (2023-09-10T20:31:25Z)
- Facilitating Sim-to-real by Intrinsic Stochasticity of Real-Time Simulation in Reinforcement Learning for Robot Manipulation [1.6686307101054858]
We investigate the intrinsic stochasticity of real-time simulation (RT-IS) in off-the-shelf simulation software.
RT-IS requires less randomization, is not task-dependent, and achieves better generalizability than conventional domain-randomization-powered agents.
arXiv Detail & Related papers (2023-04-12T12:15:31Z)
- Active Predictive Coding: Brain-Inspired Reinforcement Learning for Sparse Reward Robotic Control Problems [79.07468367923619]
We propose a backpropagation-free approach to robotic control through the neuro-cognitive computational framework of neural generative coding (NGC).
We design an agent built completely from powerful predictive coding/processing circuits that facilitate dynamic, online learning from sparse rewards.
We show that our proposed ActPC agent performs well in the face of sparse (extrinsic) reward signals and is competitive with or outperforms several powerful backprop-based RL approaches.
arXiv Detail & Related papers (2022-09-19T16:49:32Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamics models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning [85.13138591433635]
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and their inability to account for input constraints.
In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques.
arXiv Detail & Related papers (2020-04-15T18:15:49Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL converges faster toward the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)