Safe reinforcement learning for probabilistic reachability and safety
specifications: A Lyapunov-based approach
- URL: http://arxiv.org/abs/2002.10126v1
- Date: Mon, 24 Feb 2020 09:20:03 GMT
- Title: Safe reinforcement learning for probabilistic reachability and safety
specifications: A Lyapunov-based approach
- Authors: Subin Huh, Insoon Yang
- Abstract summary: We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
- Score: 2.741266294612776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Emerging applications in robotics and autonomous systems, such as autonomous
driving and robotic surgery, often involve critical safety constraints that
must be satisfied even when information about system models is limited. In this
regard, we propose a model-free safety specification method that learns the
maximal probability of safe operation by carefully combining probabilistic
reachability analysis and safe reinforcement learning (RL). Our approach
constructs a Lyapunov function with respect to a safe policy to restrain each
policy improvement stage. As a result, it yields a sequence of safe policies
that determine the range of safe operation, called the safe set, which
monotonically expands and gradually converges. We also develop an efficient
safe exploration scheme that accelerates the process of identifying the safety
of unexamined states. Exploiting the Lyapunov shielding, our method regulates
the exploratory policy to avoid dangerous states with high confidence. To
handle high-dimensional systems, we further extend our approach to deep RL by
introducing a Lagrangian relaxation technique to establish a tractable
actor-critic algorithm. The empirical performance of our method is demonstrated
through continuous control benchmark problems, such as a reaching task on a
planar robot arm.
Related papers
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian
Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL)
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical
Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - ISAACS: Iterative Soft Adversarial Actor-Critic for Safety [0.9217021281095907]
This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems.
A safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error.
While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter.
arXiv Detail & Related papers (2022-12-06T18:53:34Z) - Safe Exploration Method for Reinforcement Learning under Existence of
Disturbance [1.1470070927586016]
We deal with a safe exploration problem in reinforcement learning under the existence of disturbance.
We propose a safe exploration method that uses partial prior knowledge of a controlled object and disturbance.
We illustrate the validity and effectiveness of the proposed method through numerical simulations of an inverted pendulum and a four-bar parallel link robot manipulator.
arXiv Detail & Related papers (2022-09-30T13:00:33Z) - Safe Reinforcement Learning with Contrastive Risk Prediction [35.80144544954927]
We propose a risk preventive training method for safe RL, which learns a statistical contrastive classifier to predict the probability of a state-action pair leading to unsafe states.
Based on the predicted risk probabilities, we can collect risk preventive trajectories and reshape the reward function with risk penalties to induce safe RL policies.
The results show the proposed approach has comparable performance with the state-of-the-art model-based methods and outperforms conventional model-free safe RL approaches.
arXiv Detail & Related papers (2022-09-10T18:54:38Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z) - Lyapunov-based uncertainty-aware safe reinforcement learning [0.0]
InReinforcement learning (RL) has shown a promising performance in learning optimal policies for a variety of sequential decision-making tasks.
In many real-world RL problems, besides optimizing the main objectives, the agent is expected to satisfy a certain level of safety.
We propose a Lyapunov-based uncertainty-aware safe RL model to address these limitations.
arXiv Detail & Related papers (2021-07-29T13:08:15Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and
Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.