GoSafe: Globally Optimal Safe Robot Learning
- URL: http://arxiv.org/abs/2105.13281v1
- Date: Thu, 27 May 2021 16:27:47 GMT
- Title: GoSafe: Globally Optimal Safe Robot Learning
- Authors: Dominik Baumann, Alonso Marco, Matteo Turchetta, and Sebastian Trimpe
- Abstract summary: SafeOpt is an efficient Bayesian optimization algorithm that can learn policies while guaranteeing safety with high probability.
We extend this method by exploring outside the initial safe area while still guaranteeing safety with high probability.
We derive conditions for guaranteed convergence to the global optimum and validate GoSafe in hardware experiments.
- Score: 11.77348161331335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning policies for robotic systems from data, safety is a major
concern, as violation of safety constraints may cause hardware damage. SafeOpt
is an efficient Bayesian optimization (BO) algorithm that can learn policies
while guaranteeing safety with high probability. However, its search space is
limited to an initially given safe region. We extend this method by exploring
outside the initial safe area while still guaranteeing safety with high
probability. This is achieved by learning a set of initial conditions from
which we can recover safely using a learned backup controller in case of a
potential failure. We derive conditions for guaranteed convergence to the
global optimum and validate GoSafe in hardware experiments.
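For intuition, the safe exploration step underlying SafeOpt (and hence GoSafe) can be sketched as follows: a Gaussian process models the unknown function, and only parameters whose pessimistic safety estimate clears the threshold are eligible for evaluation. This is an illustrative sketch, not the authors' implementation; the scikit-learn surrogate and all names are assumptions.

```python
# Minimal sketch of a SafeOpt-style safe Bayesian optimization step.
# Illustrative only; assumes a scikit-learn GP as the surrogate model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def safe_bo_step(gp: GaussianProcessRegressor, candidates: np.ndarray,
                 safety_threshold: float, beta: float = 2.0) -> np.ndarray:
    """Select the next policy parameters to evaluate, restricted to the safe set."""
    mean, std = gp.predict(candidates, return_std=True)
    lower, upper = mean - beta * std, mean + beta * std
    safe = lower >= safety_threshold      # high-probability safe set
    if not safe.any():
        raise RuntimeError("Safe set is empty; seed with a known safe policy.")
    safe_idx = np.flatnonzero(safe)
    # Among safe candidates, pick the one with the best optimistic value.
    return candidates[safe_idx[np.argmax(upper[safe_idx])]]
```

GoSafe's contribution is to also evaluate parameters outside this set: it learns the initial conditions from which a backup controller can recover, so that excursions beyond the initial safe region remain safe with high probability.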
Related papers
- Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems [8.69908615905782]
Current safe exploration algorithms are inefficient and may even become infeasible in large, high-dimensional input spaces.
Existing high-dimensional constrained optimization methods neglect safety in the search process.
arXiv Detail & Related papers (2024-12-29T04:42:50Z)
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety in reinforcement learning (RL).
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
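The shield in such frameworks acts as a filter between agent and environment; a minimal sketch of the idea, where `is_permissible` and all other names are illustrative placeholders rather than the paper's construction:

```python
# Illustrative shield: overrides the RL agent's action whenever a
# permissibility check deems it unsafe. `is_permissible` stands in
# for the paper's shield construction.
def shielded_action(state, proposed_action, is_permissible, backup_policy):
    if is_permissible(state, proposed_action):
        return proposed_action       # action passes the shield unchanged
    return backup_policy(state)     # shield substitutes a safe action
```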
- Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark [12.660770759420286]
We present an environment suite called Safety-Gymnasium, which encompasses safety-critical tasks in both single and multi-agent scenarios.
We offer a library of algorithms named Safe Policy Optimization (SafePO), comprising 16 state-of-the-art SafeRL algorithms.
arXiv Detail & Related papers (2023-10-19T08:19:28Z)
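For context, Safety-Gymnasium exposes a Gymnasium-style interface whose step function additionally returns a per-step safety cost; a rough usage sketch (environment name and call signatures assumed from the library's documented API):

```python
# Rough interaction loop with a Safety-Gymnasium environment.
# API details assumed from the library's Gymnasium-style interface.
import safety_gymnasium

env = safety_gymnasium.make("SafetyPointGoal1-v0")
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()  # random policy as a placeholder
    # Unlike plain Gymnasium, step() also returns a safety cost signal.
    obs, reward, cost, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```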
- Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning [2.422636931175853]
A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup controller.
Existing RTA design strategies are well known to be overly conservative and, in principle, can lead to safety violations.
In this paper, we formulate the optimal RTA design problem and present a new approach for solving it.
arXiv Detail & Related papers (2023-10-06T14:45:57Z)
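At its core, an RTA system is a runtime switch: the experimental controller stays engaged only while a safety monitor predicts that the backup controller can still recover. A minimal sketch with illustrative names (not the paper's formulation):

```python
# Illustrative runtime assurance (RTA) switch. `backup_can_recover`
# stands in for a reachability-based safety monitor; all names are
# placeholders, not the paper's design.
def rta_control(state, experimental_controller, backup_controller,
                predict_next, backup_can_recover):
    action = experimental_controller(state)
    if backup_can_recover(predict_next(state, action)):
        return action                # untrusted controller stays engaged
    return backup_controller(state)  # fall back to the trusted controller
```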
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
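A standard CBF safety filter minimally modifies a nominal input so that the barrier condition holds; with a single constraint, the underlying quadratic program has a closed form. A sketch under assumed Lie-derivative inputs (not the paper's uncertainty-aware reformulation; the `margin` term crudely stands in for a model-uncertainty bound):

```python
# Minimal CBF safety filter: projects a nominal input onto the half-space
#   Lf_h + Lg_h @ u + alpha * h >= margin,
# where `margin` is a crude stand-in for a model-uncertainty term.
import numpy as np

def cbf_filter(u_nom: np.ndarray, h: float, Lf_h: float,
               Lg_h: np.ndarray, alpha: float, margin: float = 0.0) -> np.ndarray:
    slack = Lf_h + Lg_h @ u_nom + alpha * h - margin
    if slack >= 0.0:
        return u_nom                  # nominal input already satisfies the CBF
    # Closed-form projection onto the constraint boundary (single-constraint QP).
    # Pointwise feasibility here requires Lg_h != 0.
    return u_nom - (slack / (Lg_h @ Lg_h)) * Lg_h
```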
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems [75.22958991597069]
This work proposes GoSafeOpt as the first algorithm that can safely discover globally optimal policies for high-dimensional systems.
We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods on a robot arm.
arXiv Detail & Related papers (2022-01-24T10:05:44Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
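One way to read the mechanism: the learned critic gates exploration, and an action is executed only if its estimated failure risk is below a budget. A sketch with placeholder names (not the paper's algorithm):

```python
# Illustrative gating with a conservative safety critic: sample actions
# from the policy and execute one whose estimated failure probability is
# below a budget `eps`. `safety_critic` is a placeholder for the learned critic.
import numpy as np

def safe_act(state, policy, safety_critic, backup_action, eps=0.05, n=32):
    candidates = [policy(state) for _ in range(n)]
    risks = np.array([safety_critic(state, a) for a in candidates])
    if risks.min() > eps:             # no sampled action is safe enough
        return backup_action(state)
    return candidates[int(np.argmin(risks))]
```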
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
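The resulting safe set can be pictured as a level set of the learned safety probability: a state is safe if its probability of safe operation clears a confidence level. A sketch with placeholder names (not the paper's construction):

```python
# Sketch: reading a "safe set" off a learned safety-probability function.
# `p_safe` (the learned maximal probability of safe operation) and `delta`
# are illustrative placeholders.
import numpy as np

def safe_set_mask(states: np.ndarray, p_safe, delta: float = 0.05) -> np.ndarray:
    """Boolean mask of states with safe-operation probability at least 1 - delta."""
    return np.array([p_safe(x) >= 1.0 - delta for x in states])
```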
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.