GoSafe: Globally Optimal Safe Robot Learning
- URL: http://arxiv.org/abs/2105.13281v1
- Date: Thu, 27 May 2021 16:27:47 GMT
- Title: GoSafe: Globally Optimal Safe Robot Learning
- Authors: Dominik Baumann, Alonso Marco, Matteo Turchetta, and Sebastian Trimpe
- Abstract summary: SafeOpt is an efficient Bayesian optimization algorithm that can learn policies while guaranteeing safety with high probability.
We extend this method by exploring outside the initial safe area while still guaranteeing safety with high probability.
We derive conditions for guaranteed convergence to the global optimum and validate GoSafe in hardware experiments.
- Score: 11.77348161331335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning policies for robotic systems from data, safety is a major
concern, as violation of safety constraints may cause hardware damage. SafeOpt
is an efficient Bayesian optimization (BO) algorithm that can learn policies
while guaranteeing safety with high probability. However, its search space is
limited to an initially given safe region. We extend this method by exploring
outside the initial safe area while still guaranteeing safety with high
probability. This is achieved by learning a set of initial conditions from
which we can recover safely using a learned backup controller in case of a
potential failure. We derive conditions for guaranteed convergence to the
global optimum and validate GoSafe in hardware experiments.
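For intuition, the safe exploration step underlying SafeOpt (and hence GoSafe) can be sketched as follows: a Gaussian process models the unknown function, and only parameters whose pessimistic safety estimate clears the threshold are eligible for evaluation. This is an illustrative sketch, not the authors' implementation; the scikit-learn surrogate and all names are assumptions.

```python
# Minimal sketch of a SafeOpt-style safe Bayesian optimization step.
# Illustrative only; assumes a scikit-learn GP as the surrogate model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def safe_bo_step(gp: GaussianProcessRegressor, candidates: np.ndarray,
                 safety_threshold: float, beta: float = 2.0) -> np.ndarray:
    """Select the next policy parameters to evaluate, restricted to the safe set."""
    mean, std = gp.predict(candidates, return_std=True)
    lower, upper = mean - beta * std, mean + beta * std
    safe = lower >= safety_threshold      # high-probability safe set
    if not safe.any():
        raise RuntimeError("Safe set is empty; seed with a known safe policy.")
    safe_idx = np.flatnonzero(safe)
    # Among safe candidates, pick the one with the best optimistic value.
    return candidates[safe_idx[np.argmax(upper[safe_idx])]]
```

GoSafe's contribution is to also evaluate parameters outside this set: it learns the initial conditions from which a backup controller can recover, so that excursions beyond the initial safe region remain safe with high probability.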
Related papers
- Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems [8.69908615905782]
Current safe exploration algorithms are inefficient and may even become infeasible in large, high-dimensional input spaces.
Existing high-dimensional constrained optimization methods neglect safety in the search process.
arXiv Detail & Related papers (2024-12-29T04:42:50Z)
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety in reinforcement learning (RL).
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
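The shield in such frameworks acts as a filter between agent and environment; a minimal sketch of the idea, where `is_permissible` and all other names are illustrative placeholders rather than the paper's construction:

```python
# Illustrative shield: overrides the RL agent's action whenever a
# permissibility check deems it unsafe. `is_permissible` stands in
# for the paper's shield construction.
def shielded_action(state, proposed_action, is_permissible, backup_policy):
    if is_permissible(state, proposed_action):
        return proposed_action       # action passes the shield unchanged
    return backup_policy(state)     # shield substitutes a safe action
```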
- Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark [12.660770759420286]
We present an environment suite called Safety-Gymnasium, which encompasses safety-critical tasks in both single and multi-agent scenarios.
We offer a library of algorithms named Safe Policy Optimization (SafePO), comprising 16 state-of-the-art SafeRL algorithms.
arXiv Detail & Related papers (2023-10-19T08:19:28Z)
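For context, Safety-Gymnasium exposes a Gymnasium-style interface whose step function additionally returns a per-step safety cost; a rough usage sketch (environment name and call signatures assumed from the library's documented API):

```python
# Rough interaction loop with a Safety-Gymnasium environment.
# API details assumed from the library's Gymnasium-style interface.
import safety_gymnasium

env = safety_gymnasium.make("SafetyPointGoal1-v0")
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()  # random policy as a placeholder
    # Unlike plain Gymnasium, step() also returns a safety cost signal.
    obs, reward, cost, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```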
- Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning [2.422636931175853]
A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup controller.
Existing RTA design strategies are well known to be overly conservative and, in principle, can lead to safety violations.
In this paper, we formulate the optimal RTA design problem and present a new approach for solving it.
arXiv Detail & Related papers (2023-10-06T14:45:57Z)
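At its core, an RTA system is a runtime switch: the experimental controller stays engaged only while a safety monitor predicts that the backup controller can still recover. A minimal sketch with illustrative names (not the paper's formulation):

```python
# Illustrative runtime assurance (RTA) switch. `backup_can_recover`
# stands in for a reachability-based safety monitor; all names are
# placeholders, not the paper's design.
def rta_control(state, experimental_controller, backup_controller,
                predict_next, backup_can_recover):
    action = experimental_controller(state)
    if backup_can_recover(predict_next(state, action)):
        return action                # untrusted controller stays engaged
    return backup_controller(state)  # fall back to the trusted controller
```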
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
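A standard CBF safety filter minimally modifies a nominal input so that the barrier condition holds; with a single constraint, the underlying quadratic program has a closed form. A sketch under assumed Lie-derivative inputs (not the paper's uncertainty-aware reformulation; the `margin` term crudely stands in for a model-uncertainty bound):

```python
# Minimal CBF safety filter: projects a nominal input onto the half-space
#   Lf_h + Lg_h @ u + alpha * h >= margin,
# where `margin` is a crude stand-in for a model-uncertainty term.
import numpy as np

def cbf_filter(u_nom: np.ndarray, h: float, Lf_h: float,
               Lg_h: np.ndarray, alpha: float, margin: float = 0.0) -> np.ndarray:
    slack = Lf_h + Lg_h @ u_nom + alpha * h - margin
    if slack >= 0.0:
        return u_nom                  # nominal input already satisfies the CBF
    # Closed-form projection onto the constraint boundary (single-constraint QP).
    # Pointwise feasibility here requires Lg_h != 0.
    return u_nom - (slack / (Lg_h @ Lg_h)) * Lg_h
```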
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems [75.22958991597069]
This work proposes GoSafeOpt as the first algorithm that can safely discover globally optimal policies for high-dimensional systems.
We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods on a robot arm.
arXiv Detail & Related papers (2022-01-24T10:05:44Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
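One way to read the mechanism: the learned critic gates exploration, and an action is executed only if its estimated failure risk is below a budget. A sketch with placeholder names (not the paper's algorithm):

```python
# Illustrative gating with a conservative safety critic: sample actions
# from the policy and execute one whose estimated failure probability is
# below a budget `eps`. `safety_critic` is a placeholder for the learned critic.
import numpy as np

def safe_act(state, policy, safety_critic, backup_action, eps=0.05, n=32):
    candidates = [policy(state) for _ in range(n)]
    risks = np.array([safety_critic(state, a) for a in candidates])
    if risks.min() > eps:             # no sampled action is safe enough
        return backup_action(state)
    return candidates[int(np.argmin(risks))]
```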
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
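The resulting safe set can be pictured as a level set of the learned safety probability: a state is safe if its probability of safe operation clears a confidence level. A sketch with placeholder names (not the paper's construction):

```python
# Sketch: reading a "safe set" off a learned safety-probability function.
# `p_safe` (the learned maximal probability of safe operation) and `delta`
# are illustrative placeholders.
import numpy as np

def safe_set_mask(states: np.ndarray, p_safe, delta: float = 0.05) -> np.ndarray:
    """Boolean mask of states with safe-operation probability at least 1 - delta."""
    return np.array([p_safe(x) >= 1.0 - delta for x in states])
```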
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.