A computationally lightweight safe learning algorithm
- URL: http://arxiv.org/abs/2309.03672v1
- Date: Thu, 7 Sep 2023 12:21:22 GMT
- Title: A computationally lightweight safe learning algorithm
- Authors: Dominik Baumann and Krzysztof Kowalczyk and Koen Tiels and Pawe{\l}
Wachel
- Abstract summary: We propose a safe learning algorithm that provides probabilistic safety guarantees but leverages the Nadaraya-Watson estimator.
We provide theoretical guarantees for the estimates, embed them into a safe learning algorithm, and show numerical experiments on a simulated seven-degrees-of-freedom robot manipulator.
- Score: 1.9295598343317182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety is an essential asset when learning control policies for physical
systems, as violating safety constraints during training can lead to expensive
hardware damage. In response to this need, the field of safe learning has
emerged with algorithms that can provide probabilistic safety guarantees
without knowledge of the underlying system dynamics. Those algorithms often
rely on Gaussian process inference. Unfortunately, Gaussian process inference
scales cubically with the number of data points, limiting applicability to
high-dimensional and embedded systems. In this paper, we propose a safe
learning algorithm that provides probabilistic safety guarantees but leverages
the Nadaraya-Watson estimator instead of Gaussian processes. For the
Nadaraya-Watson estimator, we can reach logarithmic scaling with the number of
data points. We provide theoretical guarantees for the estimates, embed them
into a safe learning algorithm, and show numerical experiments on a simulated
seven-degrees-of-freedom robot manipulator.
Related papers
- Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time [0.6554326244334868]
We present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint.
We show that the learned policy is safe with high confidence.
We also demonstrate that efficient exploration can be achieved by defining a subset of the state-space called proxy set.
arXiv Detail & Related papers (2024-03-23T20:22:30Z) - Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS)
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noises.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
arXiv Detail & Related papers (2023-10-09T03:55:09Z) - The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural
Networks [94.63547069706459]
#DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - ProBF: Learning Probabilistic Safety Certificates with Barrier Functions [31.203344483485843]
The control barrier function is a useful tool to guarantee safety if we have access to the ground-truth system dynamics.
In practice, we have inaccurate knowledge of the system dynamics, which can lead to unsafe behaviors.
We show the efficacy of this method through experiments on Segway and Quadrotor simulations.
arXiv Detail & Related papers (2021-12-22T20:18:18Z) - Learning to be safe, in finite time [4.189643331553922]
This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials.
We focus on the canonical multi-armed bandit problem and seek to study the exploration-preservation trade-off intrinsic within safe learning.
arXiv Detail & Related papers (2020-10-01T14:03:34Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and
Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z) - Statistical Limits of Supervised Quantum Learning [90.0289160657379]
We show that if the bound on the accuracy is taken into account, quantum machine learning algorithms for supervised learning cannot achieve polylogarithmic runtimes in the input dimension.
We conclude that, when no further assumptions on the problem are made, quantum machine learning algorithms for supervised learning can have at most speedups over efficient classical algorithms.
arXiv Detail & Related papers (2020-01-28T17:35:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.