Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems
- URL: http://arxiv.org/abs/2412.20350v1
- Date: Sun, 29 Dec 2024 04:42:50 GMT
- Title: Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems
- Authors: Yunyue Wei, Zeji Yi, Hongda Li, Saraswati Soedarmadji, Yanan Sui
- Abstract summary: Current safe exploration algorithms exhibit inefficiency and may even become infeasible with large high-dimensional input spaces.
Existing high-dimensional constrained optimization methods neglect safety in the search process.
- Score: 8.69908615905782
- License:
- Abstract: Learning to move is a primary goal for animals and robots, and ensuring safety is often important when optimizing control policies on embodied systems. For complex tasks such as human or humanoid motion control, the high-dimensional parameter space adds complexity to the safe optimization effort. Current safe exploration algorithms exhibit inefficiency and may even become infeasible with large high-dimensional input spaces. Furthermore, existing high-dimensional constrained optimization methods neglect safety in the search process. In this paper, we propose High-dimensional Safe Bayesian Optimization with local optimistic exploration (HdSafeBO), a novel approach designed to handle high-dimensional sampling problems under probabilistic safety constraints. We introduce a local optimistic strategy to efficiently and safely optimize the objective function, providing a probabilistic safety guarantee and a cumulative safety violation bound. Through the use of isometric embedding, HdSafeBO addresses problems ranging from a few hundred to several thousand dimensions while maintaining safety guarantees. To our knowledge, HdSafeBO is the first algorithm capable of optimizing the control of high-dimensional musculoskeletal systems with high safety probability. We also demonstrate the real-world applicability of HdSafeBO through its use in the safe online optimization of neural-stimulation-induced human motion control.
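The core mechanic shared by safe BO methods of this kind can be sketched as follows: a Gaussian-process surrogate models both the objective and the safety function, candidates are kept only when the safety GP's pessimistic lower confidence bound clears a threshold, and the next query maximizes an optimistic upper confidence bound on the objective among the remaining safe points. This is a minimal illustrative sketch of that generic pattern, not the paper's HdSafeBO (which adds local optimistic exploration and isometric embeddings); the function names `gp_posterior` and `safe_bo_step` and all parameter choices are hypothetical.

```python
import numpy as np

def rbf(A, B, ls=0.3):
    """Squared-exponential kernel between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """Exact GP posterior mean and std at test points Xs given data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(rbf(Xs, Xs)) - (v**2).sum(0), 1e-12, None)
    return mu, np.sqrt(var)

def safe_bo_step(X, f_obs, s_obs, candidates, safe_thresh=0.0, beta=2.0):
    """Pick the candidate with the highest objective UCB among points whose
    safety lower confidence bound exceeds safe_thresh; None if no point
    is certifiably safe."""
    mu_f, sd_f = gp_posterior(X, f_obs, candidates)
    mu_s, sd_s = gp_posterior(X, s_obs, candidates)
    safe = mu_s - beta * sd_s >= safe_thresh   # pessimistic safety check
    if not safe.any():
        return None                            # no provably safe candidate
    ucb = np.where(safe, mu_f + beta * sd_f, -np.inf)
    return candidates[np.argmax(ucb)]
```

In high dimensions the candidate set and the GP itself become the bottleneck, which is the regime the paper targets with its embedding-based approach; this sketch only shows the safety-constrained acquisition step.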
Related papers
- Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel [4.586346034304039]
TVSafeOpt is an algorithm for time-varying optimization problems with unknown reward and safety functions.
TVSafeOpt is capable of safely tracking a time-varying safe region without need for explicit change detection.
We show that TVSafeOpt compares favorably against SafeOpt on synthetic data, both regarding safety and optimality.
arXiv Detail & Related papers (2024-09-26T16:09:19Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach at minimizing constraint violations on policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems [75.22958991597069]
This work proposes GoSafeOpt as the first algorithm that can safely discover globally optimal policies for high-dimensional systems.
We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods on a robot arm.
arXiv Detail & Related papers (2022-01-24T10:05:44Z) - Safe Policy Optimization with Local Generalized Linear Function Approximations [17.84511819022308]
Existing safe exploration methods guarantee safety only under regularity assumptions.
We propose a novel algorithm, SPO-LF, that optimizes an agent's policy while learning the relation between locally available features obtained by sensors and the environmental reward/safety.
We experimentally show that our algorithm is 1) more efficient in terms of sample complexity and computational cost and 2) more applicable to large-scale problems than previous safe RL methods with theoretical guarantees.
arXiv Detail & Related papers (2021-11-09T00:47:50Z) - GoSafe: Globally Optimal Safe Robot Learning [11.77348161331335]
SafeOpt is an efficient Bayesian optimization algorithm that can learn policies while guaranteeing safety with high probability.
We extend this method by exploring outside the initial safe area while still guaranteeing safety with high probability.
We derive conditions for guaranteed convergence to the global optimum and validate GoSafe in hardware experiments.
arXiv Detail & Related papers (2021-05-27T16:27:47Z) - Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.