Safe Active Dynamics Learning and Control: A Sequential
Exploration-Exploitation Framework
- URL: http://arxiv.org/abs/2008.11700v4
- Date: Wed, 16 Feb 2022 03:18:45 GMT
- Title: Safe Active Dynamics Learning and Control: A Sequential
Exploration-Exploitation Framework
- Authors: Thomas Lew, Apoorva Sharma, James Harrison, Andrew Bylard, Marco
Pavone
- Abstract summary: We propose a theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty.
Our framework guarantees the high-probability satisfaction of all constraints at all times jointly.
This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities.
- Score: 30.58186749790728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safe deployment of autonomous robots in diverse scenarios requires agents
that are capable of efficiently adapting to new environments while satisfying
constraints. In this work, we propose a practical and theoretically-justified
approach to maintaining safety in the presence of dynamics uncertainty. Our
approach leverages Bayesian meta-learning with last-layer adaptation. The
expressiveness of neural-network features trained offline, paired with
efficient last-layer online adaptation, enables the derivation of tight
confidence sets which contract around the true dynamics as the model adapts
online. We exploit these confidence sets to plan trajectories that guarantee
the safety of the system. Our approach handles problems with high dynamics
uncertainty, where reaching the goal safely is potentially initially
infeasible, by first exploring to gather data and reduce uncertainty,
before autonomously exploiting the acquired information to safely
perform the task. Under reasonable assumptions, we prove that our framework
guarantees the high-probability satisfaction of all constraints at all times
jointly, i.e. over the total task duration. This theoretical analysis also
motivates two regularizers of last-layer meta-learning models that improve
online adaptation capabilities as well as performance by reducing the size of
the confidence sets. We extensively demonstrate our approach in simulation and
on hardware.
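The last-layer adaptation at the core of this approach can be illustrated with Bayesian linear regression on frozen features. The sketch below is not the authors' implementation: the feature map phi, the prior and noise scales, and the confidence multiplier beta are illustrative placeholders standing in for the meta-trained network and the paper's calibrated confidence sets.

```python
import numpy as np

def phi(x):
    # Stand-in for neural-network features trained offline via meta-learning.
    return np.concatenate([x, np.sin(x), np.cos(x)])

class LastLayerPosterior:
    """Gaussian posterior over last-layer weights for a scalar output."""

    def __init__(self, feat_dim, prior_var=1.0, noise_var=0.01):
        self.P = np.eye(feat_dim) / prior_var   # posterior precision
        self.b = np.zeros(feat_dim)             # precision-weighted mean
        self.noise_var = noise_var

    def update(self, x, y):
        # Recursive Bayesian linear-regression update on one transition.
        f = phi(x)
        self.P += np.outer(f, f) / self.noise_var
        self.b += f * y / self.noise_var

    def predict(self, x, beta=2.0):
        # Mean prediction and a confidence half-width; the interval
        # contracts as more data arrives and the posterior concentrates.
        f = phi(x)
        cov = np.linalg.inv(self.P)
        mean = f @ (cov @ self.b)
        halfwidth = beta * np.sqrt(f @ cov @ f)
        return mean, halfwidth

# Toy usage: adapt online to a 1-D "dynamics" signal.
model = LastLayerPosterior(feat_dim=3)
for x in np.linspace(-1, 1, 20):
    model.update(np.array([x]), np.sin(2 * x))
print(model.predict(np.array([0.5])))
```

The half-width returned by predict plays the role of the confidence sets above: it shrinks as the precision matrix accumulates data, which is what allows planning to become less conservative over time.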
Related papers
- FOSP: Fine-tuning Offline Safe Policy through World Models [3.7971075341023526]
Model-based Reinforcement Learning (RL) has shown high training efficiency and the capability to handle high-dimensional tasks.
However, prior works still face safety challenges due to online exploration during real-world deployment.
In this paper, we aim to further enhance safety during the deployment stage for vision-based robotic tasks by fine-tuning an offline-trained policy.
arXiv Detail & Related papers (2024-07-06T03:22:57Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- Constrained Meta-Reinforcement Learning for Adaptable Safety Guarantee with Differentiable Convex Programming [4.825619788907192]
This paper studies the unique challenges of ensuring safety in non-stationary environments by solving constrained problems through the lens of the meta-learning approach (learning-to-learn).
We first employ successive convex-constrained policy updates across multiple tasks with differentiable convex programming, which allows meta-learning in constrained scenarios by enabling end-to-end differentiation.
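As a rough illustration of such a differentiable convex layer (not this paper's implementation), the sketch below uses the cvxpylayers library to project a proposed policy update onto linearized safety constraints and backpropagate through the solution; the problem shapes and the data u_prop, G, h are invented placeholders.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Differentiable projection of a proposed policy update onto linearized
# safety constraints; gradients flow through the argmin of the QP.
n_params, n_constr = 5, 3
u = cp.Variable(n_params)               # constrained update (decision var)
u_prop = cp.Parameter(n_params)         # proposed unconstrained update
G = cp.Parameter((n_constr, n_params))  # linearized safety constraints
h = cp.Parameter(n_constr)

problem = cp.Problem(cp.Minimize(cp.sum_squares(u - u_prop)),
                     [G @ u <= h])
layer = CvxpyLayer(problem, parameters=[u_prop, G, h], variables=[u])

u_prop_t = torch.randn(n_params, requires_grad=True)
G_t = torch.randn(n_constr, n_params)
h_t = torch.ones(n_constr)
(u_safe,) = layer(u_prop_t, G_t, h_t)
u_safe.sum().backward()                 # end-to-end differentiation
```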
arXiv Detail & Related papers (2023-12-15T21:55:43Z)
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm provides more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning [3.9821399546174825]
We introduce a deep reinforcement learning framework for safe decision making in uncertain environments.
We provide robustness guarantees for this framework by showing it is equivalent to a specific class of distributionally robust safe reinforcement learning problems.
In experiments on continuous control tasks with safety constraints, we demonstrate that our framework produces robust performance and safety at deployment time across a range of perturbed test environments.
arXiv Detail & Related papers (2023-01-30T00:37:06Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
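A minimal sketch of the safety-projection idea, assuming a differentiable cost critic qc (here a dummy stand-in, not the USL implementation): the action is gradient-corrected until its predicted cost falls below a budget.

```python
import torch

def project_action(qc, s, a, budget=0.0, lr=0.1, iters=20):
    # Gradient-correct the action until the predicted safety cost drops
    # below the budget (or the iteration limit is reached).
    a = a.clone().requires_grad_(True)
    for _ in range(iters):
        cost = qc(s, a)
        if cost.item() <= budget:
            break
        (grad,) = torch.autograd.grad(cost, a)
        a = (a - lr * grad).detach().requires_grad_(True)
    return a.detach()

# Toy usage with a dummy cost critic; a real critic would be a learned network.
qc = lambda s, a: (a ** 2).sum()
print(project_action(qc, s=torch.zeros(3), a=torch.ones(2), budget=0.1))
```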
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
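The safe-BO pattern being accelerated here can be sketched as follows; a fixed RBF kernel stands in for the meta-learned (F-PACOH-style) prior, and the objective, safety function, threshold, and confidence multiplier are all illustrative placeholders.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
objective = lambda x: -(x - 0.3) ** 2      # unknown reward (placeholder)
safety = lambda x: 1.0 - np.abs(x)         # safe iff safety(x) >= 0

X = np.array([[0.0]])                      # known-safe seed point
y_obj, y_saf = objective(X.ravel()), safety(X.ravel())

for t in range(15):
    # optimizer=None freezes the kernel hyperparameters, playing the role
    # of a prior fixed ahead of time (here hand-picked, not meta-learned).
    gp_obj = GaussianProcessRegressor(RBF(0.2), alpha=1e-6,
                                      optimizer=None).fit(X, y_obj)
    gp_saf = GaussianProcessRegressor(RBF(0.2), alpha=1e-6,
                                      optimizer=None).fit(X, y_saf)
    cand = rng.uniform(-2, 2, size=(256, 1))
    mu_s, sd_s = gp_saf.predict(cand, return_std=True)
    safe = mu_s - 2.0 * sd_s >= 0.0        # high-confidence safe candidates
    if not safe.any():
        continue
    mu_o, sd_o = gp_obj.predict(cand, return_std=True)
    ucb = mu_o + 2.0 * sd_o
    x_next = cand[safe][np.argmax(ucb[safe])]
    X = np.vstack([X, [x_next]])
    y_obj = np.append(y_obj, objective(x_next))
    y_saf = np.append(y_saf, safety(x_next))

print("best safe query:", X[np.argmax(y_obj)].item())
```

A sharper (meta-learned) prior shrinks the safety confidence bound faster, which is how better priors accelerate convergence of the safe-BO loop.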
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
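A minimal sketch of the kind of CBF-based safety filter being reformulated, assuming a toy single-integrator model and a hand-picked barrier (neither is from the paper): the nominal control is minimally modified so the barrier condition holds.

```python
import numpy as np
import cvxpy as cp

def safe_control(x, u_nom, alpha=1.0):
    # Toy single integrator x_dot = u with barrier h(x) = 1 - ||x||^2,
    # which keeps the state inside the unit disk.
    h = 1.0 - x @ x
    grad_h = -2.0 * x
    u = cp.Variable(len(u_nom))
    # Minimally modify u_nom subject to the CBF condition
    # grad_h . u >= -alpha * h(x).
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)),
                      [grad_h @ u >= -alpha * h])
    prob.solve()
    return u.value

x = np.array([0.8, 0.0])                  # state near the boundary
u_nom = np.array([1.0, 0.0])              # nominal control pushes outward
print(safe_control(x, u_nom))             # filtered control, roughly [0.225, 0]
```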
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach in minimizing constraint violations on policy optimization tasks in safe reinforcement learning.
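A minimal sketch of the log-barrier idea on an invented one-dimensional problem: descend on f(x) - eta * log(-c(x)) with a step size kept small enough that the iterate never leaves the feasible region. The objective, constraint, and constants are placeholders, not the paper's setup.

```python
# Toy problem: minimize f(x) = (x - 3)^2 subject to c(x) = x - 1 <= 0.
f = lambda x: (x - 3.0) ** 2
c = lambda x: x - 1.0

def barrier_grad(x, eta):
    # Gradient of the log-barrier surrogate f(x) - eta * log(-c(x)).
    return 2.0 * (x - 3.0) + eta / (-c(x))

x, eta = 0.0, 0.1                         # feasible start, barrier weight
for _ in range(200):
    g = barrier_grad(x, eta)
    # Cap the step so the update can never cross the constraint boundary:
    # |step * g| <= 0.5 * |c(x)| keeps the iterate strictly feasible.
    step = min(0.05, 0.5 * abs(c(x)) / (abs(g) + 1e-9))
    x -= step * g
print(x, f(x), c(x) <= 0)                 # approaches the boundary, stays feasible
```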
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- ProBF: Learning Probabilistic Safety Certificates with Barrier Functions [31.203344483485843]
The control barrier function is a useful tool to guarantee safety if we have access to the ground-truth system dynamics.
In practice, we have inaccurate knowledge of the system dynamics, which can lead to unsafe behaviors.
We show the efficacy of this method through experiments on Segway and Quadrotor simulations.
arXiv Detail & Related papers (2021-12-22T20:18:18Z)
- Safely Learning Dynamical Systems from Short Trajectories [12.184674552836414]
A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety.
We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory.
We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one, or certifies that safe learning is impossible.
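A hedged sketch of this style of check, assuming linear dynamics and a simple entrywise uncertainty box (the paper's actual LP formulation and uncertainty set differ): an initialization is certified safe if the worst-case next state over all dynamics consistent with the box stays in the safe set.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_coord(x0, row_lo, row_hi):
    # Maximize |row . x0| over the box row_lo <= row <= row_hi
    # (one LP per sign; linprog minimizes, so negate for the max).
    bounds = list(zip(row_lo, row_hi))
    hi = -linprog(-x0, bounds=bounds).fun   # max  row . x0
    lo = linprog(x0, bounds=bounds).fun     # min  row . x0
    return max(abs(hi), abs(lo))

def is_safe_init(x0, A_lo, A_hi):
    # Certify x0 if every coordinate of A @ x0 stays in [-1, 1] for all
    # consistent A, i.e. the safe set ||x||_inf <= 1 is never left.
    return all(worst_case_coord(x0, A_lo[i], A_hi[i]) <= 1.0
               for i in range(A_lo.shape[0]))

A_lo = np.array([[0.4, -0.2], [-0.2, 0.4]])   # entrywise uncertainty box
A_hi = np.array([[0.6,  0.2], [ 0.2, 0.6]])
print(is_safe_init(np.array([1.0, 0.0]), A_lo, A_hi))   # True
```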
arXiv Detail & Related papers (2020-11-24T18:06:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.