Safe Guaranteed Exploration for Non-linear Systems
- URL: http://arxiv.org/abs/2402.06562v1
- Date: Fri, 9 Feb 2024 17:26:26 GMT
- Title: Safe Guaranteed Exploration for Non-linear Systems
- Authors: Manish Prajapat, Johannes Köhler, Matteo Turchetta, Andreas Krause, Melanie N. Zeilinger
- Abstract summary: We propose a novel safe guaranteed exploration framework using optimal control, which achieves first-of-its-kind results.
Based on this framework we propose an efficient algorithm, SageMPC, SAfe Guaranteed Exploration using Model Predictive Control.
We demonstrate safe efficient exploration in challenging unknown environments using SageMPC with a car model.
- Score: 44.2908666969021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safely exploring environments with a-priori unknown constraints is a
fundamental challenge that restricts the autonomy of robots. While safety is
paramount, guarantees on sufficient exploration are also crucial for ensuring
autonomous task completion. To address these challenges, we propose a novel
safe guaranteed exploration framework using optimal control, which achieves
first-of-its-kind results: guaranteed exploration for non-linear systems with
finite time sample complexity bounds, while being provably safe with
arbitrarily high probability. The framework is general and applicable to many
real-world scenarios with complex non-linear dynamics and unknown domains.
Based on this framework we propose an efficient algorithm, SageMPC, SAfe
Guaranteed Exploration using Model Predictive Control. SageMPC improves
efficiency by incorporating three techniques: i) exploiting a Lipschitz bound,
ii) goal-directed exploration, and iii) receding horizon style re-planning, all
while maintaining the desired sample complexity, safety and exploration
guarantees of the framework. Lastly, we demonstrate safe efficient exploration
in challenging unknown environments using SageMPC with a car model.
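The three techniques listed in the abstract can be pictured with a toy example. The following is a minimal, hypothetical sketch (not the authors' SageMPC implementation): an agent measures an unknown constraint function, uses an assumed Lipschitz constant to bound it pessimistically between samples, plans greedily toward the most uncertain point it can certify as safe, and replans after every executed step. The constant L_Q, the toy point dynamics, and all function names are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' SageMPC code).
import numpy as np

L_Q = 1.0             # assumed Lipschitz constant of the unknown constraint q(x)
SAFE_THRESHOLD = 0.0  # q(x) >= 0 means x is safe

def q_true(x):
    # Constraint unknown to the agent: signed distance to a disk obstacle (toy example).
    return np.linalg.norm(x - np.array([2.0, 2.0])) - 1.0

def pessimistic_q(x, samples):
    # Lipschitz lower bound from past measurements: q(x) >= q(x_i) - L_Q * ||x - x_i||.
    return max(q_s - L_Q * np.linalg.norm(x - x_s) for x_s, q_s in samples)

def uncertainty(x, samples):
    # Distance to the nearest measured point, used as a crude information measure.
    return min(np.linalg.norm(x - x_s) for x_s, _ in samples)

def plan(start, candidates, samples, horizon=5):
    # Goal-directed greedy planner: head toward the least-explored candidate whose
    # pessimistic constraint value is still certified safe, one unit step at a time.
    goals = [c for c in candidates if pessimistic_q(c, samples) >= SAFE_THRESHOLD]
    goal = max(goals, key=lambda c: uncertainty(c, samples), default=start)
    path, x = [], start.copy()
    for _ in range(horizon):
        step = goal - x
        if np.linalg.norm(step) < 1e-6:
            break
        x = x + step / max(np.linalg.norm(step), 1.0)    # unit-norm step (toy dynamics)
        if pessimistic_q(x, samples) < SAFE_THRESHOLD:   # never leave the certified safe set
            break
        path.append(x.copy())
    return path

x = np.array([0.0, 0.0])
samples = [(x.copy(), q_true(x))]
grid = [np.array([i, j], dtype=float) for i in range(5) for j in range(5)]
for _ in range(20):   # receding-horizon loop: replan, execute the first step, measure
    path = plan(x, grid, samples)
    if not path:
        break
    x = path[0]
    samples.append((x.copy(), q_true(x)))
print(f"visited {len(samples)} safe states, final position {x}")
```

A full treatment would replace the greedy planner with a constrained optimal control problem and handle noisy measurements, both of which the sketch omits.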
Related papers
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy; a rough sketch of the basic CBF safety-filter idea appears after this list.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions [35.9713619595494]
Reinforcement Learning and continuous nonlinear control have been successfully deployed in multiple domains of complicated sequential decision-making tasks.
Given the exploration nature of the learning process and the presence of model uncertainty, it is challenging to apply them to safety-critical control tasks.
We propose a provably efficient episodic safe learning framework for online control tasks.
arXiv Detail & Related papers (2022-07-29T00:54:35Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violations in safe reinforcement learning policy tasks; a toy log-barrier sketch appears after this list.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL [43.672794342894946]
Reward-free reinforcement learning (RF-RL) relies on random action-taking to explore the unknown environment without any reward feedback information.
It remains unclear how such a safe exploration requirement affects the sample complexity needed to achieve the desired optimality of the obtained policy in planning.
We propose a unified Safe reWard-frEe ExploraTion (SWEET) framework and develop two algorithms coined Tabular-SWEET and Low-rank-SWEET.
arXiv Detail & Related papers (2022-06-28T15:00:45Z)
- Provably Safe PAC-MDP Exploration Using Analogies [87.41775218021044]
A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration and safety.
We propose Analogous Safe-state Exploration (ASE), an algorithm for provably safe exploration in MDPs with unknown dynamics.
Our method exploits analogies between state-action pairs to safely learn a near-optimal policy in a PAC-MDP sense.
arXiv Detail & Related papers (2020-07-07T15:50:50Z)
- Verifiably Safe Exploration for End-to-End Reinforcement Learning [17.401496872603943]
This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs.
It is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints.
arXiv Detail & Related papers (2020-07-02T16:12:20Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Online Mapping and Motion Planning under Uncertainty for Safe Navigation in Unknown Environments [3.2296078260106174]
This manuscript proposes an uncertainty-based framework for mapping and planning feasible motions online with probabilistic safety guarantees.
The proposed approach handles the motion, probabilistic safety, and online computation constraints by: (i) mapping the surroundings to build an uncertainty-aware representation of the environment, and (ii) iteratively (re)planning trajectories to goals that are kinodynamically feasible and probabilistically safe using a multi-layered sampling-based planner in the belief space.
arXiv Detail & Related papers (2020-04-26T08:53:37Z)
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
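Two of the entries above (Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions, and Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions) build on control barrier functions (CBFs). The sketch below is a rough illustration of the basic CBF safety-filter idea only, not of the model-uncertainty-aware reformulations those papers propose: a naive go-to-goal controller for a single-integrator robot is filtered through a one-constraint CBF-QP, for which the closed-form solution is used; the obstacle, the gain ALPHA, and the dynamics are assumptions made for illustration.

```python
# Rough CBF safety-filter illustration (assumed setup, not taken from the papers above).
import numpy as np

OBSTACLE = np.array([1.5, 0.0])   # disk obstacle centre (assumed)
RADIUS, ALPHA = 0.5, 2.0          # obstacle radius and CBF class-K gain (assumed)

def h(x):
    # Barrier function: positive outside the obstacle, zero on its boundary.
    return np.dot(x - OBSTACLE, x - OBSTACLE) - RADIUS ** 2

def grad_h(x):
    return 2.0 * (x - OBSTACLE)

def cbf_filter(x, u_nom):
    # CBF-QP with a single affine constraint grad_h(x) @ u >= -ALPHA * h(x);
    # for this special case the QP reduces to the closed-form projection below.
    g = grad_h(x)
    slack = g @ u_nom + ALPHA * h(x)
    if slack >= 0.0:                      # nominal input already satisfies the CBF condition
        return u_nom
    return u_nom - (slack / (g @ g)) * g  # minimal correction onto the constraint boundary

x, goal, dt = np.array([0.0, 0.0]), np.array([3.0, 0.0]), 0.05
for _ in range(200):
    u_nom = goal - x                      # naive go-to-goal controller
    u = cbf_filter(x, u_nom)
    x = x + dt * u                        # single-integrator dynamics x_dot = u
print(f"final position {x}, barrier value h(x) = {h(x):.3f}")
```

Because the nominal controller drives the robot head-on at the obstacle, the filter only brakes at the boundary rather than steering around it; the point of the sketch is that the barrier value h(x) never becomes negative.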
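For the log-barrier entry above (Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning), the following toy sketch illustrates the general idea of descending a log-barrier surrogate with a step size capped so that every iterate stays strictly feasible; the objective, constraint, and step-size rule are assumptions chosen for illustration and are not the LBSGD algorithm itself.

```python
# Toy log-barrier descent with a feasibility-preserving step size (assumed setup).
import numpy as np

def f(x):       return (x[0] - 2.0) ** 2 + x[1] ** 2   # objective (toy, assumed)
def g(x):       return x[0] - 1.0                      # safety constraint g(x) <= 0 (toy, assumed)
def grad_f(x):  return np.array([2.0 * (x[0] - 2.0), 2.0 * x[1]])
def grad_g(x):  return np.array([1.0, 0.0])

def barrier_grad(x, eta):
    # Gradient of the log-barrier surrogate  f(x) - eta * log(-g(x)).
    return grad_f(x) + eta * grad_g(x) / (-g(x))

def safe_barrier_step(x, eta, base_lr=0.1):
    d = barrier_grad(x, eta)
    # Step-size rule (assumed, in the spirit of a "carefully chosen step size"):
    # never consume more than half the remaining constraint margin in one step.
    margin = -g(x)                          # distance to the constraint boundary
    erosion = max(grad_g(x) @ (-d), 0.0)    # rate at which the step eats the margin
    lr = base_lr if erosion == 0.0 else min(base_lr, 0.5 * margin / erosion)
    return x - lr * d

x, eta = np.array([0.0, 0.5]), 0.1
for _ in range(200):
    x = safe_barrier_step(x, eta)
    assert g(x) < 0.0                       # no iterate ever violates the constraint
print(f"x = {x}, f(x) = {f(x):.4f}, constraint margin = {-g(x):.4f}")
```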
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.