Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning
- URL: http://arxiv.org/abs/2409.12045v2
- Date: Mon, 23 Sep 2024 12:42:32 GMT
- Title: Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning
- Authors: Jonas Günster, Puze Liu, Jan Peters, Davide Tateo
- Abstract summary: Safety is one of the key issues preventing the deployment of reinforcement learning techniques in real-world robots.
In this paper, we bridge the gap by extending the safe exploration method, ATACOM, with learnable constraints.
- Score: 17.856459823003277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety is one of the key issues preventing the deployment of reinforcement learning techniques in real-world robots. While most approaches in the Safe Reinforcement Learning area do not require prior knowledge of constraints and robot kinematics and rely solely on data, it is often difficult to deploy them in complex real-world settings. Instead, model-based approaches that incorporate prior knowledge of the constraints and dynamics into the learning framework have proven capable of deploying the learning algorithm directly on the real robot. Unfortunately, while an approximated model of the robot dynamics is often available, the safety constraints are task-specific and hard to obtain: they may be too complicated to encode analytically, too expensive to compute, or it may be difficult to envision a priori the long-term safety requirements. In this paper, we bridge this gap by extending the safe exploration method, ATACOM, with learnable constraints, with a particular focus on ensuring long-term safety and handling of uncertainty. Our approach is competitive or superior to state-of-the-art methods in final performance while maintaining safer behavior during training.
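For intuition, below is a minimal sketch of the kind of tangent-space action mapping that ATACOM builds on, assuming the safety constraints have already been reduced to an equality form c(q) = 0 (e.g., via slack variables). The function names `c_fn` and `J_fn` and the correction gain are illustrative assumptions, not the paper's API.

```python
import numpy as np

def null_space_basis(J, tol=1e-10):
    """Orthonormal basis of the null space of J, computed via SVD."""
    _, s, vh = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    return vh[rank:].T  # columns span null(J)

def safe_velocity(q, alpha, c_fn, J_fn, gain=1.0):
    """Map an unconstrained RL action `alpha` (dimension n - k) to a
    velocity that moves along the constraint manifold c(q) = 0 and is
    pulled back toward it when the constraint value drifts."""
    c = c_fn(q)                          # constraint values, shape (k,)
    J = J_fn(q)                          # constraint Jacobian, shape (k, n)
    N = null_space_basis(J)              # tangent-space basis, shape (n, n - k)
    correction = np.linalg.pinv(J) @ c   # error term pulling back to c(q) = 0
    return N @ alpha - gain * correction
```

Because every agent output is mapped into the tangent space of the constraint manifold (plus an error-correction term), exploration stays near the safe set regardless of what the RL policy commands.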
Related papers
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
- Learning Control Barrier Functions and their application in Reinforcement Learning: A Survey [11.180978323594822]
Reinforcement learning is a powerful technique for developing new robot behaviors.
Safe reinforcement learning aims to incorporate safety considerations, enabling faster transfer to real robots and facilitating lifelong learning.
One promising approach within safe reinforcement learning is the use of control barrier functions.
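As a concrete illustration of the control-barrier-function idea, here is a hedged sketch of a single-constraint CBF safety filter for control-affine dynamics x_dot = f(x) + g(x) u; the closed-form projection below solves the standard one-constraint CBF quadratic program, and all function names are assumptions for the example.

```python
import numpy as np

def cbf_filter(u_nom, x, f, g, h, grad_h, alpha=1.0):
    """Minimally modify u_nom so that dh/dt + alpha * h(x) >= 0 holds
    for control-affine dynamics x_dot = f(x) + g(x) @ u."""
    Lf = grad_h(x) @ f(x)        # Lie derivative of h along f
    Lg = grad_h(x) @ g(x)        # Lie derivative of h along g, shape (m,)
    residual = Lf + Lg @ u_nom + alpha * h(x)
    if residual >= 0:
        return u_nom             # nominal action already satisfies the CBF
    # closed-form projection onto {u : Lf + Lg @ u + alpha * h(x) >= 0}
    return u_nom - residual * Lg / (Lg @ Lg + 1e-12)
```

The filter leaves safe nominal actions untouched and otherwise makes the smallest correction that restores the barrier condition.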
arXiv Detail & Related papers (2024-04-22T22:52:14Z)
- Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications [21.98309272057848]
We show how we can impose complex safety constraints on learning-based robotics systems in a principled manner.
Our approach is based on the concept of the Constraint Manifold, representing the set of safe robot configurations.
We demonstrate the method's effectiveness in a real-world Robot Air Hockey task.
arXiv Detail & Related papers (2024-04-13T20:55:15Z)
- Evaluation of Safety Constraints in Autonomous Navigation with Deep Reinforcement Learning [62.997667081978825]
We compare two learnable navigation policies: safe and unsafe.
The safe policy takes the constraints into account, while the other does not.
We show that the safe policy generates trajectories with more clearance (distance to the obstacles) and causes fewer collisions during training, without sacrificing overall performance.
arXiv Detail & Related papers (2023-07-27T01:04:57Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this area from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
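To illustrate the projection half of such a joint method, here is a hypothetical sketch of an unrolled safety projection: the actor's proposed action is adjusted by a few gradient steps on a learned cost critic until the predicted cost falls within the safety budget. The critic `q_cost`, the step count, and the learning rate are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def project_action(state, action, q_cost, threshold=0.0, steps=5, lr=0.1):
    """Unroll a few gradient steps on a learned cost critic so the final
    action is predicted to satisfy the state-wise safety budget."""
    a = action.clone().requires_grad_(True)
    for _ in range(steps):
        cost = q_cost(state, a).sum()    # scalar predicted cost
        if cost.item() <= threshold:     # predicted cost within budget
            break
        (grad,) = torch.autograd.grad(cost, a)
        a = (a - lr * grad).detach().requires_grad_(True)
    return a.detach()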
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violations in policy optimization tasks in safe reinforcement learning.
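A minimal sketch of a log-barrier gradient step for min f(x) subject to g(x) <= 0 may make this concrete. The conservative step-size cap below is a simplified stand-in for the paper's smoothness-based rule, and all names are illustrative.

```python
import numpy as np

def log_barrier_step(x, grad_f, g, grad_g, eta=0.1, lr_max=0.1):
    """One gradient step on the barrier B(x) = f(x) - eta * log(-g(x)),
    with the step size capped so the iterate stays strictly feasible."""
    slack = -g(x)                              # distance to the boundary, > 0
    d = grad_f(x) + eta * grad_g(x) / slack    # gradient of the barrier
    # crude cap: never consume more than half the remaining slack
    lr = min(lr_max, 0.5 * slack /
             (np.linalg.norm(grad_g(x)) * np.linalg.norm(d) + 1e-12))
    return x - lr * d
```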
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- An Empirical Analysis of the Use of Real-Time Reachability for the Safety Assurance of Autonomous Vehicles [7.1169864450668845]
We propose using a real-time reachability algorithm to implement the simplex architecture and assure the safety of a 1/10-scale open-source autonomous vehicle platform.
In our approach, the need to analyze an underlying controller is abstracted away, instead focusing on the effects of the controller's decisions on the system's future states.
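A hedged sketch of the simplex switching logic this describes: the learned controller's command is accepted only if a short-horizon reachable-set computation keeps the vehicle clear of the unsafe set. The `reach_set` and `unsafe_set` interfaces are assumptions for the example.

```python
def simplex_step(state, advanced_controller, safety_controller,
                 reach_set, unsafe_set, horizon=1.0):
    """Run the unverified (learned) controller only while the predicted
    reachable set stays clear of the unsafe set; otherwise fall back."""
    u = advanced_controller(state)
    if reach_set(state, u, horizon).intersects(unsafe_set):
        u = safety_controller(state)  # verified fallback controller
    return u
```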
arXiv Detail & Related papers (2022-05-03T11:12:29Z)
- Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning [3.9258421820410225]
We review the recent advances made in using machine learning to achieve safe decision-making under uncertainty.
Our review includes learning-based control approaches that safely improve performance by learning the uncertain dynamics.
We highlight some of the open challenges that will drive the field of robot learning in the coming years.
arXiv Detail & Related papers (2021-08-13T14:22:02Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
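One common way to use such a learned safety critic at deployment is as an action filter, sketched here under assumed interfaces (`policy.sample`, and `q_safe` returning an estimated failure probability):

```python
import torch

def safe_action(state, policy, q_safe, eps=0.1, n_samples=32):
    """Sample candidate actions and return the first one whose estimated
    failure probability is below eps; otherwise take the least risky.
    Assumes a batched state of shape (1, state_dim)."""
    actions = policy.sample(state, n_samples)            # (n, action_dim)
    risk = q_safe(state.expand(n_samples, -1), actions)  # (n, 1) in [0, 1]
    mask = risk.squeeze(-1) <= eps
    if mask.any():
        return actions[mask][0]
    return actions[risk.squeeze(-1).argmin()]
```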
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
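To make the restrained improvement step concrete, here is a small illustrative sketch (not the paper's exact algorithm): greedy improvement over sampled candidate actions, restricted to those that keep a learned Lyapunov function from increasing.

```python
def restrained_improvement(state, candidates, q_value, V, dynamics, eps=0.0):
    """Greedy policy improvement restricted to actions that do not
    increase the Lyapunov function (a sampled surrogate condition)."""
    safe = [a for a in candidates if V(dynamics(state, a)) <= V(state) + eps]
    if not safe:  # no candidate certified: fall back to the safest one
        return min(candidates, key=lambda a: V(dynamics(state, a)))
    return max(safe, key=lambda a: q_value(state, a))
```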
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.