Stacked Universal Successor Feature Approximators for Safety in Reinforcement Learning
- URL: http://arxiv.org/abs/2409.04641v1
- Date: Fri, 6 Sep 2024 22:20:07 GMT
- Title: Stacked Universal Successor Feature Approximators for Safety in Reinforcement Learning
- Authors: Ian Cannon, Washington Garcia, Thomas Gresavage, Joseph Saurine, Ian Leong, Jared Culbertson,
- Abstract summary: We investigate the utility of a stacked, continuous-control variation of universal successor feature approximation (USFA) adapted for soft actor-critic (SAC)
Our method improves performance on secondary objectives compared to SAC baselines using an intervening secondary controller such as a runtime assurance (RTA) controller.
- Score: 1.2534672170380357
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Real-world problems often involve complex objective structures that resist distillation into reinforcement learning environments with a single objective. Operation costs must be balanced with multi-dimensional task performance and end-states' effects on future availability, all while ensuring safety for other agents in the environment and the reinforcement learning agent itself. System redundancy through secondary backup controllers has proven to be an effective method to ensure safety in real-world applications where the risk of violating constraints is extremely high. In this work, we investigate the utility of a stacked, continuous-control variation of universal successor feature approximation (USFA) adapted for soft actor-critic (SAC) and coupled with a suite of secondary safety controllers, which we call stacked USFA for safety (SUSFAS). Our method improves performance on secondary objectives compared to SAC baselines using an intervening secondary controller such as a runtime assurance (RTA) controller.
Related papers
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z) - Leveraging Approximate Model-based Shielding for Probabilistic Safety
Guarantees in Continuous Environments [63.053364805943026]
We extend the approximate model-based shielding framework to the continuous setting.
In particular we use Safety Gym as our test-bed, allowing for a more direct comparison of AMBS with popular constrained RL algorithms.
arXiv Detail & Related papers (2024-02-01T17:55:08Z) - Safe Deep Policy Adaptation [7.2747306035142225]
Policy adaptation based on reinforcement learning (RL) offers versatility and generalizability but presents safety and robustness challenges.
We propose SafeDPA, a novel RL and control framework that simultaneously tackles the problems of policy adaptation and safe reinforcement learning.
We provide theoretical safety guarantees of SafeDPA and show the robustness of SafeDPA against learning errors and extra perturbations.
arXiv Detail & Related papers (2023-10-08T00:32:59Z) - Searching for Optimal Runtime Assurance via Reachability and
Reinforcement Learning [2.422636931175853]
runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup controller.
Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations.
In this paper, we formulate the optimal RTA design problem and present a new approach for solving it.
arXiv Detail & Related papers (2023-10-06T14:45:57Z) - Adaptive Aggregation for Safety-Critical Control [3.1692938090731584]
We propose an adaptive aggregation framework for safety-critical control.
Our algorithm can achieve fewer safety violations while showing better data efficiency compared with several baselines.
arXiv Detail & Related papers (2023-02-07T16:53:33Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical
Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z) - ISAACS: Iterative Soft Adversarial Actor-Critic for Safety [0.9217021281095907]
This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems.
A safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error.
While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter.
arXiv Detail & Related papers (2022-12-06T18:53:34Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z) - Safe Model-Based Reinforcement Learning Using Robust Control Barrier
Functions [43.713259595810854]
An increasingly common approach to address safety involves the addition of a safety layer that projects the RL actions onto a safe set of actions.
In this paper, we frame safety as a differentiable robust-control-barrier-function layer in a model-based RL framework.
arXiv Detail & Related papers (2021-10-11T17:00:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.