MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance
- URL: http://arxiv.org/abs/2112.03575v1
- Date: Tue, 7 Dec 2021 08:57:35 GMT
- Title: MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance
- Authors: Michael Luo, Ashwin Balakrishna, Brijen Thananjeyan, Suraj Nair,
Julian Ibarz, Jie Tan, Chelsea Finn, Ion Stoica, Ken Goldberg
- Abstract summary: Recent work learns risk measures which measure the probability of violating constraints, which can then be used to enable safety.
We cast safe exploration as an offline meta-RL problem, where the objective is to leverage examples of safe and unsafe behavior across a range of environments.
We then propose MEta-learning for Safe Adaptation (MESA), an approach for meta-learning a risk measure for safe RL.
- Score: 73.3242641337305
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Safe exploration is critical for using reinforcement learning (RL) in
risk-sensitive environments. Recent work learns risk measures which measure the
probability of violating constraints, which can then be used to enable safety.
However, learning such risk measures requires significant interaction with the
environment, resulting in excessive constraint violations during learning.
Furthermore, these measures are not easily transferable to new environments. We
cast safe exploration as an offline meta-RL problem, where the objective is to
leverage examples of safe and unsafe behavior across a range of environments to
quickly adapt learned risk measures to a new environment with previously unseen
dynamics. We then propose MEta-learning for Safe Adaptation (MESA), an approach
for meta-learning a risk measure for safe RL. Simulation experiments across 5
continuous control domains suggest that MESA can leverage offline data from a
range of different environments to reduce constraint violations in unseen
environments by up to a factor of 2 while maintaining task performance. See
https://tinyurl.com/safe-meta-rl for code and supplementary material.
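The abstract frames MESA as meta-learning a risk measure (a safety critic that estimates the probability of future constraint violation) from offline data gathered in several training environments, then adapting that measure to an unseen test environment using a small offline dataset of safe and unsafe behavior. The sketch below is a minimal, hypothetical illustration of that recipe, not the authors' released implementation: the RiskCritic network, the first-order MAML-style outer loop, the batch format, and all hyperparameters are assumptions made for illustration only.

```python
# Minimal sketch (NOT the authors' code) of meta-learning a risk measure
# Q_risk(s, a) ~ probability of future constraint violation, in the spirit of
# MESA: meta-train on offline datasets from several environments, then adapt
# on a small offline dataset from the unseen test environment.
import copy
import torch
import torch.nn as nn


class RiskCritic(nn.Module):
    """Q_risk(s, a): estimated probability of violating a constraint."""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)


def safety_bellman_loss(critic, target_critic, batch, gamma=0.99):
    """Safety-critic regression: a violation is absorbing (target = 1);
    otherwise bootstrap the discounted violation probability."""
    obs, act, next_obs, next_act, violation = batch
    with torch.no_grad():
        target = violation + (1 - violation) * gamma * target_critic(next_obs, next_act)
    return nn.functional.mse_loss(critic(obs, act), target)


def meta_train(critic, env_datasets, meta_steps=1000, inner_lr=1e-3, outer_lr=1e-4):
    """First-order MAML-style outer loop over offline datasets (one callable
    per training environment, each returning a transition batch)."""
    target = copy.deepcopy(critic)
    outer_opt = torch.optim.Adam(critic.parameters(), lr=outer_lr)
    for step in range(meta_steps):
        outer_opt.zero_grad(set_to_none=True)
        for sample_batch in env_datasets:
            # Inner adaptation on a support batch from this environment.
            adapted = copy.deepcopy(critic)
            inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
            inner_opt.zero_grad()
            safety_bellman_loss(adapted, target, sample_batch()).backward()
            inner_opt.step()
            # Evaluate the adapted critic on a query batch; accumulate
            # first-order gradients into the meta-parameters.
            query_loss = safety_bellman_loss(adapted, target, sample_batch())
            grads = torch.autograd.grad(query_loss, tuple(adapted.parameters()))
            for p, g in zip(critic.parameters(), grads):
                p.grad = g.clone() if p.grad is None else p.grad + g
        outer_opt.step()
        if step % 100 == 0:
            target.load_state_dict(critic.state_dict())
    return critic


def adapt(critic, target_env_batches, steps=50, lr=1e-3):
    """Few-shot adaptation of the meta-learned risk measure to an unseen
    environment from a small offline dataset of safe/unsafe transitions."""
    target = copy.deepcopy(critic)
    opt = torch.optim.Adam(critic.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        safety_bellman_loss(critic, target, target_env_batches()).backward()
        opt.step()
    return critic
```

The adapted critic would then gate exploration in the test environment, e.g. by rejecting or overriding actions whose estimated violation probability exceeds a chosen threshold; that threshold and the gating policy are likewise assumptions of this sketch.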
Related papers
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on the latest advancements in model-based RL.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
- A Safe Exploration Strategy for Model-free Task Adaptation in Safety-constrained Grid Environments [2.5037136114892267]
In safety-constrained environments, utilizing unsupervised exploration or a non-optimal policy may lead the agent to undesirable states.
We introduce a new exploration framework for navigating the grid environments that enables model-free agents to interact with the environment while adhering to safety constraints.
arXiv Detail & Related papers (2024-08-02T04:09:30Z)
- Probabilistic Counterexample Guidance for Safer Reinforcement Learning (Extended Version) [1.279257604152629]
Safe exploration aims at addressing the limitations of Reinforcement Learning (RL) in safety-critical scenarios.
Several methods exist to incorporate external knowledge or to use sensor data to limit the exploration of unsafe states.
In this paper, we target the problem of safe exploration by guiding the training with counterexamples of the safety requirement.
arXiv Detail & Related papers (2023-07-10T22:28:33Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments [84.3830478851369]
We propose a safe reinforcement learning approach that can jointly learn the environment and optimize the control policy.
Our approach can effectively enforce hard safety constraints and significantly outperforms CMDP-based baseline methods in system safety rate measured via simulations.
arXiv Detail & Related papers (2022-09-29T20:49:25Z)
- Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning [3.923354711049903]
We propose a distributional reinforcement learning framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility.
We show that our algorithm learns policies that can still drive reliably when the perception noise is two times higher than the training configuration for automated merging and crossing at occluded intersections.
arXiv Detail & Related papers (2021-07-15T13:36:55Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)