Cautious Adaptation For Reinforcement Learning in Safety-Critical
Settings
- URL: http://arxiv.org/abs/2008.06622v1
- Date: Sat, 15 Aug 2020 01:40:59 GMT
- Title: Cautious Adaptation For Reinforcement Learning in Safety-Critical
Settings
- Authors: Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh
Jayaraman
- Abstract summary: Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments before adapting to a target environment where failures are costly.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
- Score: 129.80279257258098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) in real-world safety-critical target settings
like urban driving is hazardous, imperiling the RL agent, other agents, and the
environment. To overcome this difficulty, we propose a "safety-critical
adaptation" task setting: an agent first trains in non-safety-critical "source"
environments such as in a simulator, before it adapts to the target environment
where failures carry heavy costs. We propose a solution approach, CARL, that
builds on the intuition that prior experience in diverse environments equips an
agent to estimate risk, which in turn enables relative safety through
risk-averse, cautious adaptation. CARL first employs model-based RL to train a
probabilistic model to capture uncertainty about transition dynamics and
catastrophic states across varied source environments. Then, when exploring a
new safety-critical environment with unknown dynamics, the CARL agent plans to
avoid actions that could lead to catastrophic states. In experiments on car
driving, cartpole balancing, half-cheetah locomotion, and robotic object
manipulation, CARL successfully acquires cautious exploration behaviors,
yielding higher rewards with fewer failures than strong RL adaptation
baselines. Website at https://sites.google.com/berkeley.edu/carl.
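The approach the abstract describes (a probabilistic ensemble over dynamics and catastrophic states, then planning that avoids risky actions) can be illustrated with a minimal sketch. This is not the authors' code: the ensemble members, the toy 1-D dynamics, and the worst-case catastrophe penalty are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a learned probabilistic ensemble: each member
# maps (state, action) -> (next_state, reward, catastrophe probability).
def make_member(seed):
    w = np.random.default_rng(seed).normal(size=3)
    def step(state, action):
        next_state = state + 0.1 * action + 0.01 * w[0]
        reward = -abs(next_state) + 0.01 * w[1]
        # Sigmoid catastrophe probability that rises as |state| grows.
        p_catastrophe = 1.0 / (1.0 + np.exp(-(abs(next_state) - 2.0 + 0.1 * w[2])))
        return next_state, reward, p_catastrophe
    return step

ensemble = [make_member(s) for s in range(5)]

def score_plan(state, actions, risk_weight=10.0):
    """Mean predicted return minus a penalty on the worst-case
    (over ensemble members) catastrophe probability along the rollout."""
    returns, worst_risk = [], 0.0
    for member in ensemble:
        s, total = state, 0.0
        for a in actions:
            s, r, p = member(s, a)
            total += r
            worst_risk = max(worst_risk, p)
        returns.append(total)
    return np.mean(returns) - risk_weight * worst_risk

def plan(state, horizon=5, n_candidates=64):
    # Random-shooting planner: sample action sequences, keep the
    # risk-penalized best one.
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    scores = [score_plan(state, c) for c in candidates]
    return candidates[int(np.argmax(scores))]

best = plan(state=0.0)
```

The key design point mirrored here is that disagreement across an ensemble trained in diverse source environments supplies the risk estimate, and the planner trades expected return against that risk rather than maximizing return alone.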
Related papers
- Anomalous State Sequence Modeling to Enhance Safety in Reinforcement Learning [0.0]
We propose a safe reinforcement learning (RL) approach that utilizes an anomalous state sequence to enhance RL safety.
In experiments on multiple safety-critical environments including self-driving cars, our solution approach successfully learns safer policies.
arXiv Detail & Related papers (2024-07-29T10:30:07Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
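The multiplicative composition described above can be sketched in a few lines. The critics here are hand-written toy functions, not learned networks, and the names `safety_critic` and `reward_critic` are illustrative assumptions; only the multiplication of a no-violation probability with a constraint-free return estimate reflects the summarized idea.

```python
import numpy as np

# Toy stand-ins for learned critics (names and forms are assumptions).
def reward_critic(state, action):
    # Estimates the constraint-free return; peaks when action matches state.
    return 1.0 - 0.5 * (state - action) ** 2

def safety_critic(state, action):
    # Estimates the probability of NOT violating a constraint;
    # drops sharply once |action| exceeds 0.8.
    return 1.0 / (1.0 + np.exp(4.0 * (abs(action) - 0.8)))

def multiplicative_value(state, action):
    # Reward estimate discounted by the predicted safety probability.
    return safety_critic(state, action) * reward_critic(state, action)

# Greedy action selection over a discretized action space.
actions = np.linspace(-1.0, 1.0, 201)
best_action = actions[np.argmax([multiplicative_value(0.5, a) for a in actions])]
```

Because the safety term multiplies (rather than adds to) the reward term, a near-zero violation-free probability suppresses the value of an action regardless of how large its predicted return is.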
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance [73.3242641337305]
Recent work learns risk measures which measure the probability of violating constraints, which can then be used to enable safety.
We cast safe exploration as an offline meta-RL problem, where the objective is to leverage examples of safe and unsafe behavior across a range of environments.
We then propose MEta-learning for Safe Adaptation (MESA), an approach for meta-learning a risk measure for safe RL.
arXiv Detail & Related papers (2021-12-07T08:57:35Z)
- High-level Decisions from a Safe Maneuver Catalog with Reinforcement Learning for Safe and Cooperative Automated Merging [5.732271870257913]
We propose an efficient RL-based decision-making pipeline for safe and cooperative automated driving in merging scenarios.
The proposed RL agent can efficiently identify cooperative drivers from their vehicle state history and generate interactive maneuvers.
arXiv Detail & Related papers (2021-07-15T15:49:53Z)
- Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning [3.923354711049903]
We propose a distributional reinforcement learning framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility.
We show that our algorithm learns policies that can still drive reliably when the perception noise is twice as high as in the training configuration, for automated merging and crossing at occluded intersections.
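One common way a distributional critic supports run-time tuning of conservativity is to act on the mean of only the lowest return quantiles (a CVaR-style criterion); the paper's exact mechanism may differ. The quantile values and action names below are invented for illustration.

```python
import numpy as np

# Hypothetical quantile estimates of the return distribution per action,
# as might come from a trained quantile-regression critic.
quantiles = {
    "brake":      np.array([ 0.0,  0.1, 0.2, 0.3, 0.4]),
    "merge":      np.array([-2.0,  0.5, 1.0, 1.5, 2.0]),
    "accelerate": np.array([-5.0, -1.0, 2.0, 3.0, 5.0]),
}

def act(risk_level):
    """risk_level in (0, 1]: small values average only the lowest
    quantiles (risk-averse); 1.0 uses the full mean (risk-neutral)."""
    n = len(next(iter(quantiles.values())))
    k = max(1, int(round(risk_level * n)))  # number of lowest quantiles kept
    scores = {a: q[:k].mean() for a, q in quantiles.items()}
    return max(scores, key=scores.get)

cautious = act(0.2)  # worst-case quantile only
neutral = act(1.0)   # full mean over all quantiles
```

The same trained critic thus yields anything from a worst-case policy to a risk-neutral one by changing a single scalar at deployment time, which is what makes run-time tuning of conservativity possible without retraining.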
arXiv Detail & Related papers (2021-07-15T13:36:55Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
arXiv Detail & Related papers (2020-06-22T10:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.