Automatically Learning Fallback Strategies with Model-Free Reinforcement
Learning in Safety-Critical Driving Scenarios
- URL: http://arxiv.org/abs/2204.05196v1
- Date: Mon, 11 Apr 2022 15:34:49 GMT
- Title: Automatically Learning Fallback Strategies with Model-Free Reinforcement
Learning in Safety-Critical Driving Scenarios
- Authors: Ugo Lecerf, Christelle Yemdji-Tchassi, Sébastien Aubert, Pietro
Michiardi
- Abstract summary: We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term in the reward model to encourage exploration of areas of state space different from those privileged by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and been unavailable when executing the control algorithm.
- Score: 9.761912672523977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning to behave in a stochastic environment where safety is critical,
such as driving a vehicle in traffic, it is natural for human drivers to plan
fallback strategies as a backup to use if ever there is an unexpected change in
the environment. Knowing to expect the unexpected, and planning for such
outcomes, increases our capability for being robust to unseen scenarios and may
help prevent catastrophic failures. Control of Autonomous Vehicles (AVs) is a
domain with a particular interest in knowing when and how to use fallback
strategies in the interest of safety. Because of the imperfect information
available to an AV about its environment, it is important to have alternate
strategies at the ready which might not have been deduced from the original
training data distribution.
In this paper we present a principled approach for a model-free Reinforcement
Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term in the reward model to encourage
exploration of areas of state space different from those privileged by the
optimal policy. We base this reward term on a distance metric between agent
trajectories, in order to force policies to focus on different areas of
state space than the initial exploring agent. Throughout the paper, we refer
to this particular training paradigm as learning fallback strategies.
We apply this method to an autonomous driving scenario and show that we are
able to learn useful policies that would otherwise have been missed during
training and been unavailable when executing the control algorithm.
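To make the pseudo-reward idea concrete, below is a minimal sketch of how such a term could be combined with the task reward. It is an illustration under assumptions, not the authors' implementation: the distance metric is approximated by a Euclidean nearest-state distance to states visited by the already-trained optimal policy, and `beta` is an illustrative weighting coefficient.

```python
import numpy as np

def pseudo_reward(state, reference_states, beta=0.1):
    """Exploration bonus that grows with distance from the reference policy.

    Sketch only: the paper defines a distance metric between whole agent
    trajectories; here it is approximated by the distance from the current
    state to the nearest state visited by the initial exploring agent.
    """
    dists = np.linalg.norm(reference_states - state, axis=1)
    return beta * dists.min()

def shaped_reward(env_reward, state, reference_states, beta=0.1):
    # Task reward plus a bonus for staying away from the state-space
    # regions privileged by the optimal policy.
    return env_reward + pseudo_reward(state, reference_states, beta)
```

Here `reference_states` would hold states sampled from rollouts of the first (optimal) policy, so that maximizing the shaped reward pushes the fallback policy toward a different mode of behaviour.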
Related papers
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924] (2024-05-07)
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
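The summary above mentions an adaptive action space curriculum; one plausible reading, sketched below under assumptions rather than taken from RACER's code, is to clip actions to a narrow range early in training and widen the range only while the recent crash rate stays low.

```python
import numpy as np

class ActionCurriculum:
    """Gradually widen the allowed action range (e.g. throttle/steering)."""

    def __init__(self, full_low, full_high, start_frac=0.3, grow=1.1):
        self.full_low = np.asarray(full_low, dtype=float)
        self.full_high = np.asarray(full_high, dtype=float)
        self.frac = start_frac   # fraction of the full range currently in use
        self.grow = grow         # expansion factor applied on each promotion

    def clip(self, action):
        center = (self.full_low + self.full_high) / 2
        half = (self.full_high - self.full_low) / 2 * self.frac
        return np.clip(action, center - half, center + half)

    def promote(self, recent_crash_rate, threshold=0.05):
        # Only widen the action space while the agent is driving safely.
        if recent_crash_rate < threshold:
            self.frac = min(1.0, self.frac * self.grow)
```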
- Towards Optimal Head-to-head Autonomous Racing with Curriculum Reinforcement Learning [22.69532642800264] (2023-08-25)
We propose a head-to-head racing environment for reinforcement learning which accurately models vehicle dynamics.
We also propose a control barrier function-based safe reinforcement learning algorithm to enforce the safety of the agent.
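As background on the control-barrier-function idea named above, here is a hedged one-dimensional sketch of a generic discrete-time CBF action filter for keeping a following gap above a minimum distance. The parameter values and the constant-velocity assumption are illustrative; this is not the paper's algorithm.

```python
def cbf_filter(gap, closing_speed, a_rl, dt=0.1, d_min=5.0, alpha=0.5,
               a_min=-8.0, a_max=3.0):
    """Minimally correct an RL acceleration command so that the headway
    constraint h = gap - d_min stays nonnegative.

    Discrete-time CBF condition: h_next >= (1 - alpha) * h, assuming the
    lead vehicle keeps constant speed over one step (values illustrative).
    """
    h = gap - d_min
    # Next gap if acceleration a is applied for one step:
    #   gap_next ~= gap - (closing_speed + a * dt) * dt
    # Enforce gap_next - d_min >= (1 - alpha) * h and solve for a:
    a_bound = (alpha * h - closing_speed * dt) / (dt * dt)
    a_safe = min(a_rl, a_bound)  # never accelerate harder than the bound allows
    return max(a_min, min(a_safe, a_max))
```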
- Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616] (2023-07-19)
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy that enhances the robustness of the ego vehicle's driving policy by training it in an environment where social vehicles are controlled by the learned meta-policy.
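The randomization step in the entry above can be pictured with a small sketch: draw a fresh weight vector per episode so that each social vehicle optimizes a different blend of objectives. The feature names and weight ranges below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_social_reward():
    """Draw a random interaction objective for one social vehicle/episode."""
    w = rng.uniform(0.0, 1.0, size=3)  # weights: [progress, gap-keeping, yielding]
    def reward(progress, min_gap, yielded):
        return w[0] * progress + w[1] * min_gap + w[2] * float(yielded)
    return reward
```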
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786] (2022-12-14)
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
- Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning [3.923354711049903] (2021-07-15)
We propose a distributional reinforcement learning framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility.
We show that our algorithm learns policies that can still drive reliably when the perception noise is twice as high as in the training configuration, for automated merging and crossing at occluded intersections.
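A standard way to expose a run-time conservativity knob with a distributional critic, in the spirit of the entry above though not necessarily its exact rule, is to act greedily on the CVaR of each action's predicted return distribution:

```python
import numpy as np

def cvar(quantiles, risk_level):
    """CVaR of a return distribution from its quantile estimates.

    risk_level = 1.0 averages all quantiles (risk-neutral); smaller values
    average only the worst-case tail, i.e. behave more conservatively.
    """
    q = np.sort(np.asarray(quantiles))
    k = max(1, int(len(q) * risk_level))
    return q[:k].mean()

def act(quantiles_per_action, risk_level=0.25):
    # quantiles_per_action: array of shape (n_actions, n_quantiles)
    scores = [cvar(q, risk_level) for q in quantiles_per_action]
    return int(np.argmax(scores))
```

Because `risk_level` only affects action selection, it can be changed at run-time without retraining the critic.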
- Driving-Policy Adaptive Safeguard for Autonomous Vehicles Using Reinforcement Learning [19.71676985220504] (2020-12-02)
This paper proposes a driving-policy adaptive safeguard (DPAS) design, including a collision avoidance strategy and an activation function.
The driving-policy adaptive activation function dynamically assesses the current driving policy's risk and kicks in when an urgent threat is detected.
The results are calibrated by naturalistic driving data and show that the proposed safeguard reduces the collision rate significantly without introducing more interventions.
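Conceptually, such a safeguard wraps the nominal driving policy and only intervenes when the assessed risk crosses a threshold. The sketch below uses hypothetical `risk_estimate` and `collision_avoidance` callables standing in for the paper's learned components.

```python
def safeguarded_action(state, policy, risk_estimate, collision_avoidance,
                       risk_threshold=0.8):
    """Use the driving policy by default; kick in the collision-avoidance
    strategy only when the assessed risk of the current policy is urgent.

    `risk_estimate` and `collision_avoidance` are placeholders for the
    paper's risk assessment and avoidance strategy; the threshold is
    illustrative.
    """
    nominal = policy(state)
    if risk_estimate(state, nominal) > risk_threshold:
        return collision_avoidance(state)   # safeguard intervention
    return nominal                          # no interference otherwise
```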
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098] (2020-08-15)
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202] (2020-06-22)
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
- Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic [11.601356612579641] (2020-02-13)
This paper introduces the minimax formulation and distributional framework to improve the generalization ability of RL algorithms.
We implement our method on the decision-making tasks of autonomous vehicles at intersections and test the trained policy in distinct environments.
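The minimax formulation can be sketched as evaluating each ego action against the worst-case behaviour of the other traffic participants; the `q_value` critic and the discrete candidate sets below are hypothetical stand-ins for illustration.

```python
def minimax_action(q_value, state, ego_actions, adversary_actions):
    """Pick the ego action whose worst-case estimated return is best.

    q_value(state, ego_action, adversary_action) -> float is a hypothetical
    critic; in the paper the adversarial part is handled during training
    rather than by explicit enumeration at decision time.
    """
    def worst_case(a):
        return min(q_value(state, a, b) for b in adversary_actions)
    return max(ego_actions, key=worst_case)
```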
- Intelligent Roundabout Insertion using Deep Reinforcement Learning [68.8204255655161] (2020-01-03)
We present a maneuver planning module able to negotiate entering busy roundabouts.
The proposed module is based on a neural network trained to predict when and how to enter the roundabout throughout the whole duration of the maneuver.
This list is automatically generated from the titles and abstracts of the papers on this site.