Automatically Learning Fallback Strategies with Model-Free Reinforcement
Learning in Safety-Critical Driving Scenarios
- URL: http://arxiv.org/abs/2204.05196v1
- Date: Mon, 11 Apr 2022 15:34:49 GMT
- Title: Automatically Learning Fallback Strategies with Model-Free Reinforcement
Learning in Safety-Critical Driving Scenarios
- Authors: Ugo Lecerf, Christelle Yemdji-Tchassi, Sébastien Aubert, Pietro
Michiardi
- Abstract summary: We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term in the reward model to encourage exploration of areas of state space different from those privileged by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and been unavailable when executing the control algorithm.
- Score: 9.761912672523977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning to behave in a stochastic environment where safety is critical,
such as driving a vehicle in traffic, it is natural for human drivers to plan
fallback strategies as a backup to use if ever there is an unexpected change in
the environment. Knowing to expect the unexpected, and planning for such
outcomes, increases our capability for being robust to unseen scenarios and may
help prevent catastrophic failures. Control of Autonomous Vehicles (AVs) is a
domain with a particular interest in knowing when and how to use fallback
strategies in the interest of safety. Because of the imperfect information
available to an AV about its environment, it is important to have alternate
strategies at the ready which might not have been deduced from the original
training data distribution.
In this paper we present a principled approach for a model-free Reinforcement
Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term in the reward model to encourage
exploration of areas of state space different from those privileged by the
optimal policy. We base this reward term on a distance metric between agent
trajectories, in order to force policies to focus on different areas of
state space than the initial exploring agent. Throughout the paper, we refer
to this particular training paradigm as learning fallback strategies.
We apply this method to an autonomous driving scenario and show that we are
able to learn useful policies that would otherwise have been missed during
training and been unavailable when executing the control algorithm.
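To make the pseudo-reward idea concrete, below is a minimal sketch of how such a term could be combined with the task reward. It is an illustration under assumptions, not the authors' implementation: the distance metric is approximated by a Euclidean nearest-state distance to states visited by the already-trained optimal policy, and `beta` is an illustrative weighting coefficient.

```python
import numpy as np

def pseudo_reward(state, reference_states, beta=0.1):
    """Exploration bonus that grows with distance from the reference policy.

    Sketch only: the paper defines a distance metric between whole agent
    trajectories; here it is approximated by the distance from the current
    state to the nearest state visited by the initial exploring agent.
    """
    dists = np.linalg.norm(reference_states - state, axis=1)
    return beta * dists.min()

def shaped_reward(env_reward, state, reference_states, beta=0.1):
    # Task reward plus a bonus for staying away from the state-space
    # regions privileged by the optimal policy.
    return env_reward + pseudo_reward(state, reference_states, beta)
```

Here `reference_states` would hold states sampled from rollouts of the first (optimal) policy, so that maximizing the shaped reward pushes the fallback policy toward a different mode of behaviour.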
Related papers
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924] (2024-05-07)
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
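The summary above mentions an adaptive action space curriculum; one plausible reading, sketched below under assumptions rather than taken from RACER's code, is to clip actions to a narrow range early in training and widen the range only while the recent crash rate stays low.

```python
import numpy as np

class ActionCurriculum:
    """Gradually widen the allowed action range (e.g. throttle/steering)."""

    def __init__(self, full_low, full_high, start_frac=0.3, grow=1.1):
        self.full_low = np.asarray(full_low, dtype=float)
        self.full_high = np.asarray(full_high, dtype=float)
        self.frac = start_frac   # fraction of the full range currently in use
        self.grow = grow         # expansion factor applied on each promotion

    def clip(self, action):
        center = (self.full_low + self.full_high) / 2
        half = (self.full_high - self.full_low) / 2 * self.frac
        return np.clip(action, center - half, center + half)

    def promote(self, recent_crash_rate, threshold=0.05):
        # Only widen the action space while the agent is driving safely.
        if recent_crash_rate < threshold:
            self.frac = min(1.0, self.frac * self.grow)
```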
- Towards Optimal Head-to-head Autonomous Racing with Curriculum Reinforcement Learning [22.69532642800264] (2023-08-25)
We propose a head-to-head racing environment for reinforcement learning which accurately models vehicle dynamics.
We also propose a control barrier function-based safe reinforcement learning algorithm to enforce the safety of the agent.
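As background on the control-barrier-function idea named above, here is a hedged one-dimensional sketch of a generic discrete-time CBF action filter for keeping a following gap above a minimum distance. The parameter values and the constant-velocity assumption are illustrative; this is not the paper's algorithm.

```python
def cbf_filter(gap, closing_speed, a_rl, dt=0.1, d_min=5.0, alpha=0.5,
               a_min=-8.0, a_max=3.0):
    """Minimally correct an RL acceleration command so that the headway
    constraint h = gap - d_min stays nonnegative.

    Discrete-time CBF condition: h_next >= (1 - alpha) * h, assuming the
    lead vehicle keeps constant speed over one step (values illustrative).
    """
    h = gap - d_min
    # Next gap if acceleration a is applied for one step:
    #   gap_next ~= gap - (closing_speed + a * dt) * dt
    # Enforce gap_next - d_min >= (1 - alpha) * h and solve for a:
    a_bound = (alpha * h - closing_speed * dt) / (dt * dt)
    a_safe = min(a_rl, a_bound)  # never accelerate harder than the bound allows
    return max(a_min, min(a_safe, a_max))
```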
- Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616] (2023-07-19)
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy that enhances the robustness of the ego vehicle's driving policy by training it in an environment where social vehicles are controlled by the learned meta-policy.
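The randomization step in the entry above can be pictured with a small sketch: draw a fresh weight vector per episode so that each social vehicle optimizes a different blend of objectives. The feature names and weight ranges below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_social_reward():
    """Draw a random interaction objective for one social vehicle/episode."""
    w = rng.uniform(0.0, 1.0, size=3)  # weights: [progress, gap-keeping, yielding]
    def reward(progress, min_gap, yielded):
        return w[0] * progress + w[1] * min_gap + w[2] * float(yielded)
    return reward
```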
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786] (2022-12-14)
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
- Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning [3.923354711049903] (2021-07-15)
We propose a distributional reinforcement learning framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility.
We show that our algorithm learns policies that can still drive reliably when the perception noise is twice as high as in the training configuration, for automated merging and crossing at occluded intersections.
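A standard way to expose a run-time conservativity knob with a distributional critic, in the spirit of the entry above though not necessarily its exact rule, is to act greedily on the CVaR of each action's predicted return distribution:

```python
import numpy as np

def cvar(quantiles, risk_level):
    """CVaR of a return distribution from its quantile estimates.

    risk_level = 1.0 averages all quantiles (risk-neutral); smaller values
    average only the worst-case tail, i.e. behave more conservatively.
    """
    q = np.sort(np.asarray(quantiles))
    k = max(1, int(len(q) * risk_level))
    return q[:k].mean()

def act(quantiles_per_action, risk_level=0.25):
    # quantiles_per_action: array of shape (n_actions, n_quantiles)
    scores = [cvar(q, risk_level) for q in quantiles_per_action]
    return int(np.argmax(scores))
```

Because `risk_level` only affects action selection, it can be changed at run-time without retraining the critic.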
- Driving-Policy Adaptive Safeguard for Autonomous Vehicles Using Reinforcement Learning [19.71676985220504] (2020-12-02)
This paper proposes a driving-policy adaptive safeguard (DPAS) design, including a collision avoidance strategy and an activation function.
The driving-policy adaptive activation function dynamically assesses the current driving policy's risk and kicks in when an urgent threat is detected.
The results are calibrated by naturalistic driving data and show that the proposed safeguard reduces the collision rate significantly without introducing more interventions.
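Conceptually, such a safeguard wraps the nominal driving policy and only intervenes when the assessed risk crosses a threshold. The sketch below uses hypothetical `risk_estimate` and `collision_avoidance` callables standing in for the paper's learned components.

```python
def safeguarded_action(state, policy, risk_estimate, collision_avoidance,
                       risk_threshold=0.8):
    """Use the driving policy by default; kick in the collision-avoidance
    strategy only when the assessed risk of the current policy is urgent.

    `risk_estimate` and `collision_avoidance` are placeholders for the
    paper's risk assessment and avoidance strategy; the threshold is
    illustrative.
    """
    nominal = policy(state)
    if risk_estimate(state, nominal) > risk_threshold:
        return collision_avoidance(state)   # safeguard intervention
    return nominal                          # no interference otherwise
```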
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098] (2020-08-15)
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202] (2020-06-22)
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
- Improving Generalization of Reinforcement Learning with Minimax Distributional Soft Actor-Critic [11.601356612579641] (2020-02-13)
This paper introduces the minimax formulation and distributional framework to improve the generalization ability of RL algorithms.
We implement our method on the decision-making tasks of autonomous vehicles at intersections and test the trained policy in distinct environments.
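The minimax formulation can be sketched as evaluating each ego action against the worst-case behaviour of the other traffic participants; the `q_value` critic and the discrete candidate sets below are hypothetical stand-ins for illustration.

```python
def minimax_action(q_value, state, ego_actions, adversary_actions):
    """Pick the ego action whose worst-case estimated return is best.

    q_value(state, ego_action, adversary_action) -> float is a hypothetical
    critic; in the paper the adversarial part is handled during training
    rather than by explicit enumeration at decision time.
    """
    def worst_case(a):
        return min(q_value(state, a, b) for b in adversary_actions)
    return max(ego_actions, key=worst_case)
```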
- Intelligent Roundabout Insertion using Deep Reinforcement Learning [68.8204255655161] (2020-01-03)
We present a maneuver planning module able to negotiate entering busy roundabouts.
The proposed module is based on a neural network trained to predict when and how to enter the roundabout throughout the whole duration of the maneuver.
This list is automatically generated from the titles and abstracts of the papers on this site.