Self-Improving Safety Performance of Reinforcement Learning Based
Driving with Black-Box Verification Algorithms
- URL: http://arxiv.org/abs/2210.16575v3
- Date: Sun, 9 Jul 2023 16:42:37 GMT
- Title: Self-Improving Safety Performance of Reinforcement Learning Based
Driving with Black-Box Verification Algorithms
- Authors: Resul Dagdanov, Halil Durmus, Nazim Kemal Ure
- Abstract summary: We propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents.
Our approach efficiently discovers safety failures of action decisions in RL-based adaptive cruise control (ACC) applications.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose a self-improving artificial intelligence system to
enhance the safety performance of reinforcement learning (RL)-based autonomous
driving (AD) agents using black-box verification methods. RL algorithms have
become popular in AD applications in recent years. However, the performance of
existing RL algorithms heavily depends on the diversity of training scenarios.
A lack of safety-critical scenarios during the training phase could result in
poor generalization performance in real-world driving applications. We propose
a novel framework in which the weaknesses of the training set are explored
through black-box verification methods. After discovering AD failure scenarios,
the RL agent's training is re-initiated via transfer learning to improve its
performance in previously unsafe scenarios. Simulation results demonstrate that
our approach efficiently discovers safety failures of action decisions in
RL-based adaptive cruise control (ACC) applications and significantly reduces
the number of vehicle collisions through iterative applications of our method.
The source code is publicly available at
https://github.com/data-and-decision-lab/self-improving-RL.
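As a rough illustration of the iterative discover-and-retrain loop described in the abstract, the sketch below pairs a random-search falsifier (one simple black-box verification strategy) with a warm-started retraining step. Every name here, including the toy time-headway failure check, is a hypothetical stand-in; the authors' actual implementation is in the linked repository.
```python
import random

def is_failure(policy, scenario):
    # Toy ACC failure check: the commanded gap undercuts a simple
    # time-headway rule. A stand-in for a real simulation rollout.
    return policy(scenario) < scenario["lead_speed"] * 1.5

def falsify(policy, n_trials=1000, seed=0):
    # Black-box verification stub: random scenario search for failures.
    rng = random.Random(seed)
    scenarios = [{"lead_speed": rng.uniform(0.0, 30.0)} for _ in range(n_trials)]
    return [s for s in scenarios if is_failure(policy, s)]

def train_agent(init, extra_scenarios):
    # Transfer-learning stub: warm-start from `init` and patch the policy
    # so it clears the discovered failure scenarios.
    margin = max(s["lead_speed"] * 1.5 for s in extra_scenarios)
    return lambda s: max(init(s), margin)

def self_improve(policy, iterations=5):
    for _ in range(iterations):
        failures = falsify(policy)
        if not failures:
            break  # no counterexamples found at this search budget
        policy = train_agent(init=policy, extra_scenarios=failures)
    return policy

safe_policy = self_improve(lambda s: 10.0)  # start from a naive fixed-gap policy
```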
Related papers
- CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving [45.05135725542318]
We propose a framework that enables training driving policies in simulation by leveraging imitative motion priors and safety constraints.
By combining RL and imitation, we demonstrate that our method achieves state-of-the-art results on closed-loop simulation driving benchmarks.
arXiv Detail & Related papers (2024-06-13T07:31:29Z)
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
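A hedged sketch of the look-ahead shielding idea above: roll each candidate action forward in an approximate model learned from data, and veto actions whose predicted short-horizon trajectory hits an unsafe state. The one-dimensional car-following model and safety check are illustrative stand-ins, not the paper's construction.
```python
def shield(state, candidate_actions, model, is_unsafe, horizon=5):
    # Keep only actions whose model rollouts stay safe for `horizon` steps.
    safe = []
    for a in candidate_actions:
        s, ok = state, True
        for _ in range(horizon):
            s = model(s, a)  # approximate safety-relevant dynamics
            if is_unsafe(s):
                ok = False
                break
        if ok:
            safe.append(a)
    # Fall back to the most conservative action if everything looks unsafe.
    return safe or [min(candidate_actions)]

# Toy usage: state = gap to the lead vehicle; accelerating shrinks the gap.
actions = [-1.0, 0.0, 1.0]
model = lambda gap, a: gap - a
print(shield(5.0, actions, model, lambda gap: gap < 2.0))  # [-1.0, 0.0]
```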
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
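The projection half of a USL-style safety layer can be sketched as a post-hoc correction: take the policy's action and descend a state-wise cost model until the constraint is satisfied. The toy cost and the numeric gradient below are stand-ins for a learned cost critic and its analytic gradient.
```python
def project_action(state, action, cost, threshold=0.0, lr=0.1, steps=50, eps=1e-3):
    # Gradient-descend the constraint cost until the action is feasible.
    a = action
    for _ in range(steps):
        if cost(state, a) <= threshold:
            break
        grad = (cost(state, a + eps) - cost(state, a - eps)) / (2 * eps)
        a -= lr * grad
    return a

# Toy constraint: the action may not exceed the state value.
cost = lambda s, a: max(0.0, a - s)
print(project_action(0.5, 2.0, cost))  # ~0.5 after projection
```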
- DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving [0.0]
We present a Reinforcement Learning (RL) based methodology to DEtect and FIX failures of an IL agent.
DeFIX is a continuous learning framework, where extraction of failure scenarios and training of RL agents are executed in an infinite loop.
It is demonstrated that even with only one RL agent trained on the failure scenarios of an IL agent, the DeFIX method is competitive with or outperforms state-of-the-art IL- and RL-based agents on autonomous urban driving benchmarks.
arXiv Detail & Related papers (2022-10-29T10:58:43Z)
- FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations [52.85536740465277]
FIRE is a framework that adapts to rare events by training an RL policy in an edge computing digital twin environment.
We propose ImRE, an importance sampling-based Q-learning algorithm, which samples rare events proportionally to their impact on the value function.
We show that FIRE reduces costs compared to vanilla RL and the greedy baseline in the event of failures.
arXiv Detail & Related papers (2022-09-28T19:49:39Z)
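One generic way to realize the importance-sampling idea behind ImRE is sketched below: draw stored transitions with probability proportional to a proxy for their impact (here, the absolute TD error) and reweight the Q-update to correct for the biased sampling. This is an illustrative scheme, not the paper's exact sampling distribution.
```python
import random

def weighted_q_update(Q, buffer, alpha=0.1, gamma=0.99):
    # Priority of each transition: its current absolute TD error.
    prios = [abs(r + gamma * max(Q[s2].values()) - Q[s][a]) + 1e-6
             for (s, a, r, s2) in buffer]
    total = sum(prios)
    probs = [p / total for p in prios]
    i = random.choices(range(len(buffer)), weights=probs, k=1)[0]
    s, a, r, s2 = buffer[i]
    w = (1.0 / len(buffer)) / probs[i]  # importance weight vs. uniform sampling
    td = r + gamma * max(Q[s2].values()) - Q[s][a]
    Q[s][a] += alpha * w * td

Q = {0: {0: 0.0, 1: 0.0}, 1: {0: 0.0, 1: 0.0}}
buffer = [(0, 0, 0.0, 1), (0, 1, -10.0, 1)]  # second entry: the rare, costly event
weighted_q_update(Q, buffer)
```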
- Ablation Study of How Run Time Assurance Impacts the Training and Performance of Reinforcement Learning Agents [5.801944210870593]
We conduct an ablation study using evaluation best practices to investigate the impact of run time assurance (RTA) on effective learning.
Our conclusions shed light on the most promising directions of Safe Reinforcement Learning.
arXiv Detail & Related papers (2022-07-08T20:15:15Z)
- Constrained Reinforcement Learning for Robotics via Scenario-Based Programming [64.07167316957533]
It is crucial to optimize the performance of DRL-based agents while providing guarantees about their behavior.
This paper presents a novel technique for incorporating domain-expert knowledge into a constrained DRL training loop.
Our experiments demonstrate that using our approach to leverage expert knowledge dramatically improves the safety and the performance of the agent.
arXiv Detail & Related papers (2022-06-20T07:19:38Z)
- SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving [12.925039760573092]
We release SafeRL-Kit to benchmark safe RL methods for autonomous driving tasks.
SafeRL-Kit contains several recent algorithms specific to zero-constraint-violation tasks, including Safety Layer, Recovery RL, off-policy Lagrangian method, and Feasible Actor-Critic.
We conduct a comparative evaluation of the above algorithms in SafeRL-Kit and shed light on their efficacy for safe autonomous driving.
arXiv Detail & Related papers (2022-06-17T03:23:51Z)
- Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning [63.53407136812255]
Offline Reinforcement Learning promises to learn effective policies from previously-collected, static datasets without the need for exploration.
Existing Q-learning and actor-critic based off-policy RL algorithms fail when bootstrapping from out-of-distribution (OOD) actions or states.
We propose Uncertainty Weighted Actor-Critic (UWAC), an algorithm that detects OOD state-action pairs and down-weights their contribution in the training objectives accordingly.
arXiv Detail & Related papers (2021-05-17T20:16:46Z)
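A minimal sketch of uncertainty-weighted updates in the spirit of UWAC: estimate epistemic uncertainty from the disagreement of a small Q-ensemble and down-weight the TD loss of high-variance (likely OOD) state-action pairs. The inverse-variance weight is one simple choice, not necessarily UWAC's exact scheme.
```python
import statistics

def weighted_td_losses(ensemble, batch, gamma=0.99, beta=1.0):
    # `batch` holds (s, a, r, s2, a2) tuples; `ensemble` is a list of Q-functions.
    losses = []
    for (s, a, r, s2, a2) in batch:
        qs = [q(s, a) for q in ensemble]
        var = statistics.pvariance(qs)  # ensemble disagreement as uncertainty
        w = beta / (beta + var)         # down-weight uncertain (OOD) pairs
        target = r + gamma * ensemble[0](s2, a2)
        losses.append(w * (statistics.mean(qs) - target) ** 2)
    return losses

ensemble = [lambda s, a: 1.0, lambda s, a: 1.2, lambda s, a: 0.8]
print(weighted_td_losses(ensemble, [(0, 0, 0.5, 0, 0)]))
```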
- Safe Reinforcement Learning Using Robust Action Governor [6.833157102376731]
Reinforcement Learning (RL) is essentially a trial-and-error learning procedure which may cause unsafe behavior during the exploration-and-exploitation process.
In this paper, we introduce a framework for safe RL based on the integration of an RL algorithm with an add-on safety supervision module.
We illustrate this proposed safe RL framework through an application to automotive adaptive cruise control.
arXiv Detail & Related papers (2021-02-21T16:50:17Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
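To make the adversarial-robustness idea in the last entry concrete, here is a generic worst-case objective over a bounded input perturbation, using a one-step sign-gradient (FGSM-style) attack with a numeric gradient. This is a common baseline formulation, not necessarily RADIAL-RL's exact loss.
```python
def worst_case_loss(loss_fn, obs, eps=0.05, delta=1e-3):
    # One-step sign-gradient attack on a scalar observation.
    g = (loss_fn(obs + delta) - loss_fn(obs - delta)) / (2 * delta)
    adv_obs = obs + eps * (1.0 if g > 0 else -1.0)
    # Train on the worse of the clean and perturbed losses.
    return max(loss_fn(obs), loss_fn(adv_obs))

loss = lambda o: (o - 1.0) ** 2  # toy policy loss around the clean observation
print(worst_case_loss(loss, obs=0.9))  # 0.0225 > clean loss 0.01
```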