Driving-Policy Adaptive Safeguard for Autonomous Vehicles Using
Reinforcement Learning
- URL: http://arxiv.org/abs/2012.01010v1
- Date: Wed, 2 Dec 2020 08:01:53 GMT
- Title: Driving-Policy Adaptive Safeguard for Autonomous Vehicles Using
Reinforcement Learning
- Authors: Zhong Cao, Shaobing Xu, Songan Zhang, Huei Peng, Diange Yang
- Abstract summary: This paper proposes a driving-policy adaptive safeguard (DPAS) design, including a collision avoidance strategy and an activation function.
The driving-policy adaptive activation function should dynamically assess current driving policy risk and kick in when an urgent threat is detected.
The results are calibrated by naturalistic driving data and show that the proposed safeguard reduces the collision rate significantly without introducing more interventions.
- Score: 19.71676985220504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Safeguard functions such as those provided by advanced emergency braking
(AEB) can provide another layer of safety for autonomous vehicles (AV). A smart
safeguard function should adapt the activation conditions to the driving
policy, to avoid unnecessary interventions as well as improve vehicle safety.
This paper proposes a driving-policy adaptive safeguard (DPAS) design,
including a collision avoidance strategy and an activation function. The
collision avoidance strategy is designed in a reinforcement learning framework,
obtained by Monte-Carlo Tree Search (MCTS). It can learn from past collisions
and manipulate both braking and steering in stochastic traffic. The
driving-policy adaptive activation function should dynamically assess current
driving policy risk and kick in when an urgent threat is detected. To generate
this activation function, MCTS' exploration and rollout modules are designed to
fully evaluate the AV's current driving policy, and then explore other safer
actions. In this study, the DPAS is validated with two typical highway-driving
policies. The results are obtained through 90,000 simulation runs in stochastic
and aggressive simulated traffic. The results are calibrated by naturalistic
driving data and show that the proposed safeguard reduces the collision rate
significantly without introducing more interventions, compared with the
state-based benchmark safeguards. In summary, the proposed safeguard leverages
the learning-based method in stochastic and emergent scenarios and imposes
minimal influence on the driving policy.
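As an illustration of the activation logic described in the abstract, the following is a minimal, self-contained sketch, not the authors' implementation: it estimates the collision risk of the current driving policy by Monte-Carlo rollouts of a toy car-following model and lets a hard-braking fallback kick in only when that risk exceeds a threshold. The vehicle model, policies, rollout horizon, and threshold are hypothetical placeholders; the paper's actual safeguard uses MCTS over both braking and steering in stochastic traffic.

```python
# Illustrative sketch only: names, model, and parameters are hypothetical,
# not the authors' implementation or calibrated data.
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    gap: float        # bumper-to-bumper distance to the lead vehicle [m]
    rel_speed: float  # ego speed minus lead speed [m/s]; positive = closing

def nominal_policy(state: State) -> float:
    """Toy driving policy: brake mildly only when the gap is already small."""
    return -2.0 if state.gap < 15.0 else 0.0

def avoidance_policy(state: State) -> float:
    """Safeguard fallback: hard braking (steering is omitted in this sketch)."""
    return -8.0

def step(state: State, ego_accel: float, dt: float = 0.2) -> State:
    """One step of a toy model with a stochastically decelerating lead vehicle."""
    lead_accel = random.choice([0.0, -1.0, -4.0])
    rel_speed = max(state.rel_speed + (ego_accel - lead_accel) * dt, 0.0)
    return State(gap=state.gap - rel_speed * dt, rel_speed=rel_speed)

def rollout_collides(state: State, policy, horizon: int = 25) -> bool:
    """Roll the model forward under `policy`; a collision is a closed gap."""
    for _ in range(horizon):
        if state.gap <= 0.0:
            return True
        state = step(state, policy(state))
    return state.gap <= 0.0

def policy_risk(state: State, policy, n_rollouts: int = 200) -> float:
    """Monte-Carlo estimate of the collision probability of `policy`."""
    return sum(rollout_collides(state, policy) for _ in range(n_rollouts)) / n_rollouts

def safeguard_action(state: State, risk_threshold: float = 0.2) -> float:
    """Driving-policy adaptive activation: intervene only when the nominal
    policy looks unsafe and the fallback is actually expected to be safer."""
    nominal_risk = policy_risk(state, nominal_policy)
    if nominal_risk < risk_threshold:
        return nominal_policy(state)       # low risk: do not intervene
    if policy_risk(state, avoidance_policy) < nominal_risk:
        return avoidance_policy(state)     # urgent threat: safeguard kicks in
    return nominal_policy(state)

if __name__ == "__main__":
    random.seed(0)
    print(safeguard_action(State(gap=20.0, rel_speed=8.0)))
```

The point of the sketch is the adaptive activation: the safeguard compares the nominal policy's estimated risk against a threshold (and against the fallback's risk) rather than using a fixed, state-based trigger, which is what lets it avoid unnecessary interventions.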
Related papers
- A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts [17.434924472015812]
This paper introduces a learning-based algorithm tailored to foster safe and efficient driving behaviors in roundabouts.
The proposed algorithm employs a deep Q-learning network to learn safe and efficient driving strategies in complex multi-vehicle roundabouts.
The results show that our proposed system consistently achieves safe and efficient driving whilst maintaining a stable training process.
arXiv Detail & Related papers (2024-08-15T16:10:25Z) - RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z) - Risk-anticipatory autonomous driving strategies considering vehicles' weights, based on hierarchical deep reinforcement learning [12.014977175887767]
This study develops an autonomous driving strategy based on risk anticipation, considering the weights of surrounding vehicles.
A risk indicator integrating surrounding vehicles' weights, based on risk field theory, is proposed and incorporated into autonomous driving decisions.
An indicator, potential collision energy in conflicts, is newly proposed to evaluate the performance of the developed AV driving strategy.
arXiv Detail & Related papers (2023-12-27T06:03:34Z) - CAT: Closed-loop Adversarial Training for Safe End-to-End Driving [54.60865656161679]
Closed-loop Adversarial Training (CAT) is a framework for safe end-to-end driving in autonomous vehicles.
CAT aims to continuously improve the safety of driving agents by training the agent on safety-critical scenarios.
CAT can effectively generate adversarial scenarios countering the agent being trained.
arXiv Detail & Related papers (2023-10-19T02:49:31Z) - Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616]
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy.
arXiv Detail & Related papers (2023-07-19T17:42:36Z) - Runtime Stealthy Perception Attacks against DNN-based Adaptive Cruise Control Systems [8.561553195784017]
This paper evaluates the security of deep neural network (DNN)-based ACC systems under runtime perception attacks.
We present a context-aware strategy for the selection of the most critical times for triggering the attacks.
We evaluate the effectiveness of the proposed attack using an actual vehicle, a publicly available driving dataset, and a realistic simulation platform.
arXiv Detail & Related papers (2023-07-18T03:12:03Z) - Safe Reinforcement Learning for an Energy-Efficient Driver Assistance
System [1.8899300124593645]
Reinforcement learning (RL)-based driver assistance systems seek to improve fuel consumption via continual improvement of powertrain control actions.
In this paper, an exponential control barrier function (ECBF) is derived and utilized to filter unsafe actions proposed by an RL-based driver assistance system.
The proposed safe-RL scheme is trained and evaluated in car-following scenarios and is shown to effectively avoid collisions during both training and evaluation; a simplified sketch of this kind of barrier-based action filter is given after this list.
arXiv Detail & Related papers (2023-01-03T00:25:00Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Automatically Learning Fallback Strategies with Model-Free Reinforcement
Learning in Safety-Critical Driving Scenarios [9.761912672523977]
We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term to the reward model, to encourage exploration to areas of state-space different from areas privileged by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and been unavailable when executing the control algorithm.
arXiv Detail & Related papers (2022-04-11T15:34:49Z) - Cautious Adaptation For Reinforcement Learning in Safety-Critical
Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z) - Can Autonomous Vehicles Identify, Recover From, and Adapt to
Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge for learning agents at deployment.
We propose an uncertainty-aware planning method, called robust imitative planning (RIP).
Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes.
We introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)
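For the barrier-function filtering idea mentioned in the "Safe Reinforcement Learning for an Energy-Efficient Driver Assistance System" entry above, here is a hedged, first-order sketch of a CBF-style safety filter for car following. It is a simplification, not that paper's exponential CBF, and the headway time, safety margin, gain, and braking limit are illustrative assumptions.

```python
# Hedged sketch of a barrier-based action filter; parameters are illustrative,
# not taken from the cited paper.
def cbf_filter(a_rl: float, gap: float, v_ego: float, v_lead: float,
               tau: float = 1.5, d_min: float = 2.0, alpha: float = 0.5,
               a_min: float = -8.0) -> float:
    """Clip the RL-proposed acceleration `a_rl` so the time-headway barrier
    h = gap - tau*v_ego - d_min satisfies h_dot + alpha*h >= 0."""
    h = gap - tau * v_ego - d_min                        # barrier value (>= 0 means safe)
    a_max_safe = ((v_lead - v_ego) + alpha * h) / tau    # largest accel keeping h_dot + alpha*h >= 0
    return max(min(a_rl, a_max_safe), a_min)             # override only when a_rl is too aggressive

# Example: the RL agent asks for +1.0 m/s^2 while closing fast on the lead car;
# the filter replaces it with a braking command.
print(cbf_filter(a_rl=1.0, gap=10.0, v_ego=20.0, v_lead=15.0))
```

The design point is that the filter only overrides the learned controller when the proposed acceleration would violate the barrier condition, so the RL policy remains in charge whenever it is already safe.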
This list is automatically generated from the titles and abstracts of the papers in this site.