Decision-Making under On-Ramp merge Scenarios by Distributional Soft
Actor-Critic Algorithm
- URL: http://arxiv.org/abs/2103.04535v1
- Date: Mon, 8 Mar 2021 03:57:32 GMT
- Title: Decision-Making under On-Ramp merge Scenarios by Distributional Soft
Actor-Critic Algorithm
- Authors: Yiting Kong, Yang Guan, Jingliang Duan, Shengbo Eben Li, Qi Sun,
Bingbing Nie
- Abstract summary: We propose an RL-based end-to-end decision-making method under a framework of offline training and online correction, called the Shielded Distributional Soft Actor-critic (SDSAC).
The results show that the SDSAC has the best safety performance compared to baseline algorithms and achieves efficient driving simultaneously.
- Score: 10.258474373022075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Merging into the highway from the on-ramp is an essential scenario for
automated driving. Decision-making in this scenario needs to balance safety and
efficiency to optimize a long-term objective, which is challenging due to its
dynamic, stochastic, and adversarial characteristics.
Rule-based methods often lead to conservative driving on this task, while
learning-based methods have difficulty meeting the safety requirements.
In this paper, we propose an RL-based end-to-end decision-making method under a
framework of offline training and online correction, called the Shielded
Distributional Soft Actor-critic (SDSAC). The SDSAC adopts the policy
evaluation with safety consideration and a safety shield parameterized with the
barrier function in its offline training and online correction, respectively.
These two measures support each other for better safety while not damaging the
efficiency performance severely. We verify the SDSAC on an on-ramp merge
scenario in simulation. The results show that the SDSAC has the best safety
performance compared to baseline algorithms and achieves efficient driving
simultaneously.
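The abstract describes a safety shield parameterized with a barrier function for online correction. The paper's actual shield is not reproduced here; as a minimal sketch of the general idea, the following filters a proposed RL action through a discrete-time control-barrier-function condition in a hypothetical one-dimensional car-following model (all dynamics, constants, and names are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

# Hypothetical 1-D car-following model: the state is the gap to the
# lead vehicle plus the relative speed; the action is the ego
# acceleration held over one time step. All constants are illustrative.
DT, D_MIN, ALPHA = 1.0, 2.0, 0.2

def h(gap):
    """Barrier function: positive iff the gap exceeds the safe minimum."""
    return gap - D_MIN

def next_gap(gap, rel_speed, accel):
    """Gap after one step (lead vehicle assumed to keep constant speed)."""
    return gap + (rel_speed - accel * DT) * DT

def shield(gap, rel_speed, proposed_accel, candidates):
    """Return the candidate action closest to the RL policy's proposal
    that satisfies the discrete-time CBF condition
    h(x_next) >= (1 - ALPHA) * h(x); brake hard if none does."""
    safe = [a for a in candidates
            if h(next_gap(gap, rel_speed, a)) >= (1 - ALPHA) * h(gap)]
    if not safe:
        return min(candidates)  # fall back to the strongest braking
    return min(safe, key=lambda a: abs(a - proposed_accel))
```

For example, with a 5 m gap, a 1 m/s closing speed, and a proposed acceleration of 2 m/s², this shield overrides the policy and returns a mild deceleration, illustrating how online correction can trade a little efficiency for safety.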
Related papers
- Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation [5.928213664340974]
This research addresses the safety issue in the control optimization problem of autonomous driving.
We propose a novel, model-based approach for policy optimization, utilizing a conditional Value-at-Risk based Soft Actor Critic.
Our method introduces a worst-case actor to guide safe exploration, ensuring rigorous adherence to safety requirements even in unpredictable scenarios.
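The risk measure named in this entry can be made concrete: the conditional Value-at-Risk at level alpha is the expected return over the worst alpha-fraction of outcomes. A minimal empirical sketch (not the paper's method, which learns CVaR inside a Soft Actor-Critic) is:

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Empirical Conditional Value-at-Risk: the mean of the worst
    alpha-fraction of sampled returns (lower tail, since a higher
    return is better)."""
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()
```

An actor that maximizes `cvar(returns, alpha)` instead of `np.mean(returns)` is penalized for rare catastrophic episodes, which is the intuition behind a worst-case actor guiding safe exploration.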
arXiv Detail & Related papers (2024-07-08T18:32:40Z)
- Towards Safe Load Balancing based on Control Barrier Functions and Deep Reinforcement Learning [0.691367883100748]
We propose a safe learning-based load balancing algorithm for Software Defined-Wide Area Networks (SD-WAN).
It is empowered by Deep Reinforcement Learning (DRL) combined with a Control Barrier Function (CBF).
We show that our approach delivers near-optimal Quality-of-Service (QoS) in terms of end-to-end delay while respecting safety requirements related to link capacity constraints.
arXiv Detail & Related papers (2024-01-10T19:43:12Z)
- CAT: Closed-loop Adversarial Training for Safe End-to-End Driving [54.60865656161679]
Closed-loop Adversarial Training (CAT) is a framework for safe end-to-end driving in autonomous vehicles.
CAT aims to continuously improve the safety of driving agents by training the agent on safety-critical scenarios.
CAT can effectively generate adversarial scenarios countering the agent being trained.
arXiv Detail & Related papers (2023-10-19T02:49:31Z)
- Safe Reinforcement Learning with Dual Robustness [10.455148541147796]
Reinforcement learning (RL) agents are vulnerable to adversarial disturbances.
We propose a systematic framework to unify safe RL and robust RL.
We also design a deep RL algorithm for practical implementation, called dually robust actor-critic (DRAC).
arXiv Detail & Related papers (2023-09-13T09:34:21Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violations in policy tasks in safe reinforcement learning.
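The log-barrier idea in this entry can be sketched on a toy problem: replace a constraint g(x) <= 0 with a penalty -eta * log(-g(x)) and run plain gradient descent on the combined objective. This is only a generic interior-point sketch on an assumed one-dimensional problem, not the LBSGD algorithm itself (which chooses the step size adaptively):

```python
def log_barrier_step(x, grad_f, g, grad_g, eta=0.1, lr=0.02):
    """One gradient step on the barrier objective
    f(x) - eta * log(-g(x)) for the problem min f(x) s.t. g(x) <= 0.
    Its gradient is grad_f(x) + eta * grad_g(x) / (-g(x))."""
    return x - lr * (grad_f(x) + eta * grad_g(x) / (-g(x)))

# Toy problem (assumed for illustration): minimize (x - 2)^2
# subject to x <= 1, starting from a strictly feasible point.
grad_f = lambda x: 2.0 * (x - 2.0)
g = lambda x: x - 1.0
grad_g = lambda x: 1.0

x = 0.0
for _ in range(200):
    x = log_barrier_step(x, grad_f, g, grad_g)
# x approaches the constrained optimum from inside the feasible
# region (x < 1): the barrier gradient grows without bound as the
# iterate nears the boundary, so the constraint is never violated
# during descent.
```

The iterate settles near the minimizer of the barrier objective, slightly inside the constraint boundary; shrinking eta moves it closer to the true constrained optimum at the cost of a steeper, harder-to-step barrier.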
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles [13.40143623056186]
This paper proposes a safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints.
A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely.
The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment.
arXiv Detail & Related papers (2021-12-18T10:45:31Z)
- Decision-making for Autonomous Vehicles on Highway: Deep Reinforcement Learning with Continuous Action Horizon [14.059728921828938]
This paper utilizes the deep reinforcement learning (DRL) method to address the continuous-horizon decision-making problem on the highway.
The running objective of the ego automated vehicle is to execute an efficient and smooth policy without collision.
The PPO-DRL-based decision-making strategy is evaluated from multiple perspectives, including optimality, learning efficiency, and adaptability.
arXiv Detail & Related papers (2020-08-26T22:49:27Z)
- Cautious Reinforcement Learning with Logical Constraints [78.96597639789279]
An adaptive safe padding forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm.
arXiv Detail & Related papers (2020-02-26T00:01:08Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information provided and is not responsible for any consequences arising from its use.