Related papers: Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving

Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving

URL: http://arxiv.org/abs/2403.18209v1
Date: Wed, 27 Mar 2024 02:41:52 GMT
Title: Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving
Authors: Xuemin Hu, Pan Chen, Yijun Wen, Bo Tang, Long Chen,
Abstract summary: Reinforcement learning (RL) has been widely used in decision-making tasks, but it cannot guarantee the agent's safety in the training process. We propose a novel algorithm based on the long and short-term constraints (LSTC) for safe RL. We develop a safe RL method with dual-constraint optimization based on the Lagrange multiplier to optimize the training process for end-to-end autonomous driving.
Score: 11.072917563013428
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement learning (RL) has been widely used in decision-making tasks, but it cannot guarantee the agent's safety in the training process due to the requirements of interaction with the environment, which seriously limits its industrial applications such as autonomous driving. Safe RL methods are developed to handle this issue by constraining the expected safety violation costs as a training objective, but they still permit unsafe state occurrence, which is unacceptable in autonomous driving tasks. Moreover, these methods are difficult to achieve a balance between the cost and return expectations, which leads to learning performance degradation for the algorithms. In this paper, we propose a novel algorithm based on the long and short-term constraints (LSTC) for safe RL. The short-term constraint aims to guarantee the short-term state safety that the vehicle explores, while the long-term constraint ensures the overall safety of the vehicle throughout the decision-making process. In addition, we develop a safe RL method with dual-constraint optimization based on the Lagrange multiplier to optimize the training process for end-to-end autonomous driving. Comprehensive experiments were conducted on the MetaDrive simulator. Experimental results demonstrate that the proposed method achieves higher safety in continuous state and action tasks, and exhibits higher exploration performance in long-distance decision-making tasks compared with state-of-the-art methods.

Related papers

Distributional Soft Actor-Critic with Harmonic Gradient for Safe and Efficient Autonomous Driving in Multi-lane Scenarios [16.23857092084669]
We propose a new safety-oriented training technique called harmonic policy iteration (HPI)<n>At each RL iteration, it first calculates two policy gradients associated with efficient driving and safety constraints, respectively.<n>A harmonic gradient is derived for policy updating, minimizing conflicts between the two gradients.<n>We adopt the state-of-the-art DSAC algorithm as the backbone and integrate it with our HPI to develop a new safe RL algorithm, DSAC-H.
arXiv Detail & Related papers (2025-05-18T11:35:57Z)
TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation [5.928213664340974]
This research addresses the safety issue in the control optimization problem of autonomous driving. We propose a novel, model-based approach for policy optimization, utilizing a conditional Value-at-Risk based Soft Actor Critic. Our method introduces a worst-case actor to guide safe exploration, ensuring rigorous adherence to safety requirements even in unpredictable scenarios.
arXiv Detail & Related papers (2024-07-08T18:32:40Z)
CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving [45.05135725542318]
IMitation and Reinforcement Learning (CIMRL) approach enables training driving policies in simulation through leveraging imitative motion priors and safety constraints. By combining RL and imitation, we demonstrate our method achieves state-of-the-art results in closed loop simulation and real world driving benchmarks.
arXiv Detail & Related papers (2024-06-13T07:31:29Z)
Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL) We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum. We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL) We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
Evaluation of Safety Constraints in Autonomous Navigation with Deep Reinforcement Learning [62.997667081978825]
We compare two learnable navigation policies: safe and unsafe. The safe policy takes the constraints into the account, while the other does not. We show that the safe policy is able to generate trajectories with more clearance (distance to the obstacles) and makes less collisions while training without sacrificing the overall performance.
arXiv Detail & Related papers (2023-07-27T01:04:57Z)
Towards Safe Autonomous Driving Policies using a Neuro-Symbolic Deep Reinforcement Learning Approach [6.961253535504979]
This paper introduces a novel neuro-symbolic model-free DRL approach, called DRL with Symbolic Logics (DRLSL) It combines the strengths of DRL (learning from experience) and symbolic first-order logics (knowledge-driven reasoning) to enable safe learning in real-time interactions of autonomous driving within real environments. We have implemented the DRLSL framework in autonomous driving using the highD dataset and demonstrated that our method successfully avoids unsafe actions during both the training and testing phases.
arXiv Detail & Related papers (2023-07-03T19:43:21Z)
Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent. Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control. The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL. We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
High-level Decisions from a Safe Maneuver Catalog with Reinforcement Learning for Safe and Cooperative Automated Merging [5.732271870257913]
We propose an efficient RL-based decision-making pipeline for safe and cooperative automated driving in merging scenarios. The proposed RL agent can efficiently identify cooperative drivers from their vehicle state history and generate interactive maneuvers.
arXiv Detail & Related papers (2021-07-15T15:49:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.