Related papers: Continual Driving Policy Optimization with Closed-Loop Individualized Curricula

Continual Driving Policy Optimization with Closed-Loop Individualized Curricula

URL: http://arxiv.org/abs/2309.14209v3
Date: Wed, 6 Mar 2024 07:55:17 GMT
Title: Continual Driving Policy Optimization with Closed-Loop Individualized Curricula
Authors: Haoyi Niu, Yizhou Xu, Xingjian Jiang, Jianming Hu
Abstract summary: We develop a continuous driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC) CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. We show that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios.
Score: 3.171483862183451
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The safety of autonomous vehicles (AV) has been a long-standing top concern, stemming from the absence of rare and safety-critical scenarios in the long-tail naturalistic driving distribution. To tackle this challenge, a surge of research in scenario-based autonomous driving has emerged, with a focus on generating high-risk driving scenarios and applying them to conduct safety-critical testing of AV models. However, limited work has been explored on the reuse of these extensive scenarios to iteratively improve AV models. Moreover, it remains intractable and challenging to filter through gigantic scenario libraries collected from other AV models with distinct behaviors, attempting to extract transferable information for current AV improvement. Therefore, we develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into a set of standardized sub-modules for flexible implementation choices: AV Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. Subsequently, by re-sampling from historical scenarios based on these failure probabilities, CLIC tailors individualized curricula for downstream training, aligning them with the evaluated capability of AV. Accordingly, CLIC not only maximizes the utilization of the vast pre-collected scenario library for closed-loop driving policy optimization but also facilitates AV improvement by individualizing its training with more challenging cases out of those poorly organized scenarios. Experimental results clearly indicate that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios, while still maintaining proficiency in handling simpler cases.

Related papers

From Failures to Fixes: LLM-Driven Scenario Repair for Self-Evolving Autonomous Driving [29.36624509719055]
We propose textbfSERA, a framework that enables autonomous driving systems to self-evolve by repairing failure cases through targeted scenario recommendation.<n>By analyzing performance logs, SERA identifies failure patterns and dynamically retrieves semantically aligned scenarios from a structured bank.<n>Experiments on the benchmark show that SERA consistently improves key metrics across multiple autonomous driving baselines, demonstrating its effectiveness and generalizability under safety-critical conditions.
arXiv Detail & Related papers (2025-05-28T07:46:19Z)
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift. Current approaches typically address this issue through online sampling from the target policy. We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models [1.6612510324510592]
CurricuVLM is a novel framework that enables personalized curriculum learning for autonomous driving agents. Our approach exploits Vision-Language Models (VLMs) to analyze agent behavior, identify performance weaknesses, and dynamically generate tailored training scenarios. CurricuVLM outperforms state-of-the-art baselines across both regular and safety-critical scenarios.
arXiv Detail & Related papers (2025-02-21T00:42:40Z)
Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving [65.61999354218628]
We take the first step toward designing black-box adversarial attacks specifically targeting vision-language models (VLMs) in autonomous driving systems.<n>We propose Cascading Adversarial Disruption (CAD), which targets low-level reasoning breakdown by generating and injecting semantics.<n>We present Risky Scene Induction, which addresses dynamic adaptation by leveraging a surrogate VLM to understand and construct high-level risky scenarios.
arXiv Detail & Related papers (2025-01-23T11:10:02Z)
CRASH: Challenging Reinforcement-Learning Based Adversarial Scenarios For Safety Hardening [16.305837225117607]
This paper introduces CRASH - Challenging Reinforcement-learning based Adversarial scenarios for Safety Hardening. First CRASH can control adversarial Non Player Character (NPC) agents in an AV simulator to automatically induce collisions with the Ego vehicle. We also propose a novel approach, that we term safety hardening, which iteratively refines the motion planner by simulating improvement scenarios against adversarial agents.
arXiv Detail & Related papers (2024-11-26T00:00:27Z)
Automated and Complete Generation of Traffic Scenarios at Road Junctions Using a Multi-level Danger Definition [2.5608506499175094]
We propose an approach to derive a complete set of (potentially dangerous) abstract scenarios at any given road junction. From these abstract scenarios, we derive exact paths that actors must follow to guide simulation-based testing. Results show that the AV-under-test is involved in increasing percentages of unsafe behaviors in simulation.
arXiv Detail & Related papers (2024-10-09T17:23:51Z)
FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality [13.240598841087841]
We introduce FREA, a novel safety-critical scenarios generation method that incorporates the Largest Feasible Region (LFR) of AV as guidance. Experiments illustrate that FREA can effectively generate safety-critical scenarios, yielding considerable near-miss events.
arXiv Detail & Related papers (2024-06-05T06:26:15Z)
Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications [73.58451824894568]
The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states. In safety-critical domains, such behaviors could lead to disastrous outcomes. We propose Objective Suppression, a novel method that adaptively suppresses the task reward maximizing objectives according to a safety critic.
arXiv Detail & Related papers (2024-02-23T23:22:06Z)
A novel framework for adaptive stress testing of autonomous vehicles in highways [3.2112502548606825]
We propose a novel framework to explore corner cases that can result in safety concerns in a highway traffic scenario. We develop a new reward function for DRL to guide the AST in identifying crash scenarios based on the collision probability estimate. The proposed framework is further integrated with a new driving model enabling us to create more realistic traffic scenarios.
arXiv Detail & Related papers (2024-02-19T04:02:40Z)
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework. Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations. We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
Stackelberg Driver Model for Continual Policy Improvement in Scenario-Based Closed-Loop Autonomous Driving [5.765939495779461]
adversarial generation methods have emerged as a class of efficient approaches to synthesize safety-critical scenarios. We tailor the Stackelberg Driver Model (SDM) to accurately characterize the hierarchical nature of vehicle interaction dynamics. Our algorithm exhibits superior performance compared to several baselines especially in higher dimensional scenarios.
arXiv Detail & Related papers (2023-09-25T15:47:07Z)
A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors. We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior [135.78858513845233]
STRIVE is a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions. To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE. A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful to improve the given planner.
arXiv Detail & Related papers (2021-12-09T18:03:27Z)
LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving [139.33800431159446]
LookOut is an approach to jointly perceive the environment and predict a diverse set of futures from sensor data. We show that our model demonstrates significantly more diverse and sample-efficient motion forecasting in a large-scale self-driving dataset.
arXiv Detail & Related papers (2021-01-16T23:19:22Z)
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment. We propose an uncertainty-aware planning method, called emphrobust imitative planning (RIP) Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes. We introduce an autonomous car novel-scene benchmark, textttCARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.