Related papers: From Failures to Fixes: LLM-Driven Scenario Repair for Self-Evolving Autonomous Driving

URL: http://arxiv.org/abs/2505.22067v1
Date: Wed, 28 May 2025 07:46:19 GMT
Title: From Failures to Fixes: LLM-Driven Scenario Repair for Self-Evolving Autonomous Driving
Authors: Xinyu Xia, Xingjun Ma, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong
Abstract summary: We propose SERA, a framework that enables autonomous driving systems to self-evolve by repairing failure cases through targeted scenario recommendation. By analyzing performance logs, SERA identifies failure patterns and dynamically retrieves semantically aligned scenarios from a structured bank. Experiments on the benchmark show that SERA consistently improves key metrics across multiple autonomous driving baselines, demonstrating its effectiveness and generalizability under safety-critical conditions.
Score: 29.36624509719055
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Ensuring robust and generalizable autonomous driving requires not only broad scenario coverage but also efficient repair of failure cases, particularly those related to challenging and safety-critical scenarios. However, existing scenario generation and selection methods often lack adaptivity and semantic relevance, limiting their impact on performance improvement. In this paper, we propose SERA, an LLM-powered framework that enables autonomous driving systems to self-evolve by repairing failure cases through targeted scenario recommendation. By analyzing performance logs, SERA identifies failure patterns and dynamically retrieves semantically aligned scenarios from a structured bank. An LLM-based reflection mechanism further refines these recommendations to maximize relevance and diversity. The selected scenarios are used for few-shot fine-tuning, enabling targeted adaptation with minimal data. Experiments on the benchmark show that SERA consistently improves key metrics across multiple autonomous driving baselines, demonstrating its effectiveness and generalizability under safety-critical conditions.
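
The retrieval step described in the abstract (derive a failure pattern from logs, then pick relevant yet diverse scenarios from a bank) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag-based log format, the Jaccard similarity, the greedy diversity penalty, and the names `failure_pattern`, `recommend`, and `diversity` are all assumptions made for illustration, and the LLM-based reflection and fine-tuning stages are omitted.

```python
from collections import Counter


def failure_pattern(logs, top_k=3):
    """Return the most frequent tags among failed episodes (hypothetical log format)."""
    counts = Counter(
        tag for episode in logs if not episode["success"] for tag in episode["tags"]
    )
    return {tag for tag, _ in counts.most_common(top_k)}


def jaccard(a, b):
    """Jaccard similarity between two tag sets; stands in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(pattern, bank, n=2, diversity=0.5):
    """Greedily pick scenarios: relevance to the pattern minus overlap with earlier picks."""
    chosen = []
    candidates = dict(bank)
    while candidates and len(chosen) < n:

        def score(name):
            relevance = jaccard(pattern, candidates[name])
            redundancy = max(
                (jaccard(bank[c], candidates[name]) for c in chosen), default=0.0
            )
            return relevance - diversity * redundancy

        # sorted() makes tie-breaking deterministic (alphabetical by scenario name)
        best = max(sorted(candidates), key=score)
        chosen.append(best)
        del candidates[best]
    return chosen


# Toy data, invented for this example only.
logs = [
    {"success": False, "tags": ["night", "pedestrian"]},
    {"success": False, "tags": ["night", "cut_in"]},
    {"success": True, "tags": ["day", "highway"]},
]
bank = {
    "s1": {"night", "pedestrian"},
    "s2": {"night", "pedestrian", "rain"},
    "s3": {"cut_in", "highway"},
    "s4": {"day", "roundabout"},
}

pattern = failure_pattern(logs)
picks = recommend(pattern, bank, n=2, diversity=0.5)
```

With the toy data, the diversity penalty makes the second pick a cut-in scenario rather than a near-duplicate of the first night-pedestrian scenario; setting `diversity=0` would rank purely by relevance.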

Related papers

AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework [29.10278896946722]
This paper introduces a novel LLM-agent based framework for augmenting real-world traffic scenarios using natural language descriptions. A key innovation is the use of an agentic design, enabling fine-grained control over the output.
arXiv Detail & Related papers (2025-07-18T08:20:16Z)
SEAL: Vision-Language Model-Based Safe End-to-End Cooperative Autonomous Driving with Adaptive Long-Tail Modeling [13.81210267833274]
SEAL is a vision-language model-based framework with adaptive multimodal learning for robust cooperative autonomous driving under long-tail scenarios. SEAL introduces three core innovations: (i) a prompt-driven long-tail scenario generation and evaluation pipeline that leverages foundation models to synthesize realistic long-tail conditions; (ii) a multi-scenario adaptive attention module that modulates the visual stream using scenario priors to recalibrate ambiguous or corrupted features; and (iii) a multi-task scenario-aware contrastive learning objective that improves multimodal alignment and promotes cross-scenario feature separability.
arXiv Detail & Related papers (2025-06-26T06:42:03Z)
MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR [59.83547898874152]
We introduce a sample-efficient, two-stage adaptation approach that integrates self-supervised learning with semi-supervised techniques. MSDA is designed to enhance the robustness and generalization of ASR models. We demonstrate that Meta PL can be applied effectively to ASR tasks, achieving state-of-the-art results.
arXiv Detail & Related papers (2025-05-30T14:46:05Z)
LADs: Leveraging LLMs for AI-Driven DevOps [3.240228178267042]
LADs is a principled approach to configuration optimization through in-depth analysis of what optimization works under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios.
arXiv Detail & Related papers (2025-02-28T08:12:08Z)
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios [6.681744368557208]
Large Language Models (LLMs), structured scenario parsing, and prompt engineering are used to generate safety-critical driving scenarios. We validate our approach using a 2D simulation framework and multiple pre-trained LLMs. We conclude that an LLM equipped with domain-informed prompting techniques can effectively evaluate and generate safety-critical driving scenarios.
arXiv Detail & Related papers (2025-02-04T09:19:13Z)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
Generating Out-Of-Distribution Scenarios Using Language Models [58.47597351184034]
Large Language Models (LLMs) have shown promise in autonomous driving. This paper introduces a framework for generating diverse Out-Of-Distribution (OOD) driving scenarios. We evaluate our framework through extensive simulations and introduce a new "OOD-ness" metric.
arXiv Detail & Related papers (2024-11-25T16:38:17Z)
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework. Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations. We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
Continual Driving Policy Optimization with Closed-Loop Individualized Curricula [2.903150959383393]
We develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC). CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. We show that CLIC surpasses other curriculum-based training strategies, with substantial improvement in managing risky scenarios.
arXiv Detail & Related papers (2023-09-25T15:14:54Z)
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent. Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior [135.78858513845233]
STRIVE is a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions. To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE. A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful to improve the given planner.
arXiv Detail & Related papers (2021-12-09T18:03:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.