Related papers: Combating Partial Perception Deficit in Autonomous Driving with Multimodal LLM Commonsense

Combating Partial Perception Deficit in Autonomous Driving with Multimodal LLM Commonsense

URL: http://arxiv.org/abs/2503.07020v1
Date: Mon, 10 Mar 2025 08:01:41 GMT
Title: Combating Partial Perception Deficit in Autonomous Driving with Multimodal LLM Commonsense
Authors: Yuting Hu, Chenhui Xu, Ruiyang Qin, Dancheng Liu, Amir Nassereldine, Yiyu Shi, Jinjun Xiong,
Abstract summary: LLM-RCO is a framework to integrate human-like driving commonsense into autonomous systems facing perception deficits.<n>DriveLM-Deficit is a dataset of 53,895 video clips featuring deficits of safety-critical objects.<n>Our results show that LLMs fine-tuned with DriveLM-Deficit can enable more proactive movements instead of conservative stops in the context of perception deficits.
Score: 19.797977882386736
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Partial perception deficits can compromise autonomous vehicle safety by disrupting environmental understanding. Current protocols typically respond with immediate stops or minimal-risk maneuvers, worsening traffic flow and lacking flexibility for rare driving scenarios. In this paper, we propose LLM-RCO, a framework leveraging large language models to integrate human-like driving commonsense into autonomous systems facing perception deficits. LLM-RCO features four key modules: hazard inference, short-term motion planner, action condition verifier, and safety constraint generator. These modules interact with the dynamic driving environment, enabling proactive and context-aware control actions to override the original control policy of autonomous agents. To improve safety in such challenging conditions, we construct DriveLM-Deficit, a dataset of 53,895 video clips featuring deficits of safety-critical objects, complete with annotations for LLM-based hazard inference and motion planning fine-tuning. Extensive experiments in adverse driving conditions with the CARLA simulator demonstrate that systems equipped with LLM-RCO significantly improve driving performance, highlighting its potential for enhancing autonomous driving resilience against adverse perception deficits. Our results also show that LLMs fine-tuned with DriveLM-Deficit can enable more proactive movements instead of conservative stops in the context of perception deficits.

Related papers

ROAD: Responsibility-Oriented Reward Design for Reinforcement Learning in Autonomous Driving [6.713954449470747]
This study introduces a responsibility-oriented reward function that explicitly incorporates traffic regulations into theReinforcement learning framework.<n>We introduce a Traffic Regulation Knowledge Graph and leveraged Vision-Language Models alongside Retrieval-Augmented Generation techniques to automate reward assignment.
arXiv Detail & Related papers (2025-05-30T08:00:51Z)
D4+: Emergent Adversarial Driving Maneuvers with Approximate Functional Optimization [3.763470738887407]
We implement a scenario-based framework with a formal method to identify the impact of malicious drivers interacting with autonomous vehicles.<n>Our results can help designers identify the range of safe operational behaviors that prevent malicious drivers from exploiting the autonomous features of modern vehicles.
arXiv Detail & Related papers (2025-05-20T05:22:03Z)
LD-Scene: LLM-Guided Diffusion for Controllable Generation of Adversarial Safety-Critical Driving Scenarios [3.6585028071015007]
LD-Scene is a novel framework that integrates Large Language Models (LLMs) with Latent Diffusion Models (LDMs) for user-controllable adversarial scenario generation through natural language.<n>Our approach comprises an LDM that captures realistic driving distributions and an LLM-based guidance module that translates user queries into adversarial loss functions.<n>Our framework provides fine-grained control over adversarial behaviors, thereby facilitating more effective testing tailored to specific driving scenarios.
arXiv Detail & Related papers (2025-05-16T13:41:05Z)
Dynamic Path Navigation for Motion Agents with LLM Reasoning [69.5875073447454]
Large Language Models (LLMs) have demonstrated strong generalizable reasoning and planning capabilities. We explore the zero-shot navigation and path generation capabilities of LLMs by constructing a dataset and proposing an evaluation protocol. We demonstrate that, when tasks are well-structured in this manner, modern LLMs exhibit substantial planning proficiency in avoiding obstacles while autonomously refining navigation with the generated motion to reach the target.
arXiv Detail & Related papers (2025-03-10T13:39:09Z)
Perceptual Motor Learning with Active Inference Framework for Robust Lateral Control [0.5437298646956507]
This paper presents a novel Perceptual Motor Learning framework integrated with Active Inference (AIF) to enhance lateral control in Highly Automated Vehicles (HAVs)<n>PML emphasizes the seamless integration of perception and action, enabling efficient decision-making in dynamic environments.<n>Our approach unifies deep learning with active inference principles, allowing HAVs to perform lane-keeping with minimal data and without extensive retraining across different environments.
arXiv Detail & Related papers (2025-03-03T15:49:18Z)
SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models [63.71984266104757]
Multimodal Large Language Models (MLLMs) can process both visual and textual data.<n>We propose SafeAuto, a novel framework that enhances MLLM-based autonomous driving systems by incorporating both unstructured and structured knowledge.
arXiv Detail & Related papers (2025-02-28T21:53:47Z)
TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy.<n>A self-attention mechanism then fuses these strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework [79.088116316919]
Connected Autonomous Vehicles (CAVs) have begun to open road testing around the world, but their safety and efficiency performance in complex scenarios is still not satisfactory. This paper proposes CoDrivingLLM, an interactive and learnable LLM-driven cooperative driving framework.
arXiv Detail & Related papers (2024-09-19T14:36:00Z)
Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving. Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z)
RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum. We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework. Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations. We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
Risk-anticipatory autonomous driving strategies considering vehicles' weights, based on hierarchical deep reinforcement learning [12.014977175887767]
This study develops an autonomous driving strategy based on risk anticipation, considering the weights of surrounding vehicles. A risk indicator integrating surrounding vehicles weights, based on the risk field theory, is proposed and incorporated into autonomous driving decisions. An indicator, potential collision energy in conflicts, is newly proposed to evaluate the performance of the developed AV driving strategy.
arXiv Detail & Related papers (2023-12-27T06:03:34Z)
Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems. LLMs are intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning. We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
arXiv Detail & Related papers (2023-11-28T03:13:09Z)
Learned Risk Metric Maps for Kinodynamic Systems [54.49871675894546]
We present Learned Risk Metric Maps for real-time estimation of coherent risk metrics of high dimensional dynamical systems. LRMM models are simple to design and train, requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator.
arXiv Detail & Related papers (2023-02-28T17:51:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.