INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation
- URL: http://arxiv.org/abs/2502.00262v2
- Date: Tue, 04 Feb 2025 03:28:23 GMT
- Title: INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation
- Authors: Dianwei Chen, Zifan Zhang, Yuchen Liu, Xianfeng Terry Yang
- Abstract summary: INSIGHT is a hierarchical vision-language model (VLM) framework designed to enhance hazard detection and edge-case evaluation.
By using multimodal data fusion, our approach integrates semantic and visual representations, enabling precise interpretation of driving scenarios.
Experimental results on the BDD100K dataset demonstrate a substantial improvement in hazard prediction straightforwardness and accuracy over existing models.
- Score: 7.362380225654904
- License:
- Abstract: Autonomous driving systems face significant challenges in handling unpredictable edge-case scenarios, such as adversarial pedestrian movements, dangerous vehicle maneuvers, and sudden environmental changes. Current end-to-end driving models struggle with generalization to these rare events due to limitations in traditional detection and prediction approaches. To address this, we propose INSIGHT (Integration of Semantic and Visual Inputs for Generalized Hazard Tracking), a hierarchical vision-language model (VLM) framework designed to enhance hazard detection and edge-case evaluation. By using multimodal data fusion, our approach integrates semantic and visual representations, enabling precise interpretation of driving scenarios and accurate forecasting of potential dangers. Through supervised fine-tuning of VLMs, we optimize spatial hazard localization using attention-based mechanisms and coordinate regression techniques. Experimental results on the BDD100K dataset demonstrate a substantial improvement in hazard prediction straightforwardness and accuracy over existing models, achieving a notable increase in generalization performance. This advancement enhances the robustness and safety of autonomous driving systems, ensuring improved situational awareness and potential decision-making in complex real-world scenarios.
Related papers
- Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving [65.61999354218628]
We take the first step toward designing black-box adversarial attacks specifically targeting vision-language models (VLMs) in autonomous driving systems.
We propose Cascading Adversarial Disruption (CAD), which targets low-level reasoning breakdown by generating and injecting semantics.
We present Risky Scene Induction, which addresses dynamic adaptation by leveraging a surrogate VLM to understand and construct high-level risky scenarios.
arXiv Detail & Related papers (2025-01-23T11:10:02Z)
- When, Where, and What? A Novel Benchmark for Accident Anticipation and Localization with Large Language Models [14.090582912396467]
This study introduces a novel framework that integrates Large Language Models (LLMs) to enhance predictive capabilities across multiple dimensions.
We develop an innovative chain-based attention mechanism that dynamically adjusts to prioritize high-risk elements within complex driving scenes.
Empirical validation on the DAD, CCD, and A3D datasets demonstrates superior performance in Average Precision (AP) and Mean Time-To-Accident (mTTA).
arXiv Detail & Related papers (2024-07-23T08:29:49Z)
- Risk-Aware Vehicle Trajectory Prediction Under Safety-Critical Scenarios [25.16311876790003]
This paper proposes a risk-aware trajectory prediction framework tailored to safety-critical scenarios.
We introduce a safety-critical trajectory prediction dataset and tailored evaluation metrics.
Results demonstrate the superior performance of our model, with a significant improvement in most metrics.
arXiv Detail & Related papers (2024-07-18T13:00:01Z)
- Towards Safe and Reliable Autonomous Driving: Dynamic Occupancy Set Prediction [12.336412741837407]
This study introduces a novel method for Dynamic Occupancy Set (DOS) prediction that effectively combines advanced trajectory prediction networks with a DOS prediction module.
The innovative contributions of this study include the development of a novel DOS prediction model specifically tailored for navigating complex scenarios.
arXiv Detail & Related papers (2024-02-29T17:36:39Z)
- SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z)
- Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems.
LLMs serve as intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning.
We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
arXiv Detail & Related papers (2023-11-28T03:13:09Z)
- Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding [51.8579160500354]
We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration.
Results show equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2023-03-17T00:28:33Z)
- USC: Uncompromising Spatial Constraints for Safety-Oriented 3D Object Detectors in Autonomous Driving [7.355977594790584]
We consider the safety-oriented performance of 3D object detectors in autonomous driving contexts.
We present uncompromising spatial constraints (USC), which characterize a simple yet important localization requirement.
We incorporate the quantitative measures into common loss functions to enable safety-oriented fine-tuning for existing models.
arXiv Detail & Related papers (2022-09-21T14:03:08Z)
- AdvDO: Realistic Adversarial Attacks for Trajectory Prediction [87.96767885419423]
Trajectory prediction is essential for autonomous vehicles to plan correct and safe driving behaviors.
We devise an optimization-based adversarial attack framework to generate realistic adversarial trajectories.
Our attack can lead an AV to drive off road or collide into other vehicles in simulation.
arXiv Detail & Related papers (2022-09-19T03:34:59Z)
- I Know You Can't See Me: Dynamic Occlusion-Aware Safety Validation of Strategic Planners for Autonomous Vehicles Using Hypergames [12.244501203346566]
We develop a novel multi-agent dynamic occlusion risk measure for assessing situational risk.
We present a white-box, scenario-based, accelerated safety validation framework for assessing safety of strategic planners in AV.
arXiv Detail & Related papers (2021-09-20T19:38:14Z)
- Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator.
We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project.
We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.