Related papers: AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis

URL: http://arxiv.org/abs/2401.03040v1
Date: Fri, 5 Jan 2024 19:33:21 GMT
Title: AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis
Authors: Kebin Wu and Wenbin Li and Xiaofei Xiao
Abstract summary: AccidentGPT is a foundation model of traffic accident analysis. It incorporates multi-modal input data to automatically reconstruct the accident process video with dynamics details.
Score: 3.8763079966791523
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Traffic accident analysis is pivotal for enhancing public safety and developing road regulations. Traditional approaches, although widely used, are often constrained by manual analysis processes, subjective decisions, uni-modal outputs, as well as privacy issues related to sensitive data. This paper introduces the idea of AccidentGPT, a foundation model of traffic accident analysis, which incorporates multi-modal input data to automatically reconstruct the accident process video with dynamics details, and furthermore provides multi-task analysis with multi-modal outputs. The design of the AccidentGPT is empowered with a multi-modality prompt with feedback for task-oriented adaptability, a hybrid training schema to leverage labelled and unlabelled data, and an edge-cloud split configuration for data privacy. To fully realize the functionalities of this model, we propose several research opportunities. This paper serves as the stepping stone to fill the gaps in traditional approaches of traffic accident analysis and attract the research community's attention for automatic, objective, and privacy-preserving traffic accident analysis.

Related papers

CrashAgent: Crash Scenario Generation via Multi-modal Reasoning [34.42773413989066]
We introduce CrashAgent, a framework designed to interpret multi-modal real-world traffic crash reports. We evaluate the generated crash scenarios from multiple perspectives, including the accuracy of layout reconstruction, collision rate, and diversity. The resulting high-quality and large-scale crash dataset will be publicly available to support the development of safe driving algorithms.
arXiv Detail & Related papers (2025-05-23T19:55:32Z)
Deep Learning Advances in Vision-Based Traffic Accident Anticipation: A Comprehensive Review of Methods, Datasets, and Future Directions [10.3325464784641]
Vision-based traffic accident anticipation (Vision-TAA) has emerged as a promising approach in the era of deep learning. This paper reviews 147 recent studies, focusing on the application of supervised, unsupervised, and hybrid learning models for accident prediction.
arXiv Detail & Related papers (2025-05-12T14:34:22Z)
AVD2: Accident Video Diffusion for Accident Video Description [11.221276595088215]
We introduce AVD2 (Accident Video Diffusion for Accident Video Description), a novel framework that enhances accident scene understanding. The framework generates accident videos that align with detailed natural language descriptions and reasoning, resulting in the EMM-AU dataset. Empirical results reveal that the integration of the EMM-AU dataset establishes state-of-the-art performance across both automated metrics and human evaluations.
arXiv Detail & Related papers (2025-02-20T18:22:44Z)
When language and vision meet road safety: leveraging multimodal large language models for video-based traffic accident analysis [6.213279061986497]
SeeUnsafe is a framework that transforms video-based traffic accident analysis into a more interactive, conversational approach. Our framework employs a multimodal-based aggregation strategy to handle videos of various lengths and generate structured responses for review and evaluation. We conduct extensive experiments on the Toyota Woven Traffic Safety dataset, demonstrating that SeeUnsafe effectively performs accident-aware video classification and visual grounding.
arXiv Detail & Related papers (2025-01-17T23:35:34Z)
On-Road Object Importance Estimation: A New Dataset and A Model with Multi-Fold Top-Down Guidance [70.80612792049315]
This paper contributes a new large-scale dataset named Traffic Object Importance (TOI). It proposes a model that integrates multi-fold top-down guidance with the bottom-up feature. Our model outperforms state-of-the-art methods by large margins.
arXiv Detail & Related papers (2024-11-26T06:37:10Z)
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events [5.233512464561313]
Multimodal Large Language Models (MLLMs) offer a novel approach by integrating textual, visual, and audio modalities. Our framework leverages the reasoning power of MLLMs, directing their output through context-specific prompts. Preliminary results demonstrate the framework's potential in zero-shot learning and accurate scenario analysis.
arXiv Detail & Related papers (2024-06-19T23:50:41Z)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports. We further formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios. We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
AccidentGPT: Accident Analysis and Prevention from V2X Environmental Perception with Multi-modal Large Model [32.14950866838055]
AccidentGPT is a comprehensive accident analysis and prevention multi-modal large model. For autonomous driving vehicles, we provide comprehensive environmental perception and understanding to control the vehicle and avoid collisions. For human-driven vehicles, we offer proactive long-range safety warnings and blind-spot alerts. Our framework supports intelligent and real-time analysis of traffic safety, encompassing pedestrians, vehicles, roads, and the environment.
arXiv Detail & Related papers (2023-12-20T16:19:47Z)
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
DRUformer: Enhancing the driving scene Important object detection with driving relationship self-understanding [50.81809690183755]
Traffic accidents frequently lead to fatal injuries, contributing to over 50 million deaths until 2023. Previous research primarily assessed the importance of individual participants, treating them as independent entities. We introduce Driving scene Relationship self-Understanding transformer (DRUformer) to enhance the important object detection task.
arXiv Detail & Related papers (2023-11-11T07:26:47Z)
Augmented Driver Behavior Models for High-Fidelity Simulation Study of Crash Detection Algorithms [2.064612766965483]
We present a simulation platform for a hybrid transportation system that includes both human-driven and automated vehicles. We decompose the human driving task and offer a modular approach to simulating a large-scale traffic scenario. We analyze a large driving dataset to extract expressive parameters that would best describe different driving characteristics.
arXiv Detail & Related papers (2022-08-10T19:59:16Z)
Deep Learning Serves Traffic Safety Analysis: A Forward-looking Review [4.228522109021283]
We present a typical processing pipeline, which can be used to understand and interpret traffic videos. This processing framework includes several steps, including video enhancement, video stabilization, semantic and incident segmentation, object detection and classification, trajectory extraction, speed estimation, event analysis, modeling and anomaly detection.
arXiv Detail & Related papers (2022-03-07T17:21:07Z)
A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents. We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z)
Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline [85.9210953301628]
Control of traffic signals is fundamental and critical to alleviate traffic congestion in urban areas. Because of the high complexity of modelling the problem, experimental settings of current works are often inconsistent. We propose a novel and strong baseline model based on deep reinforcement learning with the encoder-decoder structure.
arXiv Detail & Related papers (2021-01-24T03:55:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.