Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
- URL: http://arxiv.org/abs/2406.10789v1
- Date: Sun, 16 Jun 2024 03:10:16 GMT
- Title: Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
- Authors: Zhiwen Fan, Pu Wang, Yang Zhao, Yibo Zhao, Boris Ivanovic, Zhangyang Wang, Marco Pavone, Hao Frank Yang
- Abstract summary: We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
- Score: 76.59021017301127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing rate of road accidents worldwide not only results in significant loss of life but also imposes financial burdens in the billions on societies. Current research in traffic crash frequency modeling and analysis has predominantly approached the problem as a classification task, focusing mainly on learning-based classification or ensemble learning methods. These approaches often overlook the intricate relationships among the complex infrastructure, environmental, human, and contextual factors related to traffic crashes and risky situations. In contrast, we first propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports and incorporating infrastructure data together with environmental and traffic textual and visual information from Washington State. Leveraging this rich dataset, we formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes, such as crash types, severity, and number of injuries, based on contextual and environmental factors. The proposed model, CrashLLM, distinguishes itself from existing solutions by leveraging the inherent text reasoning capabilities of LLMs to parse and learn from complex, unstructured data, thereby enabling a more nuanced analysis of contributing factors. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes, with the average F1 score boosted from 34.9% to 53.8%. Furthermore, CrashLLM can provide valuable insights for numerous open-world what-if situational-awareness traffic safety analyses with learned reasoning features, which existing models cannot offer. We make our benchmark, datasets, and model publicly available for further exploration.
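The abstract describes casting crash prediction as a text reasoning problem and running what-if analyses. A minimal sketch of that idea follows, assuming a structured record is serialized into a natural-language prompt and a what-if variant is produced by changing one factor. The `CrashRecord` fields, the prompt wording, and the helper names are hypothetical illustrations, not the CrashEvent schema or the CrashLLM prompt format, and no model is called here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CrashRecord:
    # Hypothetical fields chosen for illustration only; the actual
    # CrashEvent schema is not specified in this summary.
    road_type: str
    weather: str
    lighting: str
    speed_limit_mph: int


def record_to_prompt(record: CrashRecord) -> str:
    """Serialize a structured record into a text question for an LLM."""
    context = (
        f"A crash occurred on a {record.road_type} with a posted speed "
        f"limit of {record.speed_limit_mph} mph. "
        f"Weather: {record.weather}. Lighting: {record.lighting}."
    )
    question = (
        "Predict the crash type, the severity, and the number of injuries."
    )
    return f"{context}\n{question}"


def what_if(record: CrashRecord, **changes) -> CrashRecord:
    """Return a counterfactual copy of the record with some factors changed."""
    return replace(record, **changes)


if __name__ == "__main__":
    observed = CrashRecord("two-lane rural highway", "rain", "dark", 55)
    counterfactual = what_if(observed, weather="clear")
    # Both prompts could be sent to a fine-tuned model and the predicted
    # outcomes compared; that step is omitted in this sketch.
    print(record_to_prompt(observed))
    print(record_to_prompt(counterfactual))
```

Comparing the model's answers for the observed and counterfactual prompts is one simple way to frame a what-if question; whether the paper's analyses are constructed this way is an assumption.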
Related papers
- An Explainable Machine Learning Approach to Traffic Accident Fatality Prediction [0.02730969268472861]
Road traffic accidents pose a significant public health threat worldwide.
This study presents a machine learning-based approach for classifying fatal and non-fatal road accident outcomes.
arXiv Detail & Related papers (2024-09-18T12:41:56Z)
- Exploring Traffic Crash Narratives in Jordan Using Text Mining Analytics [4.465427147188149]
This study collected crash data from five major freeways in Jordan that cover narratives of 7,587 records from 2018-2022.
An unsupervised learning method was adopted to learn the pattern from crash data.
Results show that text mining analytics is a promising method and underscore the multifactorial nature of traffic crashes.
arXiv Detail & Related papers (2024-06-11T20:07:39Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis [3.8763079966791523]
AccidentGPT is a foundation model of traffic accident analysis.
It incorporates multi-modal input data to automatically reconstruct the accident process video with dynamics details.
arXiv Detail & Related papers (2024-01-05T19:33:21Z)
- Exploring Factors Affecting Pedestrian Crash Severity Using TabNet: A Deep Learning Approach [0.0]
This study presents the first investigation of pedestrian crash severity using the TabNet model.
Through the application of TabNet to a comprehensive dataset from Utah covering the years 2010 to 2022, we uncover intricate factors contributing to pedestrian crash severity.
arXiv Detail & Related papers (2023-11-29T19:44:52Z)
- A Study of Situational Reasoning for Traffic Understanding [63.45021731775964]
We devise three novel text-based tasks for situational reasoning in the traffic domain.
We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work.
We provide in-depth analyses of model performance on data partitions and examine model predictions categorically.
arXiv Detail & Related papers (2023-06-05T01:01:12Z)
- Predicting Seriousness of Injury in a Traffic Accident: A New Imbalanced Dataset and Benchmark [62.997667081978825]
The paper introduces a new dataset to assess the performance of machine learning algorithms in the prediction of the seriousness of injury in a traffic accident.
The dataset is created by aggregating publicly available datasets from the UK Department for Transport.
arXiv Detail & Related papers (2022-05-20T21:15:26Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Crash Report Data Analysis for Creating Scenario-Wise, Spatio-Temporal Attention Guidance to Support Computer Vision-based Perception of Fatal Crash Risks [8.34084323253809]
This paper develops a data analytics model, named scenario-wise, spatio-temporal attention guidance, from fatal crash report data.
It estimates the relevance of detected objects to fatal crashes from their environment and context information.
The paper shows how the developed attention guidance supports the design and implementation of a preliminary CV model.
arXiv Detail & Related papers (2021-09-06T19:43:37Z)
- A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents.
We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z)