Enhancing Traffic Accident Classifications: Application of NLP Methods for City Safety
- URL: http://arxiv.org/abs/2506.12092v1
- Date: Wed, 11 Jun 2025 14:50:49 GMT
- Title: Enhancing Traffic Accident Classifications: Application of NLP Methods for City Safety
- Authors: Enes Özeren, Alexander Ulbrich, Sascha Filimon, David Rügamer, Andreas Bender
- Abstract summary: We analyze traffic incidents in Munich to identify patterns and characteristics that distinguish different types of accidents. The dataset consists of both structured tabular features, such as location, time, and weather conditions, as well as unstructured free-text descriptions detailing the circumstances of each accident. To assess the reliability of labels, we apply NLP methods, including topic modeling and few-shot learning, which reveal inconsistencies in the labeling process.
- Score: 41.76653295869846
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A comprehensive understanding of traffic accidents is essential for improving city safety and informing policy decisions. In this study, we analyze traffic incidents in Munich to identify patterns and characteristics that distinguish different types of accidents. The dataset consists of both structured tabular features, such as location, time, and weather conditions, as well as unstructured free-text descriptions detailing the circumstances of each accident. Each incident is categorized into one of seven predefined classes. To assess the reliability of these labels, we apply NLP methods, including topic modeling and few-shot learning, which reveal inconsistencies in the labeling process. These findings highlight potential ambiguities in accident classification and motivate a refined predictive approach. Building on these insights, we develop a classification model that achieves high accuracy in assigning accidents to their respective categories. Our results demonstrate that textual descriptions contain the most informative features for classification, while the inclusion of tabular data provides only marginal improvements. These findings emphasize the critical role of free-text data in accident analysis and highlight the potential of transformer-based models in improving classification reliability.
Related papers
- Towards Reliable and Interpretable Traffic Crash Pattern Prediction and Safety Interventions Using Customized Large Language Models [14.53510262691888]
TrafficSafe is a framework that adapts LLMs to reframe crash prediction and feature attribution as text-level reasoning.
Alcohol-impaired driving is the leading factor in severe crashes.
TrafficSafe highlights pivotal features during model training, guiding strategic improvements to crash data collection.
(2025-05-18T21:02:30Z)
- Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection [52.490375806093745]
The objective of few-shot object detection (FSOD) is to detect novel objects with few training samples.
We introduce the side information to alleviate the negative influences derived from the feature space and sample viewpoints.
Our model outperforms the previous state-of-the-art methods, significantly improving the ability of FSOD in most shots/splits.
(2025-04-09T17:24:05Z)
- Natural Language Processing and Deep Learning Models to Classify Phase of Flight in Aviation Safety Occurrences [14.379311972506791]
Researchers applied natural language processing (NLP) and artificial intelligence (AI) models to process text narratives to classify the flight phases of safety occurrences.
The classification performance of two deep learning models, ResNet and sRNN, was evaluated using an initial dataset of 27,000 safety occurrence reports from the NTSB.
(2025-01-11T15:02:49Z)
- Feature Group Tabular Transformer: A Novel Approach to Traffic Crash Modeling and Causality Analysis [0.40964539027092917]
This study introduces a novel approach to predicting collision types by utilizing a comprehensive dataset fused from multiple sources.
Central to our approach is the development of a Feature Group Tabular Transformer (FGTT) model, which organizes disparate data into meaningful feature groups.
The FGTT model is benchmarked against widely used tree ensemble models, including Random Forest, XGBoost, and CatBoost, demonstrating superior predictive performance.
(2024-12-06T20:47:13Z)
- Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
(2024-06-16T03:10:16Z)
- Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models [93.08860674071636]
We show how malicious actors can subtly manipulate the structure of almost any task-specific dataset to foster dangerous model behaviors.
We propose a novel mitigation strategy that mixes in safety data which mimics the task format and prompting style of the user data.
(2024-06-12T18:33:11Z)
- Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports [18.005377921658308]
This article argues that we can identify the events more accurately by leveraging the event taxonomy.
We achieve this hierarchical classification task by incorporating a novel hierarchical attention module into BERT.
It has been shown that fine-level prediction accuracy is highly improved, and the regularization term can be beneficial to the rare event identification problem.
(2024-03-26T17:51:06Z)
- PatchMix Augmentation to Identify Causal Features in Few-shot Learning [55.64873998196191]
Few-shot learning aims to transfer knowledge learned from base categories with sufficient labelled data to novel categories with scarce known information.
We propose a novel data augmentation strategy, dubbed PatchMix, that can break this spurious dependency.
We show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.
(2022-11-29T08:41:29Z)
- A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents.
We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
(2021-02-12T18:17:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.