Domain-Adapted Pre-trained Language Models for Implicit Information Extraction in Crash Narratives
- URL: http://arxiv.org/abs/2510.09434v1
- Date: Fri, 10 Oct 2025 14:45:07 GMT
- Title: Domain-Adapted Pre-trained Language Models for Implicit Information Extraction in Crash Narratives
- Authors: Xixi Wang, Jordanka Kovaceva, Miguel Costa, Shuai Wang, Francisco Camara Pereira, Robert Thomson,
- Abstract summary: We study whether compact open-source language models can support reasoning-intensive extraction from crash narratives. To bridge domain gaps, we inject task-specific knowledge via fine-tuning, applying Low-Rank Adaptation (LoRA) to LLMs and fine-tuning BERT. Further analysis reveals that the fine-tuned PLMs capture richer narrative details and even correct some mislabeled annotations in the dataset.
- Score: 6.91741018994547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Free-text crash narratives recorded in real-world crash databases have been shown to play a significant role in improving traffic safety. However, large-scale analyses remain difficult to implement, as there are no documented tools that can batch-process the unstructured, non-standardized text written by various authors with diverse experience and attention to detail. In recent years, Transformer-based pre-trained language models (PLMs), such as Bidirectional Encoder Representations from Transformers (BERT) and large language models (LLMs), have demonstrated strong capabilities across various natural language processing tasks. These models can extract explicit facts from crash narratives, but their performance declines on inference-heavy tasks such as Crash Type identification, which can involve nearly 100 categories. Moreover, relying on closed LLMs through external APIs raises privacy concerns for sensitive crash data, and these black-box tools often underperform due to limited domain knowledge. Motivated by these challenges, we study whether compact open-source PLMs can support reasoning-intensive extraction from crash narratives. We target two challenging objectives: 1) identifying the Manner of Collision for a crash, and 2) identifying the Crash Type for each vehicle involved in the crash event, from real-world crash narratives. To bridge domain gaps, we apply fine-tuning techniques to inject task-specific knowledge into LLMs with Low-Rank Adaptation (LoRA) and into BERT. Experiments on the authoritative real-world Crash Investigation Sampling System (CISS) dataset demonstrate that our fine-tuned compact models outperform strong closed LLMs, such as GPT-4o, while requiring only minimal training resources. Further analysis reveals that the fine-tuned PLMs capture richer narrative details and even correct some mislabeled annotations in the dataset.
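The Low-Rank Adaptation mentioned in the abstract freezes the pre-trained weights and trains only a small low-rank update. The following is a minimal numerical sketch of that core idea, not the paper's implementation; the dimensions, rank, and scaling factor are illustrative assumptions:

```python
import numpy as np

# LoRA replaces a full weight update with a low-rank one:
# W_eff = W + (alpha / r) * B @ A, where rank r << min(d_out, d_in).
# W stays frozen; only A and B are trained.
d_out, d_in, r = 768, 768, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x, alpha=16):
    # With B initialized to zero, B @ A = 0, so the adapted layer
    # starts out exactly equal to the frozen base layer.
    return (W + (alpha / r) * B @ A) @ x

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical at initialization

full = d_out * d_in          # parameters updated by full fine-tuning
lora = r * (d_in + d_out)    # parameters updated by LoRA
print(f"trainable params: LoRA {lora} vs full fine-tuning {full}")
```

In practice this update is applied to selected projection matrices inside the Transformer (e.g. via libraries such as Hugging Face PEFT), which is why LoRA fine-tuning needs only a fraction of the training resources of full fine-tuning, consistent with the "minimal training resources" claim above.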
Related papers
- CrashChat: A Multimodal Large Language Model for Multitask Traffic Crash Video Analysis [8.067631013051855]
This paper proposes CrashChat, a multimodal large language model (MLLM) for multitask traffic crash analysis. CrashChat acquires domain-specific knowledge through instruction fine-tuning and employs a novel multitask learning strategy. Numerical experiments on consolidated public datasets demonstrate that CrashChat consistently outperforms existing MLLMs.
arXiv Detail & Related papers (2025-12-21T20:39:31Z) - Hey, That's My Data! Label-Only Dataset Inference in Large Language Models [63.35066172530291]
CatShift is a label-only dataset-inference framework. It capitalizes on catastrophic forgetting: the tendency of an LLM to overwrite previously learned knowledge when exposed to new data.
arXiv Detail & Related papers (2025-06-06T13:02:59Z) - Towards Reliable and Interpretable Traffic Crash Pattern Prediction and Safety Interventions Using Customized Large Language Models [14.53510262691888]
TrafficSafe is a framework that adapts LLMs to reframe crash prediction and feature attribution as text-level reasoning. Alcohol-impaired driving is the leading factor in severe crashes. TrafficSafe highlights pivotal features during model training, guiding strategic improvements to crash data collection.
arXiv Detail & Related papers (2025-05-18T21:02:30Z) - CrashSage: A Large Language Model-Centered Framework for Contextual and Interpretable Traffic Crash Analysis [0.46040036610482665]
Road crashes claim over 1.3 million lives annually worldwide and incur global economic losses exceeding $1.8 trillion. This study presents CrashSage, a novel Large Language Model (LLM)-centered framework designed to advance crash analysis and modeling through four key innovations.
arXiv Detail & Related papers (2025-05-08T00:23:18Z) - Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? [83.53005932513155]
Multi-modal large language models (MLLMs) have made significant progress, yet their safety alignment remains limited. We propose fine-tuning MLLMs on a small set of benign instruction-following data with responses replaced by simple, clear rejection sentences.
arXiv Detail & Related papers (2025-04-14T09:03:51Z) - New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration [49.180693704510006]
Referring Expression Comprehension (REC) is a cross-modal task that evaluates the interplay of language understanding, image comprehension, and language-to-image grounding. It serves as an essential testing ground for Multimodal Large Language Models (MLLMs).
arXiv Detail & Related papers (2025-02-27T13:58:44Z) - Evaluating Robustness of LLMs on Crisis-Related Microblogs across Events, Information Types, and Linguistic Features [15.844270609527848]
Microblogging platforms like X provide real-time information to governments during disasters. Traditionally, supervised machine learning models have been used, but they lack generalizability. This paper provides a detailed analysis of the performance of six well-known Large Language Models (LLMs) in processing disaster-related social media data.
arXiv Detail & Related papers (2024-12-08T10:30:29Z) - Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We further formulate the crash event feature learning as a novel text reasoning problem and further fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z) - MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z) - ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks [91.55895047448249]
This paper presents ReEval, an LLM-based framework using prompt chaining to perturb the original evidence for generating new test cases.
We implement ReEval using ChatGPT and evaluate the resulting variants of two popular open-domain QA datasets.
Our generated data is human-readable and useful to trigger hallucination in large language models.
arXiv Detail & Related papers (2023-10-19T06:37:32Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models on the BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.