Topic Modeling Analysis of Aviation Accident Reports: A Comparative
Study between LDA and NMF Models
- URL: http://arxiv.org/abs/2403.04788v1
- Date: Mon, 4 Mar 2024 01:41:07 GMT
- Title: Topic Modeling Analysis of Aviation Accident Reports: A Comparative
Study between LDA and NMF Models
- Authors: Aziida Nanyonga, Hassan Wasswa and Graham Wild
- Abstract summary: This paper compares two prominent topic modeling techniques, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF).
LDA demonstrated higher topic coherence, indicating stronger semantic relevance among words within topics.
NMF excelled in producing distinct and granular topics, enabling a more focused analysis of specific aspects of aviation accidents.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Aviation safety is paramount in the modern world, with a continuous
commitment to reducing accidents and improving safety standards. Central to
this endeavor is the analysis of aviation accident reports, rich textual
resources that hold insights into the causes and contributing factors behind
aviation mishaps. This paper compares two prominent topic modeling techniques,
Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF),
in the context of aviation accident report analysis. The study leverages the
National Transportation Safety Board (NTSB) Dataset with the primary objective
of automating and streamlining the process of identifying latent themes and
patterns within accident reports. The Coherence Value (C_v) metric was used to
evaluate the quality of generated topics. LDA demonstrated higher topic
coherence, indicating stronger semantic relevance among words within topics. At
the same time, NMF excelled in producing distinct and granular topics, enabling
a more focused analysis of specific aspects of aviation accidents.
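
To make the idea of topic coherence concrete, the sketch below scores a topic's top words by how often they co-occur across documents, using normalized pointwise mutual information (NPMI). This is a simplified stand-in, not the C_v metric used in the paper: C_v additionally relies on a sliding window and on cosine similarity between NPMI vectors. The function name `npmi_coherence` and the toy documents are hypothetical and chosen only for illustration; they are not drawn from the NTSB dataset.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Mean NPMI over all word pairs of a topic (document-level co-occurrence).

    Simplified illustration only; this is not the full C_v metric.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
        elif p12 == 1.0:
            scores.append(1.0)   # words co-occur in every document
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical, hand-written token lists standing in for accident narratives.
toy_docs = [
    ["engine", "failure", "fuel", "loss", "power"],
    ["engine", "fuel", "exhaustion", "forced", "landing"],
    ["pilot", "weather", "visibility", "fog", "terrain"],
    ["weather", "fog", "visibility", "approach", "pilot"],
    ["engine", "power", "loss", "fuel"],
    ["runway", "landing", "gear", "collapse"],
]

# A topic whose words tend to appear together, and one that mixes themes.
coherent_score = npmi_coherence(["engine", "fuel", "power"], toy_docs)
mixed_score = npmi_coherence(["engine", "fog", "gear"], toy_docs)

print(f"coherent topic: {coherent_score:.3f}")
print(f"mixed topic:    {mixed_score:.3f}")
```

On this toy corpus the thematically consistent topic scores higher than the mixed one, which is the behavior a coherence metric is meant to capture when comparing topics produced by models such as LDA and NMF.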
Related papers
- Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports.
We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes.
Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
- AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis [3.8763079966791523]
AccidentGPT is a foundation model of traffic accident analysis.
It incorporates multi-modal input data to automatically reconstruct the accident process video with dynamics details.
arXiv Detail & Related papers (2024-01-05T19:33:21Z)
- DRUformer: Enhancing the driving scene Important object detection with driving relationship self-understanding [50.81809690183755]
Traffic accidents frequently lead to fatal injuries, contributing to over 50 million deaths until 2023.
Previous research primarily assessed the importance of individual participants, treating them as independent entities.
We introduce Driving scene Relationship self-Understanding transformer (DRUformer) to enhance the important object detection task.
arXiv Detail & Related papers (2023-11-11T07:26:47Z)
- Aviation Safety Risk Analysis and Flight Technology Assessment Issues [0.0]
This paper focuses on two main areas: analyzing exceedance events and statistically evaluating non-exceedance data.
The proposed solutions involve data preprocessing, reliability assessment, quantifying flight control using neural networks, exploratory data analysis, and establishing real-time automated warnings.
arXiv Detail & Related papers (2023-08-10T14:13:49Z)
- A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors.
We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)
- Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
- LaMDA: Language Models for Dialog Applications [75.75051929981933]
LaMDA is a family of Transformer-based neural language models specialized for dialog.
Fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements.
arXiv Detail & Related papers (2022-01-20T15:44:37Z)
- Crash Report Data Analysis for Creating Scenario-Wise, Spatio-Temporal Attention Guidance to Support Computer Vision-based Perception of Fatal Crash Risks [8.34084323253809]
This paper develops a data analytics model, named scenario-wise, spatio-temporal attention guidance, from fatal crash report data.
It estimates the relevance of detected objects to fatal crashes from their environment and context information.
The paper shows how the developed attention guidance supports the design and implementation of a preliminary CV model.
arXiv Detail & Related papers (2021-09-06T19:43:37Z)
- A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents.
We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z)
- Discovering Airline-Specific Business Intelligence from Online Passenger Reviews: An Unsupervised Text Analytics Approach [3.2872586139884623]
Airlines can capitalize on the abundantly available online customer reviews (OCR).
This paper aims to discover company- and competitor-specific intelligence from OCR using an unsupervised text analytics approach.
A case study involving 99,147 airline reviews of a US-based target carrier and four of its competitors is used to validate the proposed approach.
arXiv Detail & Related papers (2020-12-14T23:09:10Z)