Related papers: Topic Modeling Analysis of Aviation Accident Reports: A Comparative Study between LDA and NMF Models

URL: http://arxiv.org/abs/2403.04788v1
Date: Mon, 4 Mar 2024 01:41:07 GMT
Title: Topic Modeling Analysis of Aviation Accident Reports: A Comparative Study between LDA and NMF Models
Authors: Aziida Nanyonga, Hassan Wasswa and Graham Wild
Abstract summary: This paper compares two prominent topic modeling techniques, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), for analyzing aviation accident reports. LDA demonstrates higher topic coherence, indicating stronger semantic relevance among words within topics. NMF excels at producing distinct and granular topics, enabling a more focused analysis of specific aspects of aviation accidents.
Score: 0.0
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Aviation safety is paramount in the modern world, with a continuous commitment to reducing accidents and improving safety standards. Central to this endeavor is the analysis of aviation accident reports, rich textual resources that hold insights into the causes and contributing factors behind aviation mishaps. This paper compares two prominent topic modeling techniques, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), in the context of aviation accident report analysis. The study leverages the National Transportation Safety Board (NTSB) Dataset with the primary objective of automating and streamlining the process of identifying latent themes and patterns within accident reports. The Coherence Value (C_v) metric was used to evaluate the quality of generated topics. LDA demonstrates higher topic coherence, indicating stronger semantic relevance among words within topics. At the same time, NMF excelled in producing distinct and granular topics, enabling a more focused analysis of specific aspects of aviation accidents.
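To make the idea of topic coherence concrete, here is a minimal sketch of a simplified coherence score. It is not the C_v metric used in the paper: C_v is usually described as combining sliding-window co-occurrence counts, NPMI-based context vectors, and cosine similarity, whereas this sketch only averages pairwise NPMI over whole-document co-occurrence. The function name `npmi_coherence` and the toy documents are assumptions made for illustration and do not come from the paper or the NTSB dataset.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Mean NPMI over all word pairs of a topic (needs at least two words).

    Simplified stand-in for illustration only; this is NOT the C_v metric.
    Co-occurrence is counted at the level of whole documents.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0.0:
            scores.append(-1.0)  # the pair never co-occurs: lowest NPMI
        elif p12 == 1.0:
            scores.append(0.0)   # NPMI is 0/0 here; scoring it 0 is an assumption
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical, hand-made token lists; not taken from the NTSB dataset.
toy_docs = [
    ["engine", "power", "loss", "landing"],
    ["engine", "fuel", "power", "failure"],
    ["wind", "runway", "landing", "gust"],
    ["wind", "gust", "runway", "control"],
]

coherent_topic = ["engine", "power"]  # words that always appear together
mixed_topic = ["engine", "gust"]      # words that never appear together

print(npmi_coherence(coherent_topic, toy_docs))
print(npmi_coherence(mixed_topic, toy_docs))
```

A topic whose top words tend to appear in the same reports scores close to 1, while a topic that mixes unrelated words scores close to -1. This is the intuition behind the abstract's phrase "stronger semantic relevance among words within topics".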

Related papers

Utilizing AI for Aviation Post-Accident Analysis Classification [0.0]
The volume of textual data available in aviation safety reports presents a challenge for timely and accurate analysis. This paper examines how Artificial Intelligence (AI) and, specifically, Natural Language Processing (NLP) can automate the process of extracting valuable insights from this data. The findings demonstrate that both NLP and deep learning, as well as TM, can significantly improve the efficiency and accuracy of aviation safety analysis.
arXiv Detail & Related papers (2025-05-30T19:15:04Z)
Advancing Neural Network Verification through Hierarchical Safety Abstract Interpretation [52.626086874715284]
We introduce a novel problem formulation called Abstract DNN-Verification, which verifies a hierarchical structure of unsafe outputs. By leveraging abstract interpretation and reasoning about output reachable sets, our approach enables assessing multiple safety levels during the formal verification process. Our contributions include a theoretical exploration of the relationship between our novel abstract safety formulation and existing approaches.
arXiv Detail & Related papers (2025-05-08T13:29:46Z)
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models [92.38300626647342]
Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. This paper presents a theoretical framework for understanding the interplay between safety and capability in two primary safety-aware LLM fine-tuning strategies.
arXiv Detail & Related papers (2025-03-24T20:41:57Z)
Exploring Aviation Incident Narratives Using Topic Modeling and Clustering Techniques [0.0]
This study applies advanced natural language processing (NLP) techniques to the National Transportation Safety Board (NTSB) dataset. The main objectives are identifying latent themes, exploring semantic relationships, assessing probabilistic connections, and clustering incidents based on shared characteristics. Comparative analysis reveals that LDA performed best with a coherence value of 0.597, followed by pLSA (0.583), LSA (0.542), and NMF (0.437).
arXiv Detail & Related papers (2025-01-14T08:23:15Z)
Natural Language Processing and Deep Learning Models to Classify Phase of Flight in Aviation Safety Occurrences [14.379311972506791]
Researchers applied natural language processing (NLP) and artificial intelligence (AI) models to process text narratives to classify the flight phases of safety occurrences. The classification performance of two deep learning models, ResNet and sRNN, was evaluated using an initial dataset of 27,000 safety occurrence reports from the NTSB.
arXiv Detail & Related papers (2025-01-11T15:02:49Z)
Analyzing Aviation Safety Narratives with LDA, NMF and PLSA: A Case Study Using Socrata Datasets [0.0]
This study explores the application of topic modelling techniques on the Socrata dataset spanning from 1908 to 2009. The analysis identified key themes such as pilot error, mechanical failure, weather conditions, and training deficiencies. Future directions include integrating additional contextual variables, leveraging neural topic models, and enhancing aviation safety protocols.
arXiv Detail & Related papers (2025-01-03T08:14:39Z)
Comparative Analysis of Topic Modeling Techniques on ATSB Text Narratives Using Natural Language Processing [0.0]
This paper explores the application of four prominent topic modelling techniques, namely Probabilistic Latent Semantic Analysis (pLSA), Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-negative Matrix Factorization (NMF). The study examines each technique's ability to unveil latent thematic structures within the data, providing safety professionals with a systematic approach to gain actionable insights.
arXiv Detail & Related papers (2025-01-02T12:21:07Z)
On-Road Object Importance Estimation: A New Dataset and A Model with Multi-Fold Top-Down Guidance [70.80612792049315]
This paper contributes a new large-scale dataset named Traffic Object Importance (TOI). It proposes a model that integrates multi-fold top-down guidance with the bottom-up feature. Our model outperforms state-of-the-art methods by large margins.
arXiv Detail & Related papers (2024-11-26T06:37:10Z)
An Explainable Machine Learning Approach to Traffic Accident Fatality Prediction [0.02730969268472861]
Road traffic accidents pose a significant public health threat worldwide. This study presents a machine learning-based approach for classifying fatal and non-fatal road accident outcomes.
arXiv Detail & Related papers (2024-09-18T12:41:56Z)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses [76.59021017301127]
We propose a large-scale traffic crash language dataset, named CrashEvent, summarizing 19,340 real-world crash reports. We formulate crash event feature learning as a novel text reasoning problem and fine-tune various large language models (LLMs) to predict detailed accident outcomes. Our experimental results show that our LLM-based approach not only predicts the severity of accidents but also classifies different types of accidents and predicts injury outcomes.
arXiv Detail & Related papers (2024-06-16T03:10:16Z)
DRUformer: Enhancing the driving scene Important object detection with driving relationship self-understanding [50.81809690183755]
Traffic accidents frequently lead to fatal injuries, contributing to over 50 million deaths as of 2023. Previous research primarily assessed the importance of individual participants, treating them as independent entities. We introduce the Driving scene Relationship self-Understanding transformer (DRUformer) to enhance the important object detection task.
arXiv Detail & Related papers (2023-11-11T07:26:47Z)
Aviation Safety Risk Analysis and Flight Technology Assessment Issues [0.0]
This paper focuses on two main areas: analyzing exceedance events and statistically evaluating non-exceedance data. The proposed solutions involve data preprocessing, reliability assessment, quantifying flight control using neural networks, exploratory data analysis, and establishing real-time automated warnings.
arXiv Detail & Related papers (2023-08-10T14:13:49Z)
A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors. We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)
Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models. We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
Crash Report Data Analysis for Creating Scenario-Wise, Spatio-Temporal Attention Guidance to Support Computer Vision-based Perception of Fatal Crash Risks [8.34084323253809]
This paper develops a data analytics model, named scenario-wise, Spatio-temporal attention guidance, from fatal crash report data. It estimates the relevance of detected objects to fatal crashes from their environment and context information. The paper shows how the developed attention guidance supports the design and implementation of a preliminary CV model.
arXiv Detail & Related papers (2021-09-06T19:43:37Z)
A model for traffic incident prediction using emergency braking data [77.34726150561087]
We address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents. We present a prototype implementing a traffic incident prediction model for Germany based on emergency braking data from Mercedes-Benz vehicles.
arXiv Detail & Related papers (2021-02-12T18:17:12Z)
Discovering Airline-Specific Business Intelligence from Online Passenger Reviews: An Unsupervised Text Analytics Approach [3.2872586139884623]
Airlines can capitalize on the abundantly available online customer reviews (OCR). This paper aims to discover company- and competitor-specific intelligence from OCR using an unsupervised text analytics approach. A case study involving 99,147 airline reviews of a US-based target carrier and four of its competitors is used to validate the proposed approach.
arXiv Detail & Related papers (2020-12-14T23:09:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.