Using Machine Learning to Fuse Verbal Autopsy Narratives and Binary
Features in the Analysis of Deaths from Hyperglycaemia
- URL: http://arxiv.org/abs/2204.12169v1
- Date: Tue, 26 Apr 2022 09:14:11 GMT
- Authors: Thokozile Manaka and Terence Van Zyl and Alisha N Wade and Deepak Kar
- Abstract summary: Lower- and middle-income countries face challenges arising from a lack of data on cause of death (COD).
A verbal autopsy can provide information about a COD in areas without robust death registration systems.
This study assesses the performance of various machine learning approaches when analyzing both the structured and unstructured components of the VA report.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lower- and middle-income countries face challenges arising from a
lack of data on cause of death (COD), which can limit decisions on population
health and disease management. A verbal autopsy (VA) can provide information
about a COD in areas without robust death registration systems. A VA consists
of structured data, combining numeric and binary features, and unstructured
data as part of an open-ended narrative text. This study assesses the
performance of various machine learning approaches when analyzing both the
structured and unstructured components of the VA report. The algorithms were
trained and tested via cross-validation in the three settings of binary
features, text features and a combination of binary and text features derived
from VA reports from rural South Africa. The results indicate that
narrative text features contain valuable information for determining COD and
that a combination of binary and text features improves the automated COD
classification task.
Keywords: Diabetes Mellitus, Verbal Autopsy, Cause of Death, Machine
Learning, Natural Language Processing
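The feature-fusion setup the abstract describes, combining structured binary indicators with features derived from the narrative text and evaluating via cross-validation, can be sketched as follows. This is a minimal illustration assuming scikit-learn; the column names, toy data, TF-IDF text representation, and logistic regression classifier are assumptions for the sketch, not the study's actual pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Toy stand-in for VA reports: one free-text narrative plus binary symptom flags.
data = pd.DataFrame({
    "narrative": [
        "patient complained of excessive thirst and frequent urination",
        "sudden chest pain and shortness of breath before death",
        "high blood sugar, blurred vision, weight loss over months",
        "fever and persistent cough for several weeks",
    ] * 5,
    "excessive_thirst": [1, 0, 1, 0] * 5,
    "chest_pain": [0, 1, 0, 0] * 5,
    "label": [1, 0, 1, 0] * 5,  # 1 = hyperglycaemia-related COD (illustrative)
})

# Fuse the unstructured narrative (TF-IDF) with the structured binary features.
features = ColumnTransformer([
    ("text", TfidfVectorizer(), "narrative"),
    ("binary", "passthrough", ["excessive_thirst", "chest_pain"]),
])

model = Pipeline([("features", features), ("clf", LogisticRegression())])

# Cross-validated accuracy on the combined feature set; dropping either
# branch of the ColumnTransformer reproduces the binary-only or text-only
# settings compared in the study.
X = data.drop(columns="label")
scores = cross_val_score(model, X, data["label"], cv=5)
print(scores.mean())
```

Evaluating the same pipeline with only the `"text"` or only the `"binary"` transformer gives the three-setting comparison (binary, text, combined) described in the abstract.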
Related papers
- NeuroXVocal: Detection and Explanation of Alzheimer's Disease through Non-invasive Analysis of Picture-prompted Speech [4.815952991777717]
NeuroXVocal is a novel dual-component system that classifies and explains potential Alzheimer's Disease (AD) cases through speech analysis.
The classification component (Neuro) processes three distinct data streams: acoustic features capturing speech patterns and voice characteristics, textual features extracted from speech transcriptions, and precomputed embeddings representing linguistic patterns.
The explainability component (XVocal) implements a Retrieval-Augmented Generation (RAG) approach, leveraging Large Language Models combined with a domain-specific knowledge base of AD research literature.
arXiv Detail & Related papers (2025-02-14T12:09:49Z)
- Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease [52.46922921214341]
Alzheimer's disease (AD) has become one of the most significant health challenges in an aging society.
We devised an explainable and effective feature set that leverages the visual capabilities of a large language model (LLM) and the Term Frequency-Inverse Document Frequency (TF-IDF) model.
Our new features can be well explained and interpreted step by step which enhance the interpretability of automatic AD screening.
arXiv Detail & Related papers (2024-11-28T05:23:22Z)
- Text2Data: Low-Resource Data Generation with Textual Control [100.5970757736845]
Text2Data is a novel approach that utilizes unlabeled data to understand the underlying data distribution.
It undergoes finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- Neural Sign Actors: A diffusion model for 3D sign language production from text [51.81647203840081]
Sign Languages (SL) serve as the primary mode of communication for the Deaf and Hard of Hearing communities.
This work makes an important step towards realistic neural sign avatars, bridging the communication gap between Deaf and hearing communities.
arXiv Detail & Related papers (2023-12-05T12:04:34Z)
- Exploring Multimodal Approaches for Alzheimer's Disease Detection Using Patient Speech Transcript and Audio Data [10.782153332144533]
Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health.
This study investigates various methods for detecting AD using patients' speech and transcripts data from the DementiaBank Pitt database.
arXiv Detail & Related papers (2023-07-05T12:40:11Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- Improving Cause-of-Death Classification from Verbal Autopsy Reports [0.0]
Natural language processing (NLP) techniques have fared poorly in the health sector.
A cause of death is often determined by a verbal autopsy (VA) report in places without reliable death registration systems.
We present a system that relies on two transfer learning paradigms of monolingual learning and multi-source domain adaptation.
arXiv Detail & Related papers (2022-10-31T09:14:08Z)
- Ontology-Driven and Weakly Supervised Rare Disease Identification from Clinical Notes [13.096008602034086]
Rare diseases are challenging to be identified due to few cases available for machine learning and the need for data annotation from domain experts.
We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT).
The weakly supervised approach is proposed to learn a confirmation phenotype model to improve Text-to-UMLS linking, without annotated data from domain experts.
arXiv Detail & Related papers (2022-05-11T17:38:24Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Multi-Modal Detection of Alzheimer's Disease from Speech and Text [3.702631194466718]
We propose a deep learning method that utilizes speech and the corresponding transcript simultaneously to detect Alzheimer's disease (AD).
The proposed method achieves 85.3% 10-fold cross-validation accuracy when trained and evaluated on the Dementiabank Pitt corpus.
arXiv Detail & Related papers (2020-11-30T21:18:17Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)