Improving Cause-of-Death Classification from Verbal Autopsy Reports
- URL: http://arxiv.org/abs/2210.17161v1
- Date: Mon, 31 Oct 2022 09:14:08 GMT
- Title: Improving Cause-of-Death Classification from Verbal Autopsy Reports
- Authors: Thokozile Manaka, Terence van Zyl, Deepak Kar
- Abstract summary: Natural language processing (NLP) techniques have fared poorly in the health sector.
A cause of death is often determined by a verbal autopsy (VA) report in places without reliable death registration systems.
We present a system that relies on two transfer learning paradigms of monolingual learning and multi-source domain adaptation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many lower- and middle-income countries, including South Africa, data access
in health facilities is restricted due to patient privacy and confidentiality
policies. Further, since clinical data is unique to individual institutions and
laboratories, there are insufficient data annotation standards and conventions.
As a result of the scarcity of textual data, natural language processing (NLP)
techniques have fared poorly in the health sector. A cause of death (COD) is
often determined by a verbal autopsy (VA) report in places without reliable
death registration systems. A non-clinician field worker compiles a VA report using
a set of standardized questions as a guide to uncover symptoms of a COD. This
analysis focuses on the textual part of the VA report as a case study to
address the challenge of adapting NLP techniques in the health domain. We
present a system that relies on two transfer learning paradigms of monolingual
learning and multi-source domain adaptation to improve VA narratives for the
target task of COD classification. We use the Bidirectional Encoder
Representations from Transformers (BERT) and Embeddings from Language Models
(ELMo) models pre-trained on the general English and health domains to extract
features from the VA narratives. Our findings suggest that this transfer
learning system improves COD classification and that the narrative
text contains valuable information for determining a COD. Our results further
show that combining binary VA features and narrative text features learned via
this framework further boosts COD classification performance.
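To make the fusion step concrete, below is a minimal sketch (not the authors' released code) of one way to extract a narrative embedding with a pre-trained BERT encoder and concatenate it with the binary VA indicators before a downstream COD classifier. The checkpoint name, mean-pooling strategy, and example features are illustrative assumptions; the paper also uses health-domain and ELMo encoders, which are omitted here.
```python
# Minimal sketch, assuming the Hugging Face `transformers` library and the
# generic "bert-base-uncased" checkpoint.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def narrative_features(text: str) -> np.ndarray:
    """Mean-pool BERT's last hidden states over the VA narrative tokens."""
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # shape (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()        # shape (768,)

# Hypothetical example: one narrative plus its binary symptom indicators
# (e.g. fever, chest pain, cough, ...), both invented for illustration.
narrative = "The deceased had fever and a severe cough for two weeks before death."
binary_features = np.array([1, 0, 1, 0, 0], dtype=np.float32)

combined = np.concatenate([narrative_features(narrative), binary_features])
# `combined` can then be fed to any downstream classifier (e.g. logistic
# regression or a small feed-forward network) to predict the COD label.
```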
Related papers
- Contrastive Learning with Counterfactual Explanations for Radiology Report Generation [83.30609465252441]
We propose a CounterFactual Explanations-based framework (CoFE) for radiology report generation.
Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" scenarios.
Experiments on two benchmarks demonstrate that leveraging the counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports.
arXiv Detail & Related papers (2024-07-19T17:24:25Z) - From Narratives to Numbers: Valid Inference Using Language Model Predictions from Verbal Autopsy Narratives [5.730469631341288]
We develop a method for valid inference using outcomes predicted from free-form text using state-of-the-art NLP techniques.
We leverage a suite of NLP techniques for COD prediction and, through empirical analysis of VA data, demonstrate the effectiveness of our approach in handling transportability issues.
arXiv Detail & Related papers (2024-04-03T03:53:37Z) - Adversarial Training For Low-Resource Disfluency Correction [50.51901599433536]
We propose an adversarially-trained sequence-tagging model for Disfluency Correction (DC).
We show the benefit of our proposed technique, which crucially depends on synthetically generated disfluent data, by evaluating it for DC in three Indian languages.
Our technique also performs well in removing stuttering disfluencies in ASR transcripts introduced by speech impairments.
arXiv Detail & Related papers (2023-06-10T08:58:53Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in
Radiology [40.52487429030841]
We consider enhancing medical visual-language pre-training with domain-specific knowledge by exploiting the paired image-text reports from daily radiological practice.
First, unlike existing works that directly process the raw reports, we adopt a novel triplet extraction module to extract medically relevant information.
Second, we propose a novel triplet encoding module with entity translation by querying a knowledge base, to exploit the rich domain knowledge in the medical field.
Third, we propose a Transformer-based fusion model for spatially aligning the entity descriptions with visual signals at the image patch level, enabling medical diagnosis.
arXiv Detail & Related papers (2023-01-05T18:55:09Z) - Summarizing Patients Problems from Hospital Progress Notes Using
Pre-trained Sequence-to-Sequence Models [9.879960506853145]
Problem list summarization requires a model to understand, abstract, and generate clinical documentation.
We propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization.
arXiv Detail & Related papers (2022-08-17T17:07:35Z) - Disentangled Learning of Stance and Aspect Topics for Vaccine Attitude
Detection in Social Media [40.61499595293957]
We propose a novel semi-supervised approach for vaccine attitude detection, called VADet.
VADet is able to learn disentangled stance and aspect topics, and outperforms existing aspect-based sentiment analysis models on both stance detection and tweet clustering.
arXiv Detail & Related papers (2022-05-06T15:24:33Z) - Using Machine Learning to Fuse Verbal Autopsy Narratives and Binary
Features in the Analysis of Deaths from Hyperglycaemia [0.0]
Lower- and middle-income countries are faced with challenges arising from a lack of data on cause of death (COD).
A verbal autopsy can provide information about a COD in areas without robust death registration systems.
This study assesses the performance of various machine learning approaches when analyzing both the structured and unstructured components of the VA report.
arXiv Detail & Related papers (2022-04-26T09:14:11Z) - Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of
Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to solve a low-resource and real-world challenge of code-mixed (Spanish-Catalan) clinical notes de-identification in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z) - Towards more patient friendly clinical notes through language models and
ontologies [57.51898902864543]
We present a novel approach to automated simplification of medical text based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and a version of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)