Fine-tuning pre-trained extractive QA models for clinical document parsing
- URL: http://arxiv.org/abs/2312.02314v1
- Date: Mon, 4 Dec 2023 19:52:56 GMT
- Title: Fine-tuning pre-trained extractive QA models for clinical document parsing
- Authors: Ashwyn Sharma, David I. Feldman, Aneesh Jain
- Abstract summary: A remote patient monitoring program for Heart Failure (HF) patients needs to have access to clinical markers like EF (Ejection Fraction) or LVEF (Left Ventricular Ejection Fraction).
This paper explains a system that can parse echocardiogram reports and verify EF values.
We found that the system saved over 1500 hours for our clinicians over 12 months by automating the task at scale.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Electronic health records (EHRs) contain a vast amount of high-dimensional
multi-modal data that can accurately represent a patient's medical history.
Unfortunately, most of this data is either unstructured or semi-structured,
rendering it unsuitable for real-time and retrospective analyses. A remote
patient monitoring (RPM) program for Heart Failure (HF) patients needs to have
access to clinical markers like EF (Ejection Fraction) or LVEF (Left
Ventricular Ejection Fraction) in order to ascertain eligibility and
appropriateness for the program. This paper explains a system that can parse
echocardiogram reports and verify EF values. This system helps identify
eligible HF patients who can be enrolled in such a program. At the heart of
this system is a pre-trained extractive QA transformer model that is fine-tuned
on custom-labeled data. The methods used to prepare such a model for deployment
are illustrated by running experiments on a public clinical dataset like
MIMIC-IV-Note. The pipeline can be used to generalize solutions to similar
problems in a low-resource setting. We found that the system saved over 1500
hours for our clinicians over 12 months by automating the task at scale.
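
The abstract's core idea, an extractive QA model that selects an answer span from a report, followed by a check on the extracted EF value, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `best_span` and `parse_ef` are hypothetical helpers, the logits are hand-made stand-ins for the outputs of a fine-tuned model, and taking the midpoint of a reported range and rejecting values outside 0 to 100 are assumptions made only for this example.

```python
import re
from typing import List, Optional, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_len: int = 8) -> Tuple[int, int]:
    """Pick the (start, end) token pair with the highest summed score.

    This is the usual decoding step for extractive QA heads: the span must
    satisfy start <= end and contain at most max_len tokens.
    """
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_logits):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Matches "55%", "45 %", "55-60%" or "55 to 60 %" (assumed report phrasings).
EF_PATTERN = re.compile(
    r"(\d{1,3}(?:\.\d+)?)\s*(?:(?:-|to)\s*(\d{1,3}(?:\.\d+)?))?\s*%"
)


def parse_ef(answer: str) -> Optional[float]:
    """Turn an extracted answer span into one EF percentage, or None.

    Ranges are reduced to their midpoint (an assumption of this sketch).
    Values outside 0-100 are rejected as implausible.
    """
    match = EF_PATTERN.search(answer)
    if match is None:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    value = (low + high) / 2
    return value if 0 <= value <= 100 else None


# Toy example: made-up tokens and logits for the question "What is the LVEF?"
tokens = ["LVEF", "is", "estimated", "at", "55-60%", "."]
start_logits = [0.1, 0.0, 0.2, 0.3, 4.0, 0.1]
end_logits = [0.0, 0.1, 0.1, 0.2, 3.5, 0.4]

start, end = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
ef_value = parse_ef(answer)
```

In a real pipeline the logits would come from the fine-tuned transformer, and the plausibility check would be one of possibly several verification steps before a value is used for enrollment decisions.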
Related papers
- Continuous max-flow augmentation of self-supervised few-shot learning on SPECT left ventricles [0.0]
This paper aims to give a recipe for diagnostic centers as well as for clinics to automatically segment the myocardium based on small and low-quality labels on reconstructed SPECT.
A combination of Continuous Max-Flow (CMF) with prior shape information is developed to augment the 3D U-Net self-supervised learning (SSL) approach on various geometries of SPECT apparatus.
arXiv Detail & Related papers (2024-05-09T03:19:19Z)
- BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering [8.547600133510551]
This paper develops a Benchmark Evaluation SysTem for Medical Visual Question Answering, denoted by BESTMVQA.
Our system provides a useful tool for users to automatically build Med-VQA datasets, which helps overcome the problem of insufficient data.
With simple configurations, our system automatically trains and evaluates the selected models over a benchmark dataset.
arXiv Detail & Related papers (2023-12-13T03:08:48Z)
- Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data.
We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes.
In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z)
- Investigating Alternative Feature Extraction Pipelines For Clinical Note Phenotyping [0.0]
Using computational systems for the extraction of medical attributes offers many applications.
BERT-based models can be used to transform clinical notes into a series of representations.
We propose an alternative pipeline utilizing ScispaCy (Neumann et al., 2019) for extraction of common diseases.
arXiv Detail & Related papers (2023-10-05T02:51:51Z)
- Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study [60.56194508762205]
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Understanding Heart-Failure Patients EHR Clinical Features via SHAP Interpretation of Tree-Based Machine Learning Model Predictions [8.444557621643568]
Heart failure (HF) is a major cause of mortality.
We examined whether machine learning models, more specifically the XGBoost model, can accurately predict patient stage based on EHR.
Our results indicate that based on structured data from EHR, our models could predict patients' ejection fraction (EF) scores with moderate accuracy.
arXiv Detail & Related papers (2021-03-20T22:17:05Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of breakdowns in essential services and the collapse of health care structures.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that, using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
- DeepEnroll: Patient-Trial Matching with Deep Embedding and Entailment Prediction [67.91606509226132]
Clinical trials are essential for drug development but often suffer from expensive, inaccurate and insufficient patient recruitment.
DeepEnroll is a cross-modal inference learning model that jointly encodes enrollment criteria (text) and patient records (tabular data) into a shared latent space for matching inference.
arXiv Detail & Related papers (2020-01-22T17:51:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.