Multimodal Pretraining of Medical Time Series and Notes
- URL: http://arxiv.org/abs/2312.06855v1
- Date: Mon, 11 Dec 2023 21:53:40 GMT
- Title: Multimodal Pretraining of Medical Time Series and Notes
- Authors: Ryan King, Tianbao Yang, Bobak Mortazavi
- Abstract summary: Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data.
We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes.
In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
- Score: 45.89025874396911
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Within the intensive care unit (ICU), a wealth of patient data, including
clinical measurements and clinical notes, is readily available. This data is a
valuable resource for understanding patient health and informing medical
decisions, but it also poses many challenges for analysis. Deep learning
models show promise in extracting meaningful patterns, but they require
extensive labeled data, a challenge in critical care. To address this, we
propose a novel approach employing self-supervised pretraining, focusing on the
alignment of clinical measurements and notes. Our approach combines contrastive
and masked token prediction tasks during pretraining. Semi-supervised
experiments on the MIMIC-III dataset demonstrate the effectiveness of our
self-supervised pretraining. In downstream tasks, including in-hospital
mortality prediction and phenotyping, our pretrained model outperforms
baselines in settings where only a fraction of the data is labeled, emphasizing
its ability to enhance ICU data analysis. Notably, our method excels in
situations where very few labels are available, as evidenced by an increase in
the AUC-ROC for in-hospital mortality by 0.17 and in AUC-PR for phenotyping by
0.1 when only 1% of labels are accessible. This work advances self-supervised
learning in the healthcare domain, optimizing clinical insights from abundant
yet challenging ICU data.
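The pretraining objective above pairs a contrastive task with masked token prediction. The paper's exact formulation is not given here, but the contrastive half of such objectives is commonly implemented as a symmetric InfoNCE loss over matched (time-series, note) pairs from the same ICU stay. A minimal NumPy sketch of that alignment step, where the temperature, batch size, and embedding dimensions are illustrative assumptions and the masked-prediction head is omitted:

```python
import numpy as np

def info_nce_loss(ts_emb, note_emb, temperature=0.1):
    """Symmetric InfoNCE loss aligning time-series and note embeddings.

    Matching rows (same ICU stay) are the positive pairs; all other rows
    in the batch act as negatives.
    """
    # Cosine similarity matrix between the two modalities.
    ts = ts_emb / np.linalg.norm(ts_emb, axis=1, keepdims=True)
    nt = note_emb / np.linalg.norm(note_emb, axis=1, keepdims=True)
    logits = ts @ nt.T / temperature

    # Cross-entropy toward the diagonal, in both directions.
    n = logits.shape[0]
    log_sm_rows = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_sm_cols = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    idx = np.arange(n)
    loss_ts2note = -log_sm_rows[idx, idx].mean()
    loss_note2ts = -log_sm_cols[idx, idx].mean()
    return (loss_ts2note + loss_note2ts) / 2

rng = np.random.default_rng(0)
ts_batch = rng.normal(size=(8, 16))
# Nearly-matched note embeddings (small perturbation of the same vectors).
loss_matched = info_nce_loss(ts_batch, ts_batch + 0.01 * rng.normal(size=(8, 16)))
# Unrelated note embeddings.
loss_random = info_nce_loss(ts_batch, rng.normal(size=(8, 16)))
```

Minimizing this loss pulls each stay's time-series embedding toward its own note embedding and away from other stays' notes, which is what enables the label-efficient fine-tuning reported above.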
Related papers
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [68.8204255655161]
We introduce a deep Q-learning approach able to obtain more reliable critical care policies.
We achieve this by first pruning the action set based on all available rewards, and second training a final model based on the sparse main reward but with a restricted action set.
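The two-step recipe in that summary (prune the action set using auxiliary rewards, then learn values from the sparse main reward over the restricted actions) can be illustrated with a toy bandit-style stand-in; the reward values, pruning threshold, and single-state setting below are illustrative assumptions, not the paper's deep Q-network:

```python
import numpy as np

# Toy setting: one state, four actions; an auxiliary reward signal
# flags action 3 as unsafe despite its high main reward.
aux_rewards = np.array([0.2, 0.5, 0.4, -1.0])
allowed = aux_rewards > 0.0  # step 1: prune actions with poor auxiliary reward

# Step 2: value learning on the sparse main reward, restricted to allowed actions.
main_reward = np.array([1.0, 0.0, 2.0, 5.0])  # action 3 pays most but is pruned
q = np.zeros(4)
rng = np.random.default_rng(0)
for _ in range(500):
    a = rng.choice(np.flatnonzero(allowed))  # explore only allowed actions
    q[a] += 0.1 * (main_reward[a] - q[a])    # incremental Q update (no next state)

best = int(np.argmax(np.where(allowed, q, -np.inf)))
```

The pruning step keeps the final policy from exploiting an action that looks good under the main reward alone, which is the reliability argument the summary makes.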
arXiv Detail & Related papers (2023-06-13T18:02:57Z)
- FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z)
- Unsupervised pre-training of graph transformers on patient population graphs [48.02011627390706]
We propose a graph-transformer-based network to handle heterogeneous clinical data.
We show the benefit of our pre-training method in a self-supervised and a transfer learning setting.
arXiv Detail & Related papers (2022-07-21T16:59:09Z)
- Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z)
- Improving Early Sepsis Prediction with Multi Modal Learning [5.129463113166068]
Clinical text provides essential information to estimate the severity of sepsis.
We employ state-of-the-art NLP models such as BERT and a highly specialized NLP model in Amazon Comprehend Medical to represent the text.
Our methods significantly outperform a clinical criterion suggested by experts, qSOFA, as well as the winning model of the PhysioNet Computing in Cardiology Challenge for sepsis prediction.
arXiv Detail & Related papers (2021-07-23T09:25:31Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that with 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
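The self-training loop behind results like that one (fit on the few labels, pseudo-label confident unlabeled points, refit on the enlarged set) can be sketched briefly; the nearest-centroid classifier, margin threshold, and synthetic data below are illustrative stand-ins, not the paper's chest X-ray models or regularizers:

```python
import numpy as np

def nearest_centroid_fit(X, y):
    # One centroid per class, in sorted class order.
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict_with_margin(centroids, X):
    # Distance to each centroid; margin = gap between best and second best.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    order = np.sort(d, axis=1)
    return d.argmin(axis=1), order[:, 1] - order[:, 0]

rng = np.random.default_rng(1)
# Two well-separated classes; only 4 of 100 points carry labels.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y_true = np.array([0] * 50 + [1] * 50)
labeled = np.array([0, 1, 50, 51])
unlabeled = np.setdiff1d(np.arange(100), labeled)

# Round 1: fit on the few labels, then pseudo-label confident points.
centroids = nearest_centroid_fit(X[labeled], y_true[labeled])
pseudo, margin = predict_with_margin(centroids, X[unlabeled])
confident = margin > 2.0  # keep only high-margin pseudo-labels

# Round 2: refit on true labels plus confident pseudo-labels.
X2 = np.vstack([X[labeled], X[unlabeled][confident]])
y2 = np.concatenate([y_true[labeled], pseudo[confident]])
centroids2 = nearest_centroid_fit(X2, y2)
pred, _ = predict_with_margin(centroids2, X)
accuracy = (pred == y_true).mean()
```

The confidence filter matters: retraining on all pseudo-labels, including low-margin ones, is how self-training amplifies its own early mistakes.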
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
- Integrating Physiological Time Series and Clinical Notes with Deep Learning for Improved ICU Mortality Prediction [21.919977518774015]
We study how physiological time series data and clinical notes can be integrated into a unified mortality prediction model.
Our results show that a late fusion approach can provide a statistically significant improvement in mortality prediction over using individual modalities in isolation.
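Late fusion in this sense means each modality gets its own model and only their outputs are combined at decision time. A weighted average of predicted probabilities is the simplest such rule; the probabilities and the equal weighting below are hypothetical, and the paper may instead learn the combination:

```python
import numpy as np

def late_fusion(p_ts, p_notes, w=0.5):
    """Combine per-modality mortality probabilities at the decision level.

    Each unimodal model is trained separately; fusion touches only
    their output probabilities, not their features.
    """
    return w * p_ts + (1 - w) * p_notes

# Hypothetical predicted mortality probabilities from two unimodal models.
p_time_series = np.array([0.80, 0.30, 0.55])
p_clinical_notes = np.array([0.60, 0.10, 0.65])
fused = late_fusion(p_time_series, p_clinical_notes)
```

Early fusion, by contrast, would concatenate features or embeddings before a single joint model; the summary's claim is that combining at the output level already beats either modality alone.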
arXiv Detail & Related papers (2020-03-24T18:25:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.