Two heads are better than one: Enhancing medical representations by
pre-training over structured and unstructured electronic health records
- URL: http://arxiv.org/abs/2201.10113v1
- Date: Tue, 25 Jan 2022 06:14:49 GMT
- Title: Two heads are better than one: Enhancing medical representations by
pre-training over structured and unstructured electronic health records
- Authors: Sicen Liu, Xiaolong Wang, Yongshuai Hou, Ge Li, Hui Wang, Hui Xu, Yang
Xiang, Buzhou Tang
- Abstract summary: We propose a unified deep learning-based medical pre-trained language model, named UMM-PLM, to automatically learn representative features from multimodal EHRs.
We first developed parallel unimodal information representation modules to capture modality-specific characteristics, where unimodal representations were learned from each data source separately.
A cross-modal module was further introduced to model the interactions between different modalities.
- Score: 23.379185792773875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The massive context of electronic health records (EHRs) has created enormous
potential for improving healthcare, among which structured (coded) data and
unstructured (text) data are two important textual modalities. They do not
exist in isolation and can complement each other in most real-life clinical
scenarios. Most existing research in medical informatics, however, either
focuses on a single modality or straightforwardly concatenates the
information from different modalities, ignoring the interaction and
information sharing between them. To address these issues, we propose a
unified deep learning-based medical pre-trained language model, named UMM-PLM,
to automatically learn representative features from multimodal EHRs that
consist of both structured and unstructured data. Specifically, we first
developed parallel unimodal information representation modules to capture
modality-specific characteristics, where unimodal representations were learned
from each data source separately. A cross-modal module was further introduced
to model the interactions between different modalities. We pre-trained the
model on a large EHR dataset containing both structured and unstructured
data and verified its effectiveness on three downstream clinical
tasks, i.e., medication recommendation, 30-day readmission prediction, and ICD
coding, through extensive experiments. The results demonstrate the strength of
UMM-PLM compared with benchmark methods and state-of-the-art baselines.
Analyses show that UMM-PLM can effectively integrate multimodal textual
information and has the potential to provide more comprehensive
interpretations for clinical decision making.
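The abstract describes two parallel unimodal encoders (one over coded, structured data and one over clinical text) whose outputs are combined by a cross-modal interaction module. The following is a minimal sketch of that idea only; the module names, dimensions, and the use of cross-attention for the fusion step are illustrative assumptions, not the authors' released implementation.

# Minimal sketch of the architecture described in the abstract: two parallel
# unimodal encoders (structured codes vs. clinical text) plus a cross-modal
# interaction module. All names, sizes, and the choice of cross-attention are
# assumptions for illustration, not UMM-PLM's actual implementation.
import torch
import torch.nn as nn

class UnimodalEncoder(nn.Module):
    """Transformer encoder over one modality's token/code embeddings."""
    def __init__(self, vocab_size, d_model=256, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, ids):                    # ids: (batch, seq_len)
        return self.encoder(self.embed(ids))   # (batch, seq_len, d_model)

class CrossModalFusion(nn.Module):
    """Lets each modality attend to the other, then pools a joint vector."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.text_to_code = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.code_to_text = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, text_h, code_h):
        t, _ = self.text_to_code(text_h, code_h, code_h)   # text queries codes
        c, _ = self.code_to_text(code_h, text_h, text_h)   # codes query text
        # Mean-pool each fused sequence and concatenate into one patient vector.
        return torch.cat([t.mean(dim=1), c.mean(dim=1)], dim=-1)

class MultimodalEHRModel(nn.Module):
    def __init__(self, text_vocab, code_vocab, d_model=256, n_labels=10):
        super().__init__()
        self.text_enc = UnimodalEncoder(text_vocab, d_model)
        self.code_enc = UnimodalEncoder(code_vocab, d_model)
        self.fusion = CrossModalFusion(d_model)
        # Task head, e.g. multi-label ICD coding or binary readmission prediction.
        self.head = nn.Linear(2 * d_model, n_labels)

    def forward(self, text_ids, code_ids):
        return self.head(self.fusion(self.text_enc(text_ids), self.code_enc(code_ids)))

In this sketch the same backbone would be pre-trained with self-supervised objectives over each modality plus a cross-modal objective, and the task head swapped per downstream task; the specific objectives used by UMM-PLM are described in the paper itself, not reproduced here.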
Related papers
- Representation Learning of Structured Data for Medical Foundation Models [29.10129199884847]
We introduce the UniStruct architecture to design a multimodal medical foundation model of unstructured text and structured data.
Our approach is validated through model pre-training on both an extensive internal medical database and a public repository of structured medical records.
arXiv Detail & Related papers (2024-10-17T09:02:28Z) - MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models [11.798375238713488]
MEDFuse is a framework that integrates structured and unstructured medical data.
It achieves over 90% F1 score in the 10-disease multi-label classification task.
arXiv Detail & Related papers (2024-07-17T04:17:09Z) - EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling [22.94521527609479]
EMERGE is a Retrieval-Augmented Generation driven framework aimed at enhancing multimodal EHR predictive modeling.
Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models.
The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses.
arXiv Detail & Related papers (2024-05-27T10:53:15Z) - Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z) - Contrastive Learning on Multimodal Analysis of Electronic Health Records [15.392566551086782]
We propose a novel feature embedding generative model and design a multimodal contrastive loss to obtain the multimodal EHR feature representation.
Our theoretical analysis demonstrates the effectiveness of multimodal learning compared to single-modality learning.
This connection paves the way for a privacy-preserving algorithm tailored for multimodal EHR feature representation learning.
arXiv Detail & Related papers (2024-03-22T03:01:42Z) - Multimodal Fusion of EHR in Structures and Semantics: Integrating Clinical Records and Notes with Hypergraph and LLM [39.25272553560425]
We propose a new framework called MINGLE, which integrates both structures and semantics in EHR effectively.
Our framework uses a two-level infusion strategy to combine medical concept semantics and clinical note semantics into hypergraph neural networks.
Experimental results on two EHR datasets, the public MIMIC-III and the private CRADLE, show that MINGLE can effectively improve predictive performance by 11.83% relative.
arXiv Detail & Related papers (2024-02-19T23:48:40Z) - XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z) - Towards Generalist Foundation Model for Radiology by Leveraging
Web-scale 2D&3D Medical Data [66.9359934608229]
This study aims to initiate the development of a Radiology Foundation Model, termed RadFM.
To the best of our knowledge, this is the first large-scale, high-quality, medical visual-language dataset, with both 2D and 3D scans.
We propose a new evaluation benchmark, RadBench, that comprises five tasks, including modality recognition, disease diagnosis, visual question answering, report generation and rationale diagnosis.
arXiv Detail & Related papers (2023-08-04T17:00:38Z) - Competence-based Multimodal Curriculum Learning for Medical Report
Generation [98.10763792453925]
We propose a Competence-based Multimodal Curriculum Learning framework (CMCL) to alleviate data bias and make the best use of available data.
Specifically, CMCL simulates the learning process of radiologists and optimizes the model in a step-by-step manner.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that CMCL can be incorporated into existing models to improve their performance.
arXiv Detail & Related papers (2022-06-24T08:16:01Z) - MIMO: Mutual Integration of Patient Journey and Medical Ontology for
Healthcare Representation Learning [49.57261599776167]
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO) for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z) - Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed in specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)