Automated Fusion of Multimodal Electronic Health Records for Better
Medical Predictions
- URL: http://arxiv.org/abs/2401.11252v1
- Date: Sat, 20 Jan 2024 15:14:14 GMT
- Title: Automated Fusion of Multimodal Electronic Health Records for Better
Medical Predictions
- Authors: Suhan Cui, Jiaqi Wang, Yuan Zhong, Han Liu, Ting Wang, Fenglong Ma
- Abstract summary: We propose a novel neural architecture search (NAS) framework named AutoFM, which can automatically search for the optimal model architectures for encoding diverse input modalities and fusion strategies.
We conduct thorough experiments on real-world multi-modal EHR data and prediction tasks, and the results demonstrate that our framework achieves significant performance improvement over existing state-of-the-art methods.
- Score: 48.0590120095748
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The widespread adoption of Electronic Health Record (EHR) systems in
healthcare institutes has generated vast amounts of medical data, offering
significant opportunities for improving healthcare services through deep
learning techniques. However, the complex and diverse modalities and feature
structures in real-world EHR data pose great challenges for deep learning model
design. To address the multi-modality challenge in EHR data, current approaches
primarily rely on hand-crafted model architectures based on intuition and
empirical experiences, leading to sub-optimal model architectures and limited
performance. Therefore, to automate the process of model design for mining EHR
data, we propose a novel neural architecture search (NAS) framework named
AutoFM, which can automatically search for the optimal model architectures for
encoding diverse input modalities and fusion strategies. We conduct thorough
experiments on real-world multi-modal EHR data and prediction tasks, and the
results demonstrate that our framework not only achieves significant
performance improvement over existing state-of-the-art methods but also
discovers meaningful network architectures effectively.
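The abstract does not detail AutoFM's search space, but the core idea of searching over fusion strategies can be illustrated with a minimal DARTS-style sketch: each candidate fusion operator gets a learned architecture weight, and the softmax over those weights indicates which operator the search favors. All class names below are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcatFusion(nn.Module):
    """Fuse two modality embeddings by concatenation + projection."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, a, b):
        return self.proj(torch.cat([a, b], dim=-1))

class GatedFusion(nn.Module):
    """Fuse via a learned element-wise gate between the two embeddings."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, a, b):
        g = torch.sigmoid(self.gate(torch.cat([a, b], dim=-1)))
        return g * a + (1 - g) * b

class MixedFusion(nn.Module):
    """DARTS-style softmax mixture over candidate fusion operators."""
    def __init__(self, dim):
        super().__init__()
        self.ops = nn.ModuleList([ConcatFusion(dim), GatedFusion(dim)])
        # One architecture weight per candidate, trained on validation loss.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, a, b):
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(a, b) for wi, op in zip(w, self.ops))
```

After the search converges, the usual NAS discretization step would keep only the highest-weight operator and retrain the discovered architecture from scratch.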
Related papers
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
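EHRPD's actual architecture and noise schedule are not given in this summary; the toy sketch below (all names hypothetical) only illustrates the two ingredients mentioned: denoising a next-visit representation conditioned on the current visit, plus a side head that estimates the time interval to that visit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NextVisitDenoiser(nn.Module):
    """Toy denoiser: predict the noise added to the next-visit code vector,
    conditioned on the current visit; a side head regresses the time gap."""
    def __init__(self, n_codes, hidden=128):
        super().__init__()
        self.denoise = nn.Sequential(
            nn.Linear(2 * n_codes + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, n_codes),
        )
        self.interval_head = nn.Sequential(
            nn.Linear(n_codes, hidden), nn.ReLU(), nn.Linear(hidden, 1),
        )

    def forward(self, noisy_next, current, t):
        eps_hat = self.denoise(torch.cat([noisy_next, current, t], dim=-1))
        gap_hat = self.interval_head(current)
        return eps_hat, gap_hat

def train_step(model, opt, current, next_visit, gap):
    # Forward diffusion (simplified): blend the clean target with Gaussian
    # noise at a random level t in [0, 1].
    t = torch.rand(current.size(0), 1)
    eps = torch.randn_like(next_visit)
    noisy = (1 - t).sqrt() * next_visit + t.sqrt() * eps
    eps_hat, gap_hat = model(noisy, current, t)
    loss = F.mse_loss(eps_hat, eps) + F.mse_loss(gap_hat, gap)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```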
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- Rethinking Model Prototyping through the MedMNIST+ Dataset Collection [0.11999555634662634]
This work presents a benchmark for the MedMNIST+ database to diversify the evaluation landscape.
We conduct a thorough analysis of common convolutional neural networks (CNNs) and Transformer-based architectures for medical image classification.
Our findings suggest that computationally efficient training schemes and modern foundation models hold promise in bridging the gap between expensive end-to-end training and more resource-efficient approaches.
arXiv Detail & Related papers (2024-04-24T10:19:25Z)
- MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
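A rough sketch of that recipe, assuming each frozen SSL encoder returns a (batch, time, dim) token sequence; the encoder choice, adapter design, and all names here are placeholder assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class FusionAdapter(nn.Module):
    """Trainable bridge over two frozen SSL-pretrained unimodal encoders;
    only the cross-attention and the classifier head receive gradients."""
    def __init__(self, audio_encoder, video_encoder, dim, n_classes):
        super().__init__()
        self.audio_encoder = audio_encoder
        self.video_encoder = video_encoder
        for enc in (self.audio_encoder, self.video_encoder):
            enc.eval()
            for p in enc.parameters():
                p.requires_grad = False
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, audio, video):
        with torch.no_grad():
            a = self.audio_encoder(audio)  # assumed (B, Ta, dim)
            v = self.video_encoder(video)  # assumed (B, Tv, dim)
        fused, _ = self.cross_attn(query=v, key=a, value=a)  # video attends to audio
        return self.head(fused.mean(dim=1))  # pool over time, classify emotion
```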
arXiv Detail & Related papers (2024-04-13T13:39:26Z)
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
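The published CheXprompt prompt and error rubric are not reproduced in this summary; the sketch below only shows the general shape of a GPT-4-as-judge factuality metric, with a simplified placeholder instruction:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Simplified placeholder instruction -- NOT the published CheXprompt rubric.
RUBRIC = (
    "You are grading a candidate chest X-ray report against a reference "
    "report. Count clinically significant factual errors (omitted findings, "
    "incorrect findings, wrong anatomical locations). "
    "Reply with the integer count only."
)

def factuality_errors(reference: str, candidate: str, model: str = "gpt-4") -> int:
    """Ask the judge model for an error count; assumes it follows the
    integer-only instruction (a real metric would parse more defensively)."""
    resp = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user",
             "content": f"Reference report:\n{reference}\n\nCandidate report:\n{candidate}"},
        ],
    )
    return int(resp.choices[0].message.content.strip())
```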
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Automated Multi-Task Learning for Joint Disease Prediction on Electronic Health Records [4.159498069487535]
We propose an automated approach named AutoDP, which can search for the optimal configuration of task grouping and architectures simultaneously.
It achieves significant performance improvements over both hand-crafted and automated state-of-the-art methods while maintaining a feasible search cost.
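AutoDP's actual search is far more efficient than brute force, but the objective it optimizes can be illustrated by exhaustively scoring every task grouping on a small task set; `train_and_eval` is a hypothetical callback that trains one shared model per group and returns a validation score:

```python
from itertools import combinations

def partitions(tasks):
    """Yield every way to split a small task list into disjoint groups
    (Bell-number growth, so only viable for a handful of tasks)."""
    if not tasks:
        yield []
        return
    first, rest = tasks[0], tasks[1:]
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            remaining = [t for t in rest if t not in others]
            for sub in partitions(remaining):
                yield [(first,) + others] + sub

def search_task_grouping(tasks, train_and_eval):
    """Exhaustive baseline: train one shared model per group and keep the
    grouping with the best mean validation score."""
    best_score, best_groups = float("-inf"), None
    for groups in partitions(list(tasks)):
        score = sum(train_and_eval(g) for g in groups) / len(groups)
        if score > best_score:
            best_score, best_groups = score, groups
    return best_groups, best_score
```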
arXiv Detail & Related papers (2024-03-06T22:32:48Z)
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA).
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
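One way to read "hybrid early fusion" is a shared latent state that cross-attends to every modality's tokens before any late, per-modality decision is made; the sketch below is that reading under stated assumptions, not HEALNet's published architecture:

```python
import torch
import torch.nn as nn

class HybridEarlyFusion(nn.Module):
    """A shared latent array cross-attends to each modality's tokens in turn,
    so every modality shapes one joint representation early in the network."""
    def __init__(self, dim, n_latents=16, n_modalities=2):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(n_latents, dim) * 0.02)
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            for _ in range(n_modalities)
        )

    def forward(self, modalities):
        # modalities: list of (B, T_m, dim) token tensors, one per modality,
        # e.g. WSI patch embeddings and multi-omic feature embeddings.
        B = modalities[0].size(0)
        z = self.latents.unsqueeze(0).expand(B, -1, -1)
        for attn, tokens in zip(self.attn, modalities):
            upd, _ = attn(query=z, key=tokens, value=tokens)
            z = z + upd  # residual update of the shared state
        return z.mean(dim=1)  # pooled joint embedding for a survival head
```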
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
- Extending Process Discovery with Model Complexity Optimization and Cyclic States Identification: Application to Healthcare Processes [62.997667081978825]
The paper presents an approach to process mining that provides semi-automatic support for model optimization.
A model simplification approach is proposed that abstracts the raw model at the desired granularity.
We aim to demonstrate the capabilities of the technological solution using three datasets from different applications in the healthcare domain.
arXiv Detail & Related papers (2022-06-10T16:20:59Z)
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
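DIME builds on an EMAP-style disentanglement of a multimodal model into unimodal contributions plus a multimodal-interaction residual, then explains each part locally; the sketch below covers only the disentanglement step, estimated by averaging over samples of the other modality (an illustrative estimator, not the paper's exact one):

```python
import numpy as np

def disentangle(f, x1_samples, x2_samples):
    """Split paired predictions f(x1, x2) into two unimodal effects and an
    interaction residual, by averaging over samples of the other modality.
    O(n^2) calls to f; fine for a small local neighborhood."""
    n = len(x1_samples)
    full = np.array([f(x1_samples[i], x2_samples[i]) for i in range(n)])
    uni1 = np.array([np.mean([f(x1_samples[i], x2) for x2 in x2_samples])
                     for i in range(n)])
    uni2 = np.array([np.mean([f(x1, x2_samples[i]) for x1 in x1_samples])
                     for i in range(n)])
    # Interaction = what neither modality explains on its own.
    interaction = full - uni1 - uni2 + full.mean()
    return uni1, uni2, interaction
```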
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
- Knowledge-Guided Dynamic Systems Modeling: A Case Study on Modeling River Water Quality [8.110949636804774]
Modeling real-world phenomena is a focus of many science and engineering efforts, such as ecological modeling and financial forecasting.
Building an accurate model for complex and dynamic systems improves understanding of underlying processes and leads to resource efficiency.
Knowledge-based modeling encodes domain expertise directly; at the opposite extreme, data-driven modeling learns a model directly from data, requiring extensive data and risking overfitting.
We focus on an intermediate approach, model revision, in which prior knowledge and data are combined to achieve the best of both worlds.
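As a concrete, hypothetical instance of model revision: keep a known first-order decay law from domain knowledge, but let the data refine its rate and fit an additive correction term (the paper's water-quality case study uses richer process models):

```python
import numpy as np
from scipy.optimize import least_squares

def simulate(q0, k, c, times):
    # Revised model dq/dt = -k*q + c has the closed form
    # q(t) = c/k + (q0 - c/k) * exp(-k*t).
    return c / k + (q0 - c / k) * np.exp(-k * times)

def revise(times, observed, q0, k_prior):
    """Fit the prior decay rate k and a data-driven correction c so the
    revised model matches observations; start from the literature value."""
    def residual(params):
        k, c = params
        return simulate(q0, k, c, times) - observed
    fit = least_squares(residual, x0=[k_prior, 0.0])
    return fit.x  # revised (k, c)
```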
arXiv Detail & Related papers (2021-03-01T06:31:38Z)
- MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records [18.42914458055976]
We extend state-of-the-art neural architecture search (NAS) methods and propose MUltimodal Fusion Architecture SeArch (MUFASA).
We demonstrate empirically that our MUFASA method outperforms established unimodal NAS on public EHR data with comparable costs.
Compared with these baselines on CCS diagnosis code prediction, our discovered models improve top-5 recall from 0.88 to 0.91 and demonstrate the ability to generalize to other EHR tasks.
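Top-5 recall here can be read as the fraction of true diagnosis codes that land among the five highest-scoring predictions per visit; below is one common multi-label formulation (the paper's exact definition may differ):

```python
import numpy as np

def top_k_recall(scores, labels, k=5):
    """scores: (N, C) predicted scores; labels: (N, C) binary ground truth.
    Returns the fraction of true codes ranked in each visit's top k."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = sum(labels[i, topk[i]].sum() for i in range(len(scores)))
    return hits / labels.sum()
```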
arXiv Detail & Related papers (2021-02-03T23:48:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.