Missing-modality Enabled Multi-modal Fusion Architecture for Medical
Data
- URL: http://arxiv.org/abs/2309.15529v1
- Date: Wed, 27 Sep 2023 09:46:07 GMT
- Title: Missing-modality Enabled Multi-modal Fusion Architecture for Medical
Data
- Authors: Muyu Wang, Shiyu Fan, Yichen Li, Hui Chen
- Abstract summary: Fusing multi-modal data can improve the performance of deep learning models.
Missing modalities are common for medical data due to patients' specificity.
This study developed an efficient multi-modal fusion architecture for medical data that was robust to missing modalities.
- Score: 8.472576865966744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fusing multi-modal data can improve the performance of deep learning models.
However, missing modalities are common for medical data due to patients'
specificity, which is detrimental to the performance of multi-modal models in
applications. Therefore, it is critical to adapt the models to missing
modalities. This study aimed to develop an efficient multi-modal fusion
architecture for medical data that was robust to missing modalities and further
improved the performance on disease diagnosis. X-ray chest radiographs for the
image modality, radiology reports for the text modality, and structured value
data for the tabular data modality were fused in this study. Each modality pair
was fused with a Transformer-based bi-modal fusion module, and the three
bi-modal fusion modules were then combined into a tri-modal fusion framework.
Additionally, multivariate loss functions were introduced into the training
process to improve the model's robustness to missing modalities in the inference
process. Finally, we designed comparison and ablation experiments for
validating the effectiveness of the fusion, the robustness to missing
modalities and the enhancements from each key component. Experiments were
conducted on MIMIC-IV and MIMIC-CXR with the 14-label disease diagnosis task.
The area under the receiver operating characteristic curve (AUROC) and the area
under the precision-recall curve (AUPRC) were used to evaluate the models' performance.
The experimental results demonstrated that our proposed multi-modal fusion
architecture effectively fused three modalities and showed strong robustness to
missing modalities. This method could potentially be scaled to more modalities to
enhance the clinical practicality of the model.
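The two evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' evaluation code. It assumes that per-label scores are macro-averaged across labels (the abstract does not say how the 14 labels are aggregated), that AUPRC is estimated as average precision, and that scores contain no ties when average precision is computed. The function names and the toy `labels` and `scores` are hypothetical.

```python
from typing import Callable, List, Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative; tied pairs count as half a win."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUPRC estimated as average precision: the mean of the precision values
    measured at the rank of each positive. Assumes no tied scores."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


def macro_average(metric: Callable[[Sequence[int], Sequence[float]], float],
                  labels: List[List[int]],
                  scores: List[List[float]]) -> float:
    """Apply `metric` to each label column and average the results.
    Macro-averaging is an assumption, not something the abstract states."""
    n_labels = len(labels[0])
    per_label = [
        metric([row[j] for row in labels], [row[j] for row in scores])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy data: 4 samples, 2 labels (the study itself uses 14 labels).
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]

print("macro AUROC:", macro_average(auroc, labels, scores))
print("macro AUPRC:", macro_average(average_precision, labels, scores))
```

In practice a library implementation that handles tied scores would normally be preferred; the pairwise AUROC above is quadratic in the number of samples and is only meant to make the definitions concrete.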
Related papers
- Towards Precision Healthcare: Robust Fusion of Time Series and Image Data [8.579651833717763]
We introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information.
We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results.
Our experiments show that our method is effective in improving multimodal deep learning for clinical applications.
arXiv Detail & Related papers (2024-05-24T11:18:13Z)
- DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era [3.2549142515720044]
This paper introduces a new process model for multimodal Data Fusion for Data Mining.
Our model aims to decrease computational costs, complexity, and bias while improving efficiency and reliability.
We demonstrate its efficacy through three use cases: predicting diabetic retinopathy using retinal images and patient metadata, domestic violence prediction employing satellite imagery, internet, and census data, and identifying clinical and demographic features from radiography images and clinical notes.
arXiv Detail & Related papers (2024-04-18T15:52:42Z)
- DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency [18.291267748113142]
We propose DrFuse to achieve effective clinical multi-modal fusion.
We address the missing modality issue by disentangling the features shared across modalities and those unique within each modality.
We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR.
arXiv Detail & Related papers (2024-03-10T12:41:34Z)
- Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models [51.5543321122664]
This paper investigates how to better leverage large-scale pre-trained uni-modal models to enhance discriminative multi-modal learning.
We introduce Multi-Modal Low-Rank Adaptation learning (MMLoRA).
arXiv Detail & Related papers (2023-10-08T15:01:54Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Feature robustness and sex differences in medical imaging: a case study in MRI-based Alzheimer's disease detection [1.7616042687330637]
We compare two classification schemes on the ADNI MRI dataset.
We do not find a strong dependence of model performance for male and female test subjects on the sex composition of the training dataset.
arXiv Detail & Related papers (2022-04-04T17:37:54Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis [10.133715767542386]
We propose a knowledge-driven and data-driven framework for lung disease diagnosis.
We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data.
A multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease.
arXiv Detail & Related papers (2022-02-09T04:12:30Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into the modality-specific appearance code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z)