Embracing Aleatoric Uncertainty in Medical Multimodal Learning with Missing Modalities
- URL: http://arxiv.org/abs/2601.21950v1
- Date: Thu, 29 Jan 2026 16:31:48 GMT
- Title: Embracing Aleatoric Uncertainty in Medical Multimodal Learning with Missing Modalities
- Authors: Linxiao Gong, Yang Liu, Lianlong Sun, Yulai Bi, Jing Liu, Xiaoguang Zhu
- Abstract summary: We propose Aleatoric Uncertainty Modeling (AUM), which explicitly quantifies unimodal aleatoric uncertainty to address missing modalities. Our framework achieves a 2.26% AUC-ROC improvement on MIMIC-IV mortality prediction and a 2.17% gain on eICU.
- Score: 9.064442992807264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical multimodal learning faces significant challenges from the missing modalities prevalent in clinical practice. Existing approaches assume equal modality contributions and random missing patterns, neglecting the inherent uncertainty in medical data acquisition. We therefore propose Aleatoric Uncertainty Modeling (AUM), which explicitly quantifies unimodal aleatoric uncertainty to address missing modalities. Specifically, AUM models each unimodal representation as a multivariate Gaussian distribution to capture aleatoric uncertainty and enable principled quantification of modality reliability. To adaptively aggregate the captured information, we develop a dynamic message-passing mechanism within a bipartite patient-modality graph, using an uncertainty-aware aggregation scheme. Through this process, missing modalities are naturally accommodated, while more reliable information from available modalities is dynamically emphasized to guide representation generation. Our AUM framework achieves a 2.26% AUC-ROC improvement on MIMIC-IV mortality prediction and a 2.17% gain on eICU, outperforming existing state-of-the-art approaches.
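The abstract's two core mechanisms are concrete enough to sketch. Below is a minimal, hypothetical PyTorch illustration, not the authors' implementation, and it omits the bipartite-graph message passing: each unimodal encoder predicts a diagonal Gaussian whose variance plays the role of aleatoric uncertainty, and available modalities are fused with precision weights so that missing modalities simply drop out and noisy ones are down-weighted.

```python
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps one modality to a diagonal Gaussian N(mu, sigma^2) in latent space."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu_head = nn.Linear(128, latent_dim)
        self.logvar_head = nn.Linear(128, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.mu_head(h), self.logvar_head(h)

def uncertainty_weighted_fusion(mus, logvars, present_mask):
    """Fuse modality means with precision weights exp(-logvar).

    mus, logvars: (batch, n_modalities, latent_dim)
    present_mask: (batch, n_modalities) bool, False where a modality is missing.
    """
    precision = torch.exp(-logvars)                     # 1 / sigma^2
    precision = precision * present_mask.unsqueeze(-1)  # missing -> zero weight
    weights = precision / precision.sum(1, keepdim=True).clamp_min(1e-8)
    return (weights * mus).sum(1)                       # (batch, latent_dim)

# Toy usage: two modalities; the second is missing for the first patient.
enc_a, enc_b = GaussianEncoder(16, 32), GaussianEncoder(24, 32)
mu_a, lv_a = enc_a(torch.randn(4, 16))
mu_b, lv_b = enc_b(torch.randn(4, 24))
mask = torch.tensor([[True, False]] + [[True, True]] * 3)
fused = uncertainty_weighted_fusion(torch.stack([mu_a, mu_b], 1),
                                    torch.stack([lv_a, lv_b], 1), mask)
print(fused.shape)  # torch.Size([4, 32])
```

Precision weighting is the same rule that gives the mean of a product of Gaussian experts, which is why it is a natural fit for reliability-aware fusion.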
Related papers
- A Contrastive Variational AutoEncoder for NSCLC Survival Prediction with Missing Modalities [41.8469011437549]
Predicting survival outcomes for non-small cell lung cancer (NSCLC) patients is challenging due to heterogeneous individual prognostic features.
State-of-the-art models rely on available data to create patient-level representations or use generative models to infer missing modalities.
We propose a Multimodal Contrastive Variational AutoEncoder (MCVAE) to address this issue.
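As a rough illustration of the contrastive-VAE ingredients named above (a hypothetical sketch, not the MCVAE architecture): a standard VAE reparameterization plus an InfoNCE loss that aligns latents of the same patient across modalities, so that an available modality's latent can stand in for a missing one.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    """Standard VAE sampling trick: z = mu + sigma * eps."""
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

def info_nce(z_a, z_b, temperature=0.1):
    """Contrastive loss: row i of z_a should match row i of z_b (same patient,
    different modality) and mismatch every other row in the batch."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature
    return F.cross_entropy(logits, torch.arange(z_a.size(0)))

# Toy usage: CT and pathology latents for the same 8 patients.
z_ct = reparameterize(torch.randn(8, 32), torch.zeros(8, 32))
z_path = torch.randn(8, 32)
loss = info_nce(z_ct, z_path)
```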
arXiv Detail & Related papers (2026-02-19T14:29:34Z)
- impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy.
It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches.
Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
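A minimal sketch of the masked pre-training idea described here (illustrative only, not the impuTMAE architecture; in practice the mask embedding would be learnable): hide a random subset of multimodal tokens, encode the rest with a transformer, and penalize reconstruction error only at the masked positions, the same machinery later used to fill in genuinely missing modalities.

```python
import torch
import torch.nn as nn

def masked_reconstruction_loss(tokens, model, mask_ratio=0.5):
    """tokens: (batch, n_tokens, dim) concatenated multimodal patch embeddings."""
    b, n, d = tokens.shape
    mask = torch.rand(b, n) < mask_ratio             # True = hide this token
    mask_embedding = torch.zeros(d)                  # learnable in practice
    corrupted = torch.where(mask.unsqueeze(-1), mask_embedding, tokens)
    reconstructed = model(corrupted)                 # (batch, n_tokens, dim)
    # Penalize only the masked positions, as in masked-autoencoder training.
    return ((reconstructed - tokens)[mask] ** 2).mean()

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2)
loss = masked_reconstruction_loss(torch.randn(2, 32, 64), model)
```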
arXiv Detail & Related papers (2025-08-08T10:01:16Z)
- Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis [16.95583564875497]
We propose an Incomplete Modality Disentangled Representation (IMDR) strategy to disentangle features into explicit, independent modal-common and modal-specific features.
Experiments on four multimodal datasets demonstrate that the proposed IMDR significantly outperforms state-of-the-art methods.
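One common way to realize this kind of disentanglement (a hypothetical sketch, not the IMDR method itself) is to pair an alignment loss on the modal-common features with an orthogonality penalty between common and specific features:

```python
import torch
import torch.nn.functional as F

def disentangle_losses(common_a, common_b, specific_a, specific_b):
    """All inputs (batch, dim). Modal-common features of two modalities should
    agree for the same patient; modal-specific features should be orthogonal
    to the common ones so they carry no shared information."""
    align = F.mse_loss(common_a, common_b)
    def ortho(c, s):  # squared cosine similarity as an orthogonality penalty
        return (F.normalize(c, dim=-1) * F.normalize(s, dim=-1)).sum(-1).pow(2).mean()
    return align + ortho(common_a, specific_a) + ortho(common_b, specific_b)

loss = disentangle_losses(*(torch.randn(8, 32) for _ in range(4)))
```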
arXiv Detail & Related papers (2025-02-17T12:10:35Z)
- Reducing Aleatoric and Epistemic Uncertainty through Multi-modal Data Acquisition [5.468547489755107]
This paper introduces an innovative data acquisition framework where uncertainty disentanglement leads to actionable decisions.
The main hypothesis is that aleatoric uncertainty decreases as the number of modalities increases.
We provide proof-of-concept implementations on two multi-modal datasets to showcase our data acquisition framework.
arXiv Detail & Related papers (2025-01-30T11:05:59Z)
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates.
Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information.
Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals.
Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
- Multi-modal MRI Translation via Evidential Regression and Distribution Calibration [29.56726531611307]
We propose a novel framework that reformulates multi-modal MRI translation as a multi-modal evidential regression problem with distribution calibration.
Our approach incorporates two key components: 1) an evidential regression module that estimates uncertainties from different source modalities and an explicit distribution mixture strategy for transparent multi-modal fusion, and 2) a distribution calibration mechanism that adapts to source-target mapping shifts.
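The evidential-regression ingredient is standard machinery (deep evidential regression in the style of Amini et al., 2020), so a compact sketch is possible; this is illustrative, not the paper's exact module. The head outputs Normal-Inverse-Gamma parameters, and both uncertainty types follow in closed form:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    """Predicts Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta)."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.out = nn.Linear(in_dim, 4)

    def forward(self, h):
        gamma, log_nu, log_alpha, log_beta = self.out(h).chunk(4, dim=-1)
        nu = F.softplus(log_nu)              # nu > 0
        alpha = F.softplus(log_alpha) + 1.0  # alpha > 1 so the moments exist
        beta = F.softplus(log_beta)          # beta > 0
        return gamma, nu, alpha, beta

gamma, nu, alpha, beta = EvidentialHead(32)(torch.randn(8, 32))
aleatoric = beta / (alpha - 1)               # E[sigma^2]: noise in the data
epistemic = beta / (nu * (alpha - 1))        # Var[mu]: model uncertainty
```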
arXiv Detail & Related papers (2024-07-10T05:17:01Z)
- Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
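Evidential fusion pipelines of this kind typically follow the Dirichlet/subjective-logic recipe; the sketch below is written under that assumption and is not the paper's code. Each modality outputs non-negative evidence, belief and uncertainty masses are read off the implied Dirichlet, and modalities are merged with a reduced Dempster's rule.

```python
import torch

def belief_and_uncertainty(evidence):
    """evidence: (batch, K) non-negative outputs of one modality's network."""
    alpha = evidence + 1.0                         # Dirichlet parameters
    strength = alpha.sum(-1, keepdim=True)         # Dirichlet strength S
    belief = evidence / strength                   # per-class belief mass
    uncertainty = evidence.size(-1) / strength     # leftover mass u = K / S
    return belief, uncertainty

def dempster_combine(b1, u1, b2, u2):
    """Reduced Dempster's rule: agreeing masses reinforce, conflicting masses
    are renormalized away, and joint uncertainty shrinks multiplicatively."""
    conflict = (b1.sum(-1, keepdim=True) * b2.sum(-1, keepdim=True)
                - (b1 * b2).sum(-1, keepdim=True))
    scale = 1.0 / (1.0 - conflict).clamp_min(1e-8)
    return scale * (b1 * b2 + b1 * u2 + b2 * u1), scale * u1 * u2

# Toy usage: fundus and OCT branches each emit evidence over 3 classes.
b, u = dempster_combine(*belief_and_uncertainty(torch.rand(4, 3)),
                        *belief_and_uncertainty(torch.rand(4, 3)))
```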
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention.
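The dynamic-gating idea, stripped of the hierarchical transformer, can be sketched in a few lines (a hypothetical module, not HCT-DMG itself): a learned gate scores each modality per sample so that incongruent modalities can be muted.

```python
import torch
import torch.nn as nn

class ModalityGate(nn.Module):
    """Scores each modality per sample; low gates mute incongruent modalities."""
    def __init__(self, dim: int, n_modalities: int):
        super().__init__()
        self.score = nn.Linear(dim * n_modalities, n_modalities)

    def forward(self, feats):  # feats: (batch, n_modalities, dim)
        b, m, d = feats.shape
        gates = torch.softmax(self.score(feats.reshape(b, m * d)), dim=-1)
        fused = (gates.unsqueeze(-1) * feats).sum(1)   # (batch, dim)
        return fused, gates

fused, gates = ModalityGate(32, 3)(torch.randn(4, 3, 32))
```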
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
- Reliable Multimodality Eye Disease Screening via Mixture of Student's t Distributions [49.4545260500952]
We introduce a novel multimodality evidential fusion pipeline for eye disease screening, EyeMoSt.
Our model estimates both local uncertainty for unimodality and global uncertainty for the fusion modality to produce reliable classification results.
Our experimental findings on both public and in-house datasets show that our model is more reliable than current methods.
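The appeal of a Student's t head over a Gaussian one is easy to show in code (an illustrative sketch, not EyeMoSt itself): heavy tails keep a few corrupted samples from dominating the loss, and the scale and degrees-of-freedom parameters provide per-sample uncertainty.

```python
import torch
from torch.distributions import StudentT

def student_t_nll(target, loc, log_scale, log_df):
    """Negative log-likelihood under a Student's t; heavy tails keep outliers
    from blowing up the loss, unlike a Gaussian NLL."""
    dist = StudentT(df=log_df.exp() + 2.0,  # df > 2 keeps the variance finite
                    loc=loc, scale=log_scale.exp())
    return -dist.log_prob(target).mean()

loss = student_t_nll(torch.randn(8), torch.zeros(8),
                     torch.zeros(8), torch.zeros(8))
```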
arXiv Detail & Related papers (2023-03-17T06:18:16Z)
- Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities [76.08541852988536]
We propose to use invariant features for a missing modality imagination network (IF-MMIN).
We show that the proposed model outperforms all baselines and consistently improves overall emotion recognition performance under uncertain missing-modality conditions.
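A bare-bones version of the "imagination" idea (hypothetical names and shapes, not the IF-MMIN network): train a small network on complete-data samples to regress the full multimodal representation from one modality's invariant features, then use it at test time when the other modalities are absent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical "imagination" network: maps one modality's invariant features
# to the full multimodal representation learned on complete-data samples.
imagine = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))

def imagination_loss(invariant_feat, full_multimodal_feat):
    """Both (batch, 64); the target comes from the complete-modality model."""
    return F.mse_loss(imagine(invariant_feat), full_multimodal_feat)

loss = imagination_loss(torch.randn(8, 64), torch.randn(8, 64))
```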
arXiv Detail & Related papers (2022-10-27T12:16:25Z)
- Improving Trustworthiness of AI Disease Severity Rating in Medical Imaging with Ordinal Conformal Prediction Sets [0.7734726150561088]
A lack of statistically rigorous uncertainty quantification is a significant factor undermining trust in AI results.
Recent developments in distribution-free uncertainty quantification present practical solutions for these issues.
We demonstrate a technique for forming ordinal prediction sets that are guaranteed, at a user-specified probability, to contain the correct stenosis severity.
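Split-conformal ordinal prediction sets can be sketched compactly (an illustrative sketch under standard exchangeability assumptions, not the paper's exact procedure): calibrate a probability-mass threshold on held-out data, then grow a contiguous run of severity grades around the model's argmax until that mass is reached.

```python
import numpy as np

def grow_until(probs, stop):
    """Grow a contiguous set of grades outward from the argmax, always adding
    the more probable neighbour, until stop(lo, hi, mass) is satisfied."""
    lo = hi = int(np.argmax(probs))
    mass = float(probs[lo])
    while not stop(lo, hi, mass) and (lo > 0 or hi < len(probs) - 1):
        left_p = probs[lo - 1] if lo > 0 else -1.0
        right_p = probs[hi + 1] if hi < len(probs) - 1 else -1.0
        if left_p >= right_p:
            lo -= 1
            mass += left_p
        else:
            hi += 1
            mass += right_p
    return lo, hi, mass

def calibrate(cal_probs, cal_labels, alpha=0.1):
    """Score = probability mass accumulated when the true grade enters the set;
    the finite-sample-corrected (1 - alpha) quantile is the threshold."""
    scores = [grow_until(p, lambda lo, hi, m: lo <= y <= hi)[2]
              for p, y in zip(cal_probs, cal_labels)]
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")

def predict_set(probs, tau):
    lo, hi, _ = grow_until(probs, lambda lo, hi, m: m >= tau)
    return list(range(lo, hi + 1))

rng = np.random.default_rng(0)
tau = calibrate(rng.dirichlet(np.ones(5), 200), rng.integers(0, 5, 200))
print(predict_set(rng.dirichlet(np.ones(5)), tau))
```

The coverage guarantee is marginal and distribution-free: over exchangeable calibration and test data, the returned set contains the true grade with probability at least 1 - alpha.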
arXiv Detail & Related papers (2022-07-05T18:01:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.