Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation
- URL: http://arxiv.org/abs/2504.13465v1
- Date: Fri, 18 Apr 2025 05:07:20 GMT
- Title: Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation
- Authors: Duy A. Nguyen, Quan Huu Do, Khoa D. Doan, Minh N. Do
- Abstract summary: We present SURE, a novel framework that extends the capabilities of pretrained multimodal models by introducing latent space reconstruction and uncertainty estimation. We show that SURE consistently achieves state-of-the-art performance, ensuring robust predictions even in the presence of incomplete data.
- Score: 12.459901557580052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal learning has demonstrated incredible successes by integrating diverse data sources, yet it often relies on the availability of all modalities - an assumption that rarely holds in real-world applications. Pretrained multimodal models, while effective, struggle when confronted with small-scale and incomplete datasets (i.e., missing modalities), limiting their practical applicability. Previous studies on reconstructing missing modalities have overlooked the reconstruction's potential unreliability, which could compromise the quality of the final outputs. We present SURE (Scalable Uncertainty and Reconstruction Estimation), a novel framework that extends the capabilities of pretrained multimodal models by introducing latent space reconstruction and uncertainty estimation for both reconstructed modalities and downstream tasks. Our method is architecture-agnostic, reconstructs missing modalities, and delivers reliable uncertainty estimates, improving both interpretability and performance. SURE introduces a unique Pearson Correlation-based loss and applies statistical error propagation in deep networks for the first time, allowing precise quantification of uncertainties from missing data and model predictions. Extensive experiments across tasks such as sentiment analysis, genre classification, and action recognition show that SURE consistently achieves state-of-the-art performance, ensuring robust predictions even in the presence of incomplete data.
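The abstract names two ingredients, a Pearson correlation-based loss and statistical error propagation, without giving their formulas. The sketch below illustrates only the generic textbook versions of both ideas and should not be read as the authors' formulation. The function names (`pearson_correlation`, `pearson_loss`, `propagate_linear_variance`) and the choice of `1 - r` as the loss are assumptions made for illustration.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / denom


def pearson_loss(predicted_uncertainty, observed_error):
    """Hypothetical loss: small when predicted uncertainty tracks observed error.

    Assumption: the loss is 1 - r, so it ranges from 0 (perfect positive
    correlation) to 2 (perfect negative correlation).
    """
    return 1.0 - pearson_correlation(predicted_uncertainty, observed_error)


def propagate_linear_variance(weights, input_variances):
    """First-order error propagation through y = sum_i w_i * x_i.

    Assumes independent inputs, so var(y) = sum_i w_i^2 * var(x_i).
    """
    if len(weights) != len(input_variances):
        raise ValueError("weights and variances must have equal length")
    return sum(w * w * v for w, v in zip(weights, input_variances))


if __name__ == "__main__":
    # Uncertainty that grows with the error gives a loss close to 0.
    print(pearson_loss([0.1, 0.2, 0.3], [1.0, 2.0, 3.0]))
    # Variance of 2*x1 + 1*x2 with var(x1)=0.5 and var(x2)=0.25.
    print(propagate_linear_variance([2.0, 1.0], [0.5, 0.25]))
```

A correlation-style objective of this kind rewards uncertainty estimates that rank samples in the same order as their actual errors, independent of scale. The linear propagation rule shows how input variance, for example from a reconstructed modality, could carry through one layer to the output.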
Related papers
- Predictive Multiplicity in Survival Models: A Method for Quantifying Model Uncertainty in Predictive Maintenance Applications [0.0]
We frame predictive multiplicity as a critical concern in survival-based models. We introduce formal measures -- ambiguity, discrepancy, and obscurity -- to quantify it. This is particularly relevant for downstream tasks such as maintenance scheduling.
arXiv Detail & Related papers (2025-04-16T15:04:00Z) - Global Convergence of Continual Learning on Non-IID Data [51.99584235667152]
We provide a general and comprehensive theoretical analysis for continual learning of regression models. We establish the almost sure convergence results of continual learning under a general data condition for the first time.
arXiv Detail & Related papers (2025-03-24T10:06:07Z) - Reducing Aleatoric and Epistemic Uncertainty through Multi-modal Data Acquisition [5.468547489755107]
This paper introduces an innovative data acquisition framework where uncertainty disentanglement leads to actionable decisions. The main hypothesis is that aleatoric uncertainty decreases as the number of modalities increases. We provide proof-of-concept implementations on two multi-modal datasets to showcase our data acquisition framework.
arXiv Detail & Related papers (2025-01-30T11:05:59Z) - Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning [18.076966572539547]
This paper introduces a novel algorithm based on Hölder Divergence (HD) to enhance the reliability of multi-view learning. Through the Dempster-Shafer theory, uncertainty from different modalities is integrated, thereby generating a comprehensive result. Mathematically, HD proves to better measure the "distance" between the real data distribution and the predictive distribution of the model.
arXiv Detail & Related papers (2024-10-29T04:29:44Z) - Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models [6.610033827647869]
In real-world scenarios, consistently acquiring complete multimodal data presents significant challenges.
This often leads to the issue of missing modalities, where data for certain modalities are absent.
We propose a novel framework integrating parameter-efficient fine-tuning of unimodal pretrained models with a self-supervised joint-embedding learning method.
arXiv Detail & Related papers (2024-07-17T14:44:25Z) - Estimating Epistemic and Aleatoric Uncertainty with a Single Model [5.871583927216653]
We introduce a new approach to ensembling, hyper-diffusion models (HyperDM).
HyperDM offers prediction accuracy on par with, and in some cases superior to, multi-model ensembles.
We validate our method on two distinct real-world tasks: x-ray computed tomography reconstruction and weather temperature forecasting.
arXiv Detail & Related papers (2024-02-05T19:39:52Z) - Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z) - Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z) - Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z) - Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z) - Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.