Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling
- URL: http://arxiv.org/abs/2602.16979v1
- Date: Thu, 19 Feb 2026 00:37:28 GMT
- Title: Characterizing the Predictive Impact of Modalities with Supervised Latent-Variable Modeling
- Authors: Divyam Madaan, Sumit Chopra, Kyunghyun Cho
- Abstract summary: PRIMO is a supervised latent-variable imputation model that quantifies the predictive impact of missing modalities. PRIMO enables the use of all available training examples, whether modalities are complete or partial. We evaluate PRIMO on a synthetic XOR dataset, Audio-Vision MNIST, and MIMIC-III for mortality and ICD-9 prediction.
- Score: 43.81891375838308
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Despite the recent success of Multimodal Large Language Models (MLLMs), existing approaches predominantly assume the availability of multiple modalities during training and inference. In practice, multimodal data is often incomplete because modalities may be missing, collected asynchronously, or available only for a subset of examples. In this work, we propose PRIMO, a supervised latent-variable imputation model that quantifies the predictive impact of any missing modality within the multimodal learning setting. PRIMO enables the use of all available training examples, whether modalities are complete or partial. Specifically, it models the missing modality through a latent variable that captures its relationship with the observed modality in the context of prediction. During inference, we draw many samples from the learned distribution over the missing modality to both obtain the marginal predictive distribution (for the purpose of prediction) and analyze the impact of the missing modalities on the prediction for each instance. We evaluate PRIMO on a synthetic XOR dataset, Audio-Vision MNIST, and MIMIC-III for mortality and ICD-9 prediction. Across all datasets, PRIMO obtains performance comparable to unimodal baselines when a modality is fully missing and to multimodal baselines when all modalities are available. PRIMO quantifies the predictive impact of a modality at the instance level using a variance-based metric computed from predictions across latent completions. We visually demonstrate how varying completions of the missing modality result in a set of plausible labels.
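The inference procedure described in the abstract (draw many samples of the missing modality from a learned distribution, average the resulting predictions to obtain the marginal predictive distribution, and use the variance across those predictions as an instance-level impact score) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `sample_latent` and `predict_proba` are hypothetical stand-ins for PRIMO's learned latent-variable model and classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(x_obs, n_samples):
    # Hypothetical stand-in for the learned distribution over the
    # missing modality, conditioned on the observed modality x_obs.
    return x_obs.mean() + rng.normal(size=(n_samples, 4))

def predict_proba(x_obs, z):
    # Toy classifier over 3 labels given the observed modality and
    # one sampled completion z of the missing modality (softmax).
    logits = np.array([z.sum(), x_obs.sum(), z.sum() - x_obs.sum()])
    e = np.exp(logits - logits.max())
    return e / e.sum()

def marginal_prediction(x_obs, n_samples=100):
    """Monte Carlo marginal p(y | x_obs) plus a variance-based impact score."""
    zs = sample_latent(x_obs, n_samples)
    probs = np.stack([predict_proba(x_obs, z) for z in zs])
    marginal = probs.mean(axis=0)     # average over latent completions
    impact = probs.var(axis=0).sum()  # disagreement across completions
    return marginal, impact

marginal, impact = marginal_prediction(np.array([0.2, -0.1]))
```

A large `impact` value indicates that plausible completions of the missing modality lead to very different predictions, i.e. the missing modality matters for this instance; a value near zero indicates the observed modality alone largely determines the label.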
Related papers
- Multi-modal Bayesian Neural Network Surrogates with Conjugate Last-Layer Estimation [0.30586855806896046]
We develop two multi-modal neural network surrogate models and leverage conditionally conjugate distributions in the last layer to estimate model parameters. We demonstrate improved prediction accuracy and uncertainty compared to uni-modal surrogate models for both scalar and time series data.
arXiv Detail & Related papers (2025-09-26T00:13:57Z)
- Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation [12.459901557580052]
We present SURE, a novel framework that extends the capabilities of pretrained multimodal models by introducing latent space reconstruction and uncertainty estimation. We show that SURE consistently achieves state-of-the-art performance, ensuring robust predictions even in the presence of incomplete data.
arXiv Detail & Related papers (2025-04-18T05:07:20Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models [6.610033827647869]
In real-world scenarios, consistently acquiring complete multimodal data presents significant challenges.
This often leads to the issue of missing modalities, where data for certain modalities are absent.
We propose a novel framework integrating parameter-efficient fine-tuning of unimodal pretrained models with a self-supervised joint-embedding learning method.
arXiv Detail & Related papers (2024-07-17T14:44:25Z)
- It's All in the Mix: Wasserstein Classification and Regression with Mixed Features [2.2685251390114565]
We develop and analyze distributionally robust prediction models that faithfully account for the presence of discrete features. We demonstrate that our models can significantly outperform existing methods that are agnostic to the presence of discrete features.
arXiv Detail & Related papers (2023-12-19T15:15:52Z)
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
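The idea of fusing per-modality predictions by their estimated uncertainty can be illustrated with a small sketch. Each modality is summarized by Normal-Inverse-Gamma parameters (mu, nu, alpha, beta) as in evidential regression; the precision-weighted combination below is a simplified illustration, not the paper's exact MoNIG mixture operator.

```python
import numpy as np

def nig_moments(mu, nu, alpha, beta):
    # Standard NIG predictive moments (evidential regression convention):
    mean = mu
    aleatoric = beta / (alpha - 1.0)          # expected data noise
    epistemic = beta / (nu * (alpha - 1.0))   # uncertainty about the mean
    return mean, aleatoric, epistemic

def fuse(params_per_modality):
    # Weight each modality's prediction by the inverse of its total
    # uncertainty, so noisy or unconfident modalities contribute less.
    # (Illustrative fusion rule, hypothetical relative to the paper.)
    means, weights = [], []
    for mu, nu, alpha, beta in params_per_modality:
        m, aleatoric, epistemic = nig_moments(mu, nu, alpha, beta)
        means.append(m)
        weights.append(1.0 / (aleatoric + epistemic))
    weights = np.array(weights) / np.sum(weights)
    return float(np.dot(weights, means)), weights

# Two modalities: the first is more confident, so it dominates the fusion.
fused_mean, w = fuse([(1.0, 2.0, 3.0, 1.0), (1.4, 1.0, 2.0, 2.0)])
```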
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.