Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
- URL: http://arxiv.org/abs/2505.11029v1
- Date: Fri, 16 May 2025 09:24:29 GMT
- Title: Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
- Authors: Li Ju, Max Andersson, Stina Fredriksson, Edward Glöckner, Andreas Hellander, Ekta Vats, Prashant Singh
- Abstract summary: We propose AsymVLM to build probabilistic embeddings from pre-trained vision-language models on the unit hypersphere, enabling uncertainty quantification. We validate the effectiveness of the probabilistic embeddings on established benchmarks, and present comprehensive ablation studies demonstrating the inherent nature of asymmetry in the uncertainty structure of textual and visual data.
- Score: 0.301138495170623
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-language models (VLMs) as foundation models have significantly enhanced performance across a wide range of visual and textual tasks, without requiring large-scale training from scratch for downstream tasks. However, these deterministic VLMs fail to capture the inherent ambiguity and uncertainty in natural language and visual data. Recent probabilistic post-hoc adaptation methods address this by mapping deterministic embeddings onto probability distributions; however, existing approaches do not account for the asymmetric uncertainty structure of the modalities, and the constraint that meaningful deterministic embeddings reside on a unit hypersphere, potentially leading to suboptimal performance. In this paper, we address the asymmetric uncertainty structure inherent in textual and visual data, and propose AsymVLM to build probabilistic embeddings from pre-trained VLMs on the unit hypersphere, enabling uncertainty quantification. We validate the effectiveness of the probabilistic embeddings on established benchmarks, and present comprehensive ablation studies demonstrating the inherent nature of asymmetry in the uncertainty structure of textual and visual data.
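The abstract's construction can be illustrated with a minimal sketch. Assuming a von Mises-Fisher (vMF) parameterization, which is a natural distribution on the unit hypersphere, a post-hoc head would map each deterministic embedding to a mean direction mu and a concentration kappa, with large kappa indicating low uncertainty. The paper's exact construction may differ; `kappa_head` and the tangent-space sampler below are illustrative stand-ins, not AsymVLM itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_vmf(z, kappa_head):
    """Project a deterministic CLIP-style embedding z onto the unit
    hypersphere (mean direction mu) and predict a concentration kappa.
    High kappa -> tightly concentrated vMF -> low uncertainty."""
    mu = z / np.linalg.norm(z)
    kappa = np.exp(kappa_head(z))  # exp guarantees positivity
    return mu, kappa

def sample_vmf_tangent_approx(mu, kappa, n, rng):
    """Crude Gaussian-in-tangent-space approximation to vMF sampling,
    adequate for large kappa: perturb mu with noise of scale
    1/sqrt(kappa) and renormalize back onto the sphere."""
    eps = rng.normal(scale=1.0 / np.sqrt(kappa), size=(n, mu.size))
    x = mu + eps
    return x / np.linalg.norm(x, axis=1, keepdims=True)

z_text = np.array([2.0, 1.0, 0.5, 0.1])
mu, kappa = to_vmf(z_text, kappa_head=lambda z: np.log(50.0))  # fixed kappa for the demo
samples = sample_vmf_tangent_approx(mu, kappa, 1000, rng)
spread = 1.0 - float((samples @ mu).mean())  # shrinks as kappa grows
```

Under this parameterization, the "asymmetric uncertainty structure" would amount to the text and image branches learning systematically different kappa distributions.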
Related papers
- Exploring the Potential for Large Language Models to Demonstrate Rational Probabilistic Beliefs [12.489784979345654]
We show that current versions of large language models (LLMs) lack the ability to provide rational and coherent representations of probabilistic beliefs.
We apply well-established techniques for uncertainty quantification to measure the ability of LLMs to adhere to fundamental properties of probabilistic reasoning.
arXiv Detail & Related papers (2025-04-18T11:50:30Z)
- Post-hoc Probabilistic Vision-Language Models [51.12284891724463]
Vision-language models (VLMs) have found remarkable success in classification, retrieval, and generative tasks.
We propose post-hoc uncertainty estimation in VLMs that does not require additional training.
Our results show promise for safety-critical applications of large-scale models.
arXiv Detail & Related papers (2024-12-08T18:16:13Z)
- Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function [10.193504550494486]
This paper introduces a benchmark for predictive uncertainty quantification in Bird's Eye View (BEV) segmentation.
Our study focuses on the effectiveness of quantified uncertainty in detecting misclassified and out-of-distribution pixels.
We propose a novel loss function, Uncertainty-Focal-Cross-Entropy (UFCE), specifically designed for highly imbalanced data.
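The exact UFCE formulation is defined in that paper; as a reference point, the standard focal cross-entropy it builds on can be sketched as follows (the `gamma` value and toy inputs are illustrative):

```python
import numpy as np

def focal_cross_entropy(probs, targets, gamma=2.0, eps=1e-12):
    """Standard focal cross-entropy (Lin et al.): down-weights easy,
    well-classified examples by (1 - p_t)^gamma so training focuses on
    hard or rare classes, the imbalance problem UFCE also targets.
    probs: (N, C) softmax outputs; targets: (N,) integer class ids."""
    p_t = probs[np.arange(len(targets)), targets]
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + eps)))

probs = np.array([[0.9, 0.1],   # confident, correct -> strongly down-weighted
                  [0.4, 0.6]])  # less confident -> dominates the loss
targets = np.array([0, 1])
loss = focal_cross_entropy(probs, targets)        # focal loss, gamma = 2
ce = focal_cross_entropy(probs, targets, gamma=0) # gamma = 0 recovers plain CE
```

With `gamma=0` the modulating factor is 1, so the function reduces to ordinary cross-entropy; any `gamma > 0` strictly lowers the contribution of well-classified examples.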
arXiv Detail & Related papers (2024-05-31T16:32:46Z)
- Probabilistic Contrastive Learning with Explicit Concentration on the Hypersphere [3.572499139455308]
This paper introduces a new perspective on incorporating uncertainty into contrastive learning by embedding representations within a spherical space.
We leverage the concentration parameter, kappa, as a direct, interpretable measure to quantify uncertainty explicitly.
arXiv Detail & Related papers (2024-05-26T07:08:13Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
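That three-step procedure (clarify, query, ensemble) can be sketched with toy stand-ins; `clarify` and `llm` below are placeholders for real LLM calls, and the entropy of the ensembled answer distribution is one simple way to score the input's ambiguity:

```python
import math
from collections import Counter

def clarification_ensemble(query, clarify, llm, n=5):
    """Sketch of input clarification ensembling: generate n clarified
    versions of an ambiguous query, ask the model about each, and
    ensemble the answers. Disagreement across clarifications reflects
    uncertainty attributable to the input itself."""
    answers = [llm(clarify(query, i)) for i in range(n)]
    counts = Counter(answers)
    total = sum(counts.values())
    probs = {a: c / total for a, c in counts.items()}
    entropy = -sum(p * math.log(p) for p in probs.values())
    return probs, entropy

# Toy stand-ins: an ambiguous date resolved differently per clarification
clarify = lambda q, i: f"{q} (interpretation {i % 2})"
llm = lambda prompt: "March 4" if "0)" in prompt else "April 3"
probs, entropy = clarification_ensemble("When is the deadline, 3/4?", clarify, llm, n=4)
```

Here the two readings of "3/4" split the ensemble evenly, so the answer entropy reaches its maximum for two outcomes, flagging the input as ambiguous.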
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with surprisingly simple formations and without requiring extra modules or multiple inferences, can provide uncertainty estimations with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z)
- Probabilistic computation and uncertainty quantification with emerging covariance [11.79594512851008]
Building robust, interpretable, and secure AI systems requires quantifying and representing uncertainty from a probabilistic perspective.
Probabilistic computation presents significant challenges for most conventional artificial neural networks.
arXiv Detail & Related papers (2023-05-30T17:55:29Z)
- Integrating Uncertainty into Neural Network-based Speech Enhancement [27.868722093985006]
Supervised masking approaches in the time-frequency domain aim to employ deep neural networks to estimate a multiplicative mask to extract clean speech.
This leads to a single estimate for each input without any guarantees or measures of reliability.
We study the benefits of modeling uncertainty in clean speech estimation.
arXiv Detail & Related papers (2023-05-15T15:55:12Z)
- Non-Linear Spectral Dimensionality Reduction Under Uncertainty [107.01839211235583]
We propose a new dimensionality reduction framework, called NGEU, which leverages uncertainty information and directly extends several traditional approaches.
We show that the proposed NGEU formulation exhibits a global closed-form solution, and we analyze, based on the Rademacher complexity, how the underlying uncertainties theoretically affect the generalization ability of the framework.
arXiv Detail & Related papers (2022-02-09T19:01:33Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of predictions for a classifier, based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
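The core estimator is classical and can be sketched directly: weight each training label by a kernel on the distance to the query point and normalize. The Gaussian kernel, bandwidth, and toy data below are illustrative choices, not necessarily those used in NUQ.

```python
import numpy as np

def nadaraya_watson_label_dist(x, X_train, y_train, n_classes, bandwidth=1.0):
    """Nadaraya-Watson kernel estimate of the conditional label
    distribution p(y | x): each training label votes with a Gaussian
    kernel weight on its distance to x, then votes are normalized.
    A flat estimate (high entropy) signals an uncertain prediction."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    probs = np.array([w[y_train == c].sum() for c in range(n_classes)])
    return probs / probs.sum()

X_train = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
y_train = np.array([0, 0, 1])
# Query near the class-0 cluster: confident; query between clusters: uncertain
p_near = nadaraya_watson_label_dist(np.array([0.05, 0.0]), X_train, y_train, 2)
p_far = nadaraya_watson_label_dist(np.array([2.5, 2.5]), X_train, y_train, 2)
```

Near the class-0 cluster the estimate is nearly a point mass; midway between the clusters the kernel weights are comparable, so the label distribution flattens and signals uncertainty.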
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.