Improving Semantic Uncertainty Quantification in LVLMs with Semantic Gaussian Processes
- URL: http://arxiv.org/abs/2512.14177v1
- Date: Tue, 16 Dec 2025 08:15:24 GMT
- Title: Improving Semantic Uncertainty Quantification in LVLMs with Semantic Gaussian Processes
- Authors: Joseph Hoche, Andrei Bursuc, David Brellmann, Gilles Louppe, Pavel Izmailov, Angela Yao, Gianni Franchi
- Abstract summary: We propose a Bayesian framework that quantifies semantic uncertainty by analyzing the geometric structure of answer embeddings. SGPU maps generated answers into a dense semantic space, computes the Gram matrix of their semantic embeddings, and summarizes their semantic configuration. We show that SGPU transfers across models and modalities, indicating that its spectral representation captures general patterns of semantic uncertainty.
- Score: 60.75226150503949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Vision-Language Models (LVLMs) often produce plausible but unreliable outputs, making robust uncertainty estimation essential. Recent work on semantic uncertainty estimates relies on external models to cluster multiple sampled responses and measure their semantic consistency. However, these clustering methods are often fragile, highly sensitive to minor phrasing variations, and can incorrectly group or separate semantically similar answers, leading to unreliable uncertainty estimates. We propose Semantic Gaussian Process Uncertainty (SGPU), a Bayesian framework that quantifies semantic uncertainty by analyzing the geometric structure of answer embeddings, avoiding brittle clustering. SGPU maps generated answers into a dense semantic space, computes the Gram matrix of their embeddings, and summarizes their semantic configuration via the eigenspectrum. This spectral representation is then fed into a Gaussian Process Classifier that learns to map patterns of semantic consistency to predictive uncertainty, and that can be applied in both black-box and white-box settings. Across six LLMs and LVLMs on eight datasets spanning VQA, image classification, and textual QA, SGPU consistently achieves state-of-the-art calibration (ECE) and discriminative (AUROC, AUARC) performance. We further show that SGPU transfers across models and modalities, indicating that its spectral representation captures general patterns of semantic uncertainty.
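The pipeline described in the abstract (embed sampled answers, build their Gram matrix, summarize it by its eigenspectrum, and classify with a Gaussian process) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the embeddings are synthetic stand-ins for a real sentence-embedding model, the `spectral_features` helper and its parameters are hypothetical, and scikit-learn's `GaussianProcessClassifier` is assumed as a generic GP classifier.

```python
# Sketch of an SGPU-style pipeline: Gram-matrix eigenspectrum features of a
# set of sampled answers, fed to a Gaussian Process Classifier.
# All names and hyperparameters here are illustrative assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

rng = np.random.default_rng(0)

def spectral_features(answer_embeddings: np.ndarray, k: int = 5) -> np.ndarray:
    """Top-k normalized eigenvalues of the Gram matrix of answer embeddings."""
    # L2-normalize rows so the Gram matrix holds cosine similarities.
    E = answer_embeddings / np.linalg.norm(answer_embeddings, axis=1, keepdims=True)
    gram = E @ E.T                        # pairwise semantic similarity
    eig = np.linalg.eigvalsh(gram)[::-1]  # eigenvalues, largest first
    eig = eig / eig.sum()                 # spectrum normalized to sum to 1
    out = np.zeros(k)
    out[: min(k, eig.size)] = eig[:k]
    return out

# Toy training set: semantically consistent answer sets (tight clusters,
# label 1) vs. inconsistent sets (scattered embeddings, label 0).
X, y = [], []
for _ in range(40):
    center = rng.normal(size=16)
    X.append(spectral_features(center + 0.05 * rng.normal(size=(8, 16))))
    y.append(1)
    X.append(spectral_features(rng.normal(size=(8, 16))))
    y.append(0)

clf = GaussianProcessClassifier(random_state=0).fit(np.array(X), np.array(y))
probs = clf.predict_proba(np.array(X))  # per-set confidence estimates
```

A consistent answer set concentrates the spectrum in one dominant eigenvalue, while a scattered set spreads it out, which is the geometric signal the classifier learns from; the black-box setting applies because only sampled answers, not model internals, are needed.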
Related papers
- ReFRAME or Remain: Unsupervised Lexical Semantic Change Detection with Frame Semantics [1.1340133299604382]
We develop a new method for detecting semantic change based on frame semantics. We show that this method is effective for detecting semantic change and can even outperform many distributional semantic models.
arXiv Detail & Related papers (2026-02-04T13:00:49Z)
- Transparent Semantic Change Detection with Dependency-Based Profiles [1.1340133299604382]
We investigate an alternative method which relies purely on dependency co-occurrence patterns of words. We demonstrate that it is effective for semantic change detection and even outperforms a number of distributional semantic models.
arXiv Detail & Related papers (2026-01-06T10:25:36Z)
- EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images [7.069718718698565]
Existing mapping methods often suffer from overconfident semantic predictions, and sparse and noisy depth sensing. We introduce EvidMTL, a multi-task learning framework that uses evidential heads for depth estimation and semantic segmentation. We present EvidKimera, an uncertainty-aware semantic surface mapping framework, which uses evidential depth and semantics prediction for improved 3D metric-semantic consistency.
arXiv Detail & Related papers (2025-03-06T13:56:48Z)
- Post-hoc Probabilistic Vision-Language Models [54.05237186168399]
Vision-language models (VLMs) have found remarkable success in classification, retrieval, and generative tasks. We propose post-hoc uncertainty estimation in VLMs that does not require additional training. Our results show promise for safety-critical applications of large-scale models.
arXiv Detail & Related papers (2024-12-08T18:16:13Z)
- LatentBKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty [6.986230616834552]
This paper introduces a novel probabilistic mapping algorithm, LatentBKI, which enables open-vocabulary mapping with quantifiable uncertainty. LatentBKI is evaluated against similar explicit semantic mapping and VL mapping frameworks on the popular Matterport3D and Semantic KITTI datasets. Real-world experiments demonstrate applicability to challenging indoor environments.
arXiv Detail & Related papers (2024-10-15T17:02:32Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- Spatially Varying Label Smoothing: Capturing Uncertainty from Expert Annotations [19.700271444378618]
The task of image segmentation is inherently noisy due to ambiguities regarding the exact location of boundaries between anatomical structures.
We argue that this information can be extracted from the expert annotations at no extra cost, and it can lead to improved calibration between soft probabilistic predictions and the underlying uncertainty.
We build upon label smoothing (LS), in which a network is trained on 'blurred' versions of the ground-truth labels, an approach shown to be effective for calibrating output predictions.
arXiv Detail & Related papers (2021-04-12T19:35:51Z)
- Deep Clustering by Semantic Contrastive Learning [67.28140787010447]
We introduce a novel variant called Semantic Contrastive Learning (SCL).
It explores the characteristics of both conventional contrastive learning and deep clustering.
It can amplify the strengths of contrastive learning and deep clustering in a unified approach.
arXiv Detail & Related papers (2021-03-03T20:20:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.