Investigating Human-Aligned Large Language Model Uncertainty
- URL: http://arxiv.org/abs/2503.12528v1
- Date: Sun, 16 Mar 2025 14:45:43 GMT
- Title: Investigating Human-Aligned Large Language Model Uncertainty
- Authors: Kyle Moore, Jesse Roberts, Daryl Watson, Pamela Wisniewski
- Abstract summary: We investigate a variety of uncertainty measures in order to identify measures that correlate with human group-level uncertainty. We find that Bayesian measures and a variation on entropy measures, top-k entropy, tend to agree with human behavior as a function of model size. We find that some strong measures decrease in human-similarity with model size, but, by multiple linear regression, we find that combining multiple uncertainty measures provides comparable human-alignment with reduced size-dependency.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has sought to quantify large language model uncertainty to facilitate model control and modulate user trust. Previous works focus on measures of uncertainty that are theoretically grounded or reflect the average overt behavior of the model. In this work, we investigate a variety of uncertainty measures in order to identify measures that correlate with human group-level uncertainty. We find that Bayesian measures and a variation on entropy measures, top-k entropy, tend to agree with human behavior as a function of model size. We find that some strong measures decrease in human-similarity with model size, but, by multiple linear regression, we find that combining multiple uncertainty measures provides comparable human-alignment with reduced size-dependency.
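The abstract names the top-k entropy measure without defining it. Under a common interpretation (Shannon entropy over the k most probable next tokens, renormalized to sum to one), a minimal sketch looks like the following; the function name, the choice of k, and the renormalization step are assumptions for illustration, not the paper's exact definition:

```python
import math

def top_k_entropy(probs, k=10):
    """Shannon entropy over the k most probable tokens, renormalized.

    A sketch of the 'top-k entropy' idea from the abstract; the
    paper's exact formulation may differ.
    """
    top = sorted(probs, reverse=True)[:k]
    total = sum(top)
    if total == 0:
        return 0.0
    renorm = [p / total for p in top]
    return -sum(p * math.log(p) for p in renorm if p > 0)

# Example: a next-token distribution concentrated on a few answers.
dist = [0.5, 0.3, 0.1, 0.05, 0.03, 0.02]
print(top_k_entropy(dist, k=3))
```

Restricting entropy to the top k tokens discards the long tail of near-zero-probability tokens, which would otherwise dominate full-vocabulary entropy even when the model is effectively certain.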
Related papers
- Human-Alignment and Calibration of Inference-Time Uncertainty in Large Language Models [0.0]
We evaluate a collection of inference-time uncertainty measures to determine how closely they align with both human group-level uncertainty and traditional notions of model calibration. We find that numerous measures show evidence of strong alignment to human uncertainty, despite the lack of alignment to human answer preference.
arXiv Detail & Related papers (2025-08-11T17:22:45Z) - On Equivariant Model Selection through the Lens of Uncertainty [49.137341292207]
Equivariant models leverage prior knowledge on symmetries to improve predictive performance, but misspecified architectural constraints can harm it instead. We compare frequentist (via Conformal Prediction), Bayesian (via the marginal likelihood), and calibration-based measures to naive error-based evaluation. We find that uncertainty metrics generally align with predictive performance, but Bayesian model evidence does so inconsistently.
arXiv Detail & Related papers (2025-06-23T13:35:06Z) - Predictive Multiplicity in Survival Models: A Method for Quantifying Model Uncertainty in Predictive Maintenance Applications [0.0]
We frame predictive multiplicity as a critical concern in survival-based models.
We introduce formal measures -- ambiguity, discrepancy, and obscurity -- to quantify it.
This is particularly relevant for downstream tasks such as maintenance scheduling.
arXiv Detail & Related papers (2025-04-16T15:04:00Z) - Complexity Matters: Effective Dimensionality as a Measure for Adversarial Robustness [0.7366405857677227]
In this work, we investigate the relationship between a model's effective dimensionality and its robustness properties.
We run experiments on commercial-scale models that are often used in real-world environments such as YOLO and ResNet.
We reveal a near-linear inverse relationship between effective dimensionality and adversarial robustness; that is, models with a lower effective dimensionality exhibit better robustness.
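The summary does not define effective dimensionality. A common definition in the deep learning literature sums eigenvalue ratios of the loss Hessian (or a parameter covariance); the sketch below assumes that definition, which may not match the paper's:

```python
def effective_dimensionality(eigenvalues, alpha=1.0):
    """Effective dimensionality as commonly defined:
    N_eff = sum_i lambda_i / (lambda_i + alpha),
    where lambda_i are eigenvalues of e.g. the loss Hessian and
    alpha is a regularization constant. Assumed definition for
    illustration; the paper may use a different one.
    """
    return sum(lam / (lam + alpha) for lam in eigenvalues)

# A rapidly decaying spectrum yields a low effective dimensionality:
spectrum = [100.0, 1.0, 0.01, 0.001]
print(effective_dimensionality(spectrum, alpha=1.0))  # ≈ 1.5
```

Intuitively, eigenvalues much larger than alpha each contribute roughly 1 while eigenvalues much smaller contribute roughly 0, so N_eff counts the "dominant" directions in parameter space.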
arXiv Detail & Related papers (2024-10-24T09:01:34Z) - Uncertainty-aware Human Mobility Modeling and Anomaly Detection [28.311683535974634]
We study how to model human agents' mobility behavior toward effective anomaly detection.
We model GPS data as a sequence of stay-point events, each with a set of characterizing spatiotemporal features.
Experiments on large expert-simulated datasets with tens of thousands of agents demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2024-10-02T06:57:08Z) - Uncertainty in Language Models: Assessment through Rank-Calibration [65.10149293133846]
Language Models (LMs) have shown promising performance in natural language generation.
It is crucial to correctly quantify their uncertainty in responding to given inputs.
We develop a novel and practical framework, termed $Rank$-$Calibration$, to assess uncertainty and confidence measures for LMs.
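The core intuition behind rank-based assessment of uncertainty measures is that higher uncertainty should monotonically correspond to lower generation quality. This can be loosely illustrated with a Spearman-style rank correlation between negated uncertainty and quality scores; note this simple sketch is not the paper's Rank-Calibration metric, and the data below is hypothetical:

```python
def ranks(xs):
    # Assign ranks 0..n-1 by sorted order (ties broken arbitrarily).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def rank_agreement(uncertainty, quality):
    """Spearman-style correlation between -uncertainty and quality.

    Values near +1 mean higher uncertainty reliably signals lower
    quality, i.e. the monotone relationship a rank-based assessment
    checks for. Illustrative only; not the paper's metric.
    """
    n = len(uncertainty)
    ru = ranks([-u for u in uncertainty])
    rq = ranks(quality)
    d2 = sum((a - b) ** 2 for a, b in zip(ru, rq))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

u = [0.9, 0.2, 0.5, 0.1]   # hypothetical uncertainty scores
q = [0.1, 0.8, 0.4, 0.95]  # hypothetical quality scores
print(rank_agreement(u, q))  # 1.0: uncertainty perfectly anti-tracks quality
```

Working with ranks rather than raw values makes the assessment insensitive to the scale of any particular uncertainty measure, which is what allows heterogeneous measures to be compared.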
arXiv Detail & Related papers (2024-04-04T02:31:05Z) - Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with a surprisingly simple formulation and without requiring extra modules or multiple inferences, can provide uncertainty estimations with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z) - Quantification of Uncertainty with Adversarial Models [6.772632213236167]
Quantifying uncertainty is important for actionable predictions in real-world applications.
We suggest Quantification of Uncertainty with Adversarial Models (QUAM).
QUAM identifies regions where the whole product under the integral is large, not just the posterior.
arXiv Detail & Related papers (2023-07-06T17:56:10Z) - Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting that includes multiple models and supports several datasets.
We devise two types of uncertainty in the problem to increase performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z) - The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z) - Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative model-based methods, and explain their pros and cons when used in fully/semi/weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z) - Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that predictive uncertainty estimated by the current methods does not highly correlate with prediction error.
We propose a novel method that estimates the target labels and magnitude of the prediction error in two steps.
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.