Can Linear Probes Measure LLM Uncertainty?
- URL: http://arxiv.org/abs/2510.04108v1
- Date: Sun, 05 Oct 2025 09:14:57 GMT
- Title: Can Linear Probes Measure LLM Uncertainty?
- Authors: Ramzi Dakhmouche, Adrien Letellier, Hossein Gorji
- Abstract summary: Uncertainty Quantification (UQ) represents a key aspect for reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. We show that taking a principled approach via Bayesian statistics leads to improved performance despite leveraging the simplest possible model, namely linear regression. We infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, leading to an efficient UQ scheme.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective Uncertainty Quantification (UQ) is a key requirement for the reliable deployment of Large Language Models (LLMs) in automated decision-making and beyond. Yet, for LLM generation with a multiple-choice structure, the state of the art in UQ is still dominated by the naive baseline of the maximum softmax score. To address this shortcoming, we demonstrate that a principled approach via Bayesian statistics leads to improved performance despite using the simplest possible model, namely linear regression. More precisely, we propose to train multiple Bayesian linear models, each predicting the output of a layer given the output of the previous one. Based on the obtained layer-level posterior distributions, we infer the global uncertainty level of the LLM by identifying a sparse combination of distributional features, leading to an efficient UQ scheme. Numerical experiments on various LLMs show consistent improvement over state-of-the-art baselines.
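The abstract only outlines the pipeline, so the sketch below is a toy reconstruction under stated assumptions: synthetic arrays stand in for real layer activations, scikit-learn's BayesianRidge plays the role of the per-layer Bayesian linear models, and a Lasso fit supplies the sparse combination of posterior features. None of these are the authors' exact estimators.

```python
# Toy sketch of the pipeline: (1) fit Bayesian linear models predicting each
# layer from the previous one, (2) turn layer-level posteriors into
# distributional features, (3) combine features sparsely into one score.
import numpy as np
from sklearn.linear_model import BayesianRidge, Lasso

rng = np.random.default_rng(0)
n, n_layers, d = 200, 5, 16
acts = rng.normal(size=(n, n_layers, d))           # stand-in LLM activations
errors = rng.integers(0, 2, size=n).astype(float)  # 1 = model answered wrongly

feats = []
for l in range(n_layers - 1):
    stds, resids = [], []
    for j in range(d):                 # one Bayesian linear model per coordinate
        m = BayesianRidge().fit(acts[:, l], acts[:, l + 1, j])
        mu, sd = m.predict(acts[:, l], return_std=True)
        stds.append(sd)
        resids.append(np.abs(acts[:, l + 1, j] - mu))
    # distributional features of this layer's posterior predictive
    feats += [np.mean(stds, axis=0), np.mean(resids, axis=0)]
X = np.stack(feats, axis=1)            # shape (n, 2 * (n_layers - 1))

# sparse combination of layer-level features into a scalar uncertainty score
uq = Lasso(alpha=0.01).fit(X, errors)
print("selected features:", np.flatnonzero(uq.coef_))
print("uncertainty scores:", uq.predict(X)[:5])
```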
Related papers
- LLM-BI: Towards Fully Automated Bayesian Inference with Large Language Models [0.0]
This paper investigates the feasibility of using a Large Language Model (LLM) to automate the specification of prior distributions and likelihoods. As a proof of concept, we present two experiments focused on Bayesian linear regression. Our results validate the potential of LLMs to automate key steps in Bayesian modeling.
arXiv Detail & Related papers (2025-08-07T00:00:59Z)
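A minimal illustration of the LLM-BI idea, with the LLM step mocked: the prior an LLM would propose from a plain-text problem description is hard-coded here, and the Bayesian linear regression posterior is computed in closed form (conjugate Gaussian prior, known noise variance; all names are hypothetical).

```python
# Sketch of LLM-assisted Bayesian linear regression: an LLM would propose the
# prior below from a problem description; here that call is a stub.
import numpy as np

def llm_suggested_prior(description: str):
    # Stand-in for an LLM call; the paper automates this step.
    return {"prior_mean": np.zeros(2), "prior_var": 10.0, "noise_var": 1.0}

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.uniform(-2, 2, 50)])  # intercept + slope
y = X @ np.array([1.0, 2.5]) + rng.normal(scale=1.0, size=50)

p = llm_suggested_prior("predict y from x; effects expected to be moderate")
S0_inv = np.eye(2) / p["prior_var"]                # prior precision
Sn = np.linalg.inv(S0_inv + X.T @ X / p["noise_var"])
mn = Sn @ (S0_inv @ p["prior_mean"] + X.T @ y / p["noise_var"])
print("posterior mean:", mn)   # close to the true coefficients [1.0, 2.5]
```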
- Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models [76.17975723711886]
Uncertainty quantification (UQ) is a prominent approach for eliciting truthful answers from large language models (LLMs). In this work, we adapt Mahalanobis Distance (MD) - a well-established UQ technique in classification tasks - for text generation. Our method extracts token embeddings from multiple layers of LLMs, computes MD scores for each token, and uses linear regression trained on these features to provide robust uncertainty scores.
arXiv Detail & Related papers (2025-02-20T10:25:13Z)
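A compact sketch of the Mahalanobis-distance recipe summarized above, with synthetic arrays in place of real token embeddings: a mean and covariance are fit on reference embeddings, each query is scored by its squared MD per layer, and a linear regression maps the MD features to an uncertainty score. The labels here are placeholders, not the paper's training signal.

```python
# Sketch of MD-based UQ on token embeddings (synthetic stand-ins).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n_ref, n_new, n_layers, d = 500, 100, 4, 16
ref = rng.normal(size=(n_ref, n_layers, d))        # in-distribution embeddings
new = rng.normal(size=(n_new, n_layers, d)) * 1.3  # mildly shifted queries

def md_scores(E_ref, E):
    mu = E_ref.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(E_ref, rowvar=False) + 1e-3 * np.eye(E.shape[1]))
    diff = E - mu
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared MD

# one MD feature per layer, combined by a trained linear regression
X = np.stack([md_scores(ref[:, l], new[:, l]) for l in range(n_layers)], axis=1)
y = rng.uniform(size=n_new)  # placeholder uncertainty targets (e.g., error rate)
scorer = LinearRegression().fit(X, y)
print("uncertainty scores:", scorer.predict(X)[:5])
```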
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression. LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z)
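Standard Lasso implementations take a single penalty, but per-feature penalty factors can be emulated by column rescaling: dividing feature j by its factor w_j is equivalent to penalizing its original coefficient by w_j. A hedged sketch with the LLM-generated factors mocked:

```python
# Sketch of penalty-weighted Lasso; the LLM-proposed factors are hard-coded.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=200)

# Stand-in for LLM output: a lower factor means the feature was judged relevant.
penalty = np.array([0.2, 2.0, 0.2, 2.0, 2.0])

model = Lasso(alpha=0.1).fit(X / penalty, y)   # rescale columns by penalties
coef = model.coef_ / penalty                   # map back to the original scale
print("recovered coefficients:", np.round(coef, 2))
```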
- Uncertainty Quantification for LLMs through Minimum Bayes Risk: Bridging Confidence and Consistency [66.96286531087549]
Uncertainty quantification (UQ) methods for Large Language Models (LLMs) encompass a variety of approaches. We propose a novel approach to integrating model confidence with output consistency, resulting in a family of efficient and robust UQ methods. We evaluate our approach across various tasks such as question answering, abstractive summarization, and machine translation.
arXiv Detail & Related papers (2025-02-07T14:30:12Z)
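One way to read "bridging confidence and consistency" is a Minimum Bayes Risk score: weight sampled answers by the model's sequence probabilities and pick the candidate with the highest expected similarity to the rest. The toy below (samples, log-probs, and the similarity function are all stand-ins) is one such reading, not the paper's exact estimator.

```python
# Toy MBR-style confidence: probability-weighted expected pairwise similarity.
import numpy as np

samples = ["Paris", "Paris", "Lyon", "Paris"]      # sampled LLM answers
logp = np.array([-1.0, -1.1, -3.0, -1.2])          # stand-in sequence log-probs
w = np.exp(logp - logp.max()); w /= w.sum()        # normalized weights

def sim(a, b):                                     # toy similarity: exact match
    return float(a == b)

# MBR utility of each candidate = weighted expected similarity to the samples
utility = np.array([sum(w[j] * sim(s, samples[j]) for j in range(len(samples)))
                    for s in samples])
confidence = utility.max()                         # confidence of the MBR choice
print("MBR answer:", samples[int(utility.argmax())])
print("confidence:", round(confidence, 3), "uncertainty:", round(1 - confidence, 3))
```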
- Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models [104.55763564037831]
We train a regression model that leverages attention maps, probabilities at the current generation step, and recurrently computed uncertainty scores from previously generated tokens. Our evaluation shows that the proposed method is highly effective for selective generation, achieving substantial improvements over competing unsupervised and supervised approaches.
arXiv Detail & Related papers (2024-08-20T09:42:26Z)
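A small sketch of the recurrent supervised scheme described above, with random arrays standing in for the attention/probability features and per-token targets: the regressor is trained with the previous step's label as an extra input, then rolled forward on its own predictions at inference.

```python
# Sketch of a recurrent feature-based uncertainty regressor (stand-in data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
T, k = 50, 3
feats = rng.uniform(size=(T, k))       # e.g., max prob, entropy, attention mass
labels = rng.uniform(size=T)           # stand-in per-token uncertainty targets

# Train on features + previous label (teacher forcing on the recurrent input).
prev = np.concatenate([[0.0], labels[:-1]])
reg = LinearRegression().fit(np.column_stack([feats, prev]), labels)

# At inference, roll the predicted score forward token by token.
u, scores = 0.0, []
for t in range(T):
    u = float(reg.predict(np.concatenate([feats[t], [u]]).reshape(1, -1))[0])
    scores.append(u)
print("first token uncertainties:", np.round(scores[:5], 3))
```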
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
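A toy version of explanation-stability confidence, with explanation sampling and answer extraction mocked: each sampled explanation is mapped to the answer it supports, and confidence is read off the stability of that answer distribution.

```python
# Sketch: confidence from the distribution of answers implied by explanations.
from collections import Counter
import math

# Stand-in: the answer supported by each sampled explanation.
explanation_answers = ["B", "B", "A", "B", "B"]
counts = Counter(explanation_answers)
n = len(explanation_answers)

probs = [c / n for c in counts.values()]
entropy = -sum(p * math.log(p) for p in probs)   # spread of supported answers
top_answer, top_count = counts.most_common(1)[0]
confidence = top_count / n                       # stability of the top answer
print(f"answer={top_answer} confidence={confidence:.2f} entropy={entropy:.3f}")
```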
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
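As a scaled-down analogue of the distribution-matching objective (not the paper's GFlowNet-style fine-tuning loss), the toy below fits a categorical sampler to an unnormalized target by gradient descent on reverse KL; all quantities are synthetic.

```python
# Toy distribution matching: fit softmax(theta) to an unnormalized target
# p_tilde by descending the reverse KL, KL(q || p), with respect to the logits.
import numpy as np

p_tilde = np.array([1.0, 3.0, 0.5, 2.0])   # unnormalized posterior weights
log_pt = np.log(p_tilde)
theta = np.zeros_like(p_tilde)             # logits of the amortized sampler q

for _ in range(2000):
    q = np.exp(theta - theta.max()); q /= q.sum()
    s = np.log(q) - log_pt                 # per-state log-ratio
    grad = q * (s - (q * s).sum())         # closed-form gradient of KL(q || p)
    theta -= 0.5 * grad

q = np.exp(theta - theta.max()); q /= q.sum()
print("q      :", np.round(q, 3))
print("target :", np.round(p_tilde / p_tilde.sum(), 3))
```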