Pretrained LLMs Learn Multiple Types of Uncertainty
- URL: http://arxiv.org/abs/2505.21218v1
- Date: Tue, 27 May 2025 14:06:15 GMT
- Title: Pretrained LLMs Learn Multiple Types of Uncertainty
- Authors: Roi Cohen, Omri Fahn, Gerard de Melo
- Abstract summary: Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. In this work, we study how well LLMs capture uncertainty without being explicitly trained for it. We show that, when uncertainty is treated as a linear concept in the model's latent space, it might indeed be captured, even after pretraining alone.
- Score: 23.807232455808613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, which cause them to emit unwanted and factually incorrect text. In this work, we study how well LLMs capture uncertainty without being explicitly trained for it. We show that, when uncertainty is treated as a linear concept in the model's latent space, it might indeed be captured, even after pretraining alone. We further show that, though perhaps unintuitive, LLMs appear to capture several different types of uncertainty, each of which can be useful for predicting correctness on a specific task or benchmark. Furthermore, we provide in-depth results, such as a demonstrated correlation between our correctness prediction and the model's ability to verbally abstain from misinformation, and the lack of impact of model scaling on capturing uncertainty. Finally, we claim that unifying the uncertainty types into a single one via instruction tuning or [IDK]-token tuning helps the model in terms of correctness prediction.
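The abstract treats uncertainty as a linear concept in the model's latent space, i.e., something a simple linear readout over hidden states can recover. Below is a minimal sketch of that general idea, probing a pretrained causal LM's hidden states with a logistic-regression classifier to predict answer correctness; the model name, probed layer, and toy labeled examples are illustrative assumptions, not the authors' exact setup or data.

```python
# Minimal sketch (not the authors' exact setup): probe a pretrained LM's hidden
# states with a linear classifier to predict whether its answer will be correct.
# The model name, probed layer, and the tiny labeled dataset are assumptions
# made purely for illustration.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper studies larger pretrained LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def last_token_state(prompt: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the final prompt token at the chosen layer."""
    with torch.no_grad():
        out = model(**tok(prompt, return_tensors="pt"))
    return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

# Hypothetical supervision: prompts paired with whether the model's answer was correct.
train = [
    ("Q: What is the capital of France? A:", 1),
    ("Q: What number am I thinking of right now? A:", 0),
]

X = torch.stack([last_token_state(p) for p, _ in train]).numpy()
y = [label for _, label in train]

# The fitted linear probe plays the role of an "uncertainty direction" in latent space.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict_proba(X)[:, 1])  # predicted probability of a correct answer
```

In the paper's framing, several probes of this kind can pick up different types of uncertainty, each useful for predicting correctness on a different task or benchmark.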
Related papers
- Are vision language models robust to uncertain inputs? [5.249651874118556]
We show that newer and larger vision language models exhibit improved robustness compared to earlier models, but still suffer from a tendency to strictly follow instructions. For natural images such as ImageNet, this limitation can be overcome without pipeline modifications. We propose a novel mechanism based on caption diversity to reveal a model's internal uncertainty.
arXiv Detail & Related papers (2025-05-17T03:16:49Z) - Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence [16.311538811237536]
Large language models (LLMs) are increasingly used for factual question-answering. When answers are accompanied by verbalized expressions of uncertainty, these should reflect the error rates at the expressed level of confidence. We propose a simple procedure, uncertainty distillation, to teach an LLM to verbalize calibrated semantic confidences.
arXiv Detail & Related papers (2025-03-18T21:29:29Z) - I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token [23.02504739114444]
Large Language Models are prone to hallucinations, causing them to emit unwanted and factually incorrect text. We propose a novel calibration method that can be used to combat hallucinations. We find that models trained with our method are able to express uncertainty in places where they would previously have made mistakes. (A minimal sketch of the [IDK]-token setup appears after this list.)
arXiv Detail & Related papers (2024-12-09T17:13:20Z) - Predicting Emergent Capabilities by Finetuning [98.9684114851891]
We find that finetuning language models can shift the point in scaling at which emergence occurs towards less capable models.
We validate this approach using four standard NLP benchmarks.
We find that, in some cases, we can accurately predict whether models trained with up to 4x more compute have emerged.
arXiv Detail & Related papers (2024-11-25T01:48:09Z) - Uncertainties of Latent Representations in Computer Vision [2.33877878310217]
This thesis makes uncertainty estimates easily accessible by adding them to the latent representation vectors of pretrained computer vision models.
We show that these uncertainties about the unobservable latent representations are indeed provably correct.
arXiv Detail & Related papers (2024-08-26T14:02:30Z) - Large Language Models Must Be Taught to Know What They Don't Know [97.90008709512921]
We show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead. We also investigate the mechanisms that enable reliable uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators.
arXiv Detail & Related papers (2024-06-12T16:41:31Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification [116.77055746066375]
Large language models (LLMs) are notorious for hallucinating, i.e., producing erroneous claims in their output.
We propose a novel fact-checking and hallucination detection pipeline based on token-level uncertainty quantification. (A minimal token-scoring sketch appears after this list.)
arXiv Detail & Related papers (2024-03-07T17:44:17Z) - Uncertainty Quantification for In-Context Learning of Large Language Models [52.891205009620364]
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs).
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z) - Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between predicted confidence and actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z) - Distinguishing the Knowable from the Unknowable with Language Models [15.471748481627143]
In the absence of ground-truth probabilities, we explore a setting where, in order to disentangle a given uncertainty, a significantly larger model stands in as a proxy for the ground truth.
We show that small linear probes trained on the embeddings of frozen, pretrained models accurately predict when larger models will be more confident at the token level.
We propose a fully unsupervised method that achieves non-trivial accuracy on the same task.
arXiv Detail & Related papers (2024-02-05T22:22:49Z) - Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z)
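The [IDK]-token entry above (and the main abstract's mention of [IDK]-token tuning) relies on adding a dedicated token that the model can be tuned to emit when it is uncertain. The sketch below covers only the vocabulary and embedding plumbing, assuming a Hugging Face causal LM; the cited training recipe itself is not reproduced here.

```python
# Minimal sketch, assuming a Hugging Face causal LM: register a dedicated [IDK]
# token so the model can later be tuned to emit it when uncertain. Only the
# vocabulary/embedding plumbing is shown; the fine-tuning recipe is not.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any pretrained causal LM would do
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

tok.add_special_tokens({"additional_special_tokens": ["[IDK]"]})
model.resize_token_embeddings(len(tok))  # adds an embedding row for [IDK]

idk_id = tok.convert_tokens_to_ids("[IDK]")
print("[IDK] token id:", idk_id)

# During fine-tuning, targets the model should not answer would be labeled with
# idk_id so it learns to emit [IDK] instead of guessing; at inference, the
# probability assigned to idk_id can serve as an uncertainty signal.
```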
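For the token-level uncertainty quantification entry above, one common baseline is to score each generated token by its log-probability under the model and flag low-confidence tokens as candidates for fact-checking. The sketch below shows only that scoring step; the model choice and threshold are illustrative assumptions, not the cited pipeline.

```python
# Minimal sketch (not the cited pipeline): score each generated token by its
# log-probability and flag low-confidence tokens as candidates for fact-checking.
# The model and the -3.0 threshold are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder causal LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The Eiffel Tower is located in"
ids = tok(prompt, return_tensors="pt").input_ids
gen = model.generate(ids, max_new_tokens=10, do_sample=False)

with torch.no_grad():
    logits = model(gen).logits  # (1, seq_len, vocab)

# Log-probability of each realized token given its prefix.
logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
token_lp = logprobs.gather(-1, gen[:, 1:].unsqueeze(-1)).squeeze(-1)[0]

new_tokens = gen[0, ids.shape[1]:]    # tokens produced after the prompt
new_lp = token_lp[ids.shape[1] - 1:]  # their log-probabilities
for t, lp in zip(new_tokens, new_lp):
    flag = "  <-- low confidence" if lp.item() < -3.0 else ""
    print(f"{tok.decode(t):>12s}  logp={lp.item():.2f}{flag}")
```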