Related papers: Uncertainty-Based Abstention in LLMs Improves Safety and Reduces Hallucinations

URL: http://arxiv.org/abs/2404.10960v1
Date: Tue, 16 Apr 2024 23:56:38 GMT
Title: Uncertainty-Based Abstention in LLMs Improves Safety and Reduces Hallucinations
Authors: Christian Tomani, Kamalika Chaudhuri, Ivan Evtimov, Daniel Cremers, Mark Ibrahim
Score: 63.330182403615886
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A major barrier towards the practical deployment of large language models (LLMs) is their lack of reliability. Three situations where this is particularly apparent are correctness, hallucinations when given unanswerable questions, and safety. In all three cases, models should ideally abstain from responding, much like humans, whose ability to understand uncertainty makes us refrain from answering questions we don't know. Inspired by analogous approaches in classification, this study explores the feasibility and efficacy of abstaining while uncertain in the context of LLMs within the domain of question-answering. We investigate two kinds of uncertainties, statistical uncertainty metrics and a distinct verbalized measure, termed as In-Dialogue Uncertainty (InDU). Using these uncertainty measures combined with models with and without Reinforcement Learning with Human Feedback (RLHF), we show that in all three situations, abstention based on the right kind of uncertainty measure can boost the reliability of LLMs. By sacrificing only a few highly uncertain samples we can improve correctness by 2% to 8%, avoid 50% hallucinations via correctly identifying unanswerable questions and increase safety by 70% up to 99% with almost no additional computational overhead.
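
The idea of abstaining while uncertain can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes uncertainty is estimated as the entropy of several answers sampled for the same question, and the function names (`answer_entropy`, `answer_or_abstain`) and the threshold value are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def answer_entropy(samples):
    """Shannon entropy (in nats) of the empirical distribution over sampled answers."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def answer_or_abstain(samples, threshold):
    """Return the most frequent answer, or None (abstain) if entropy exceeds the threshold.

    `threshold` is an assumed tuning parameter; in practice it would be
    chosen on held-out data to trade coverage against reliability.
    """
    if answer_entropy(samples) > threshold:
        return None
    return Counter(samples).most_common(1)[0][0]


# Hypothetical usage: consistent samples are answered, scattered samples are not.
confident = answer_or_abstain(["Paris"] * 5, threshold=0.5)
uncertain = answer_or_abstain(["1912", "1915", "1921", "1908"], threshold=0.5)
```

Here the model answers when its sampled responses agree and abstains when they diverge; raising the threshold answers more questions at the cost of more errors.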

Related papers

Do not Abstain! Identify and Solve the Uncertainty [25.744791822890036]
We introduce ConfuseBench, a benchmark mainly focused on three types of uncertainty: document scarcity, limited capability, and query ambiguity. Experiments reveal that current LLMs struggle to accurately identify the root cause of uncertainty and solve it. We first generate context-aware inquiries that highlight the confusing aspect of the original query. Then we judge the source of uncertainty based on the uniqueness of the inquiry's answer.
arXiv Detail & Related papers (2025-06-01T02:15:17Z)
Uncertainty Distillation: Teaching Language Models to Express Semantic Confidence [16.311538811237536]
Large language models (LLMs) are increasingly used for factual question-answering. For these verbalized expressions of uncertainty to be meaningful, they should reflect the error rates at the expressed level of confidence. Many prior methods calculate lexical uncertainty, estimating a model's confidence in the specific string it generated.
arXiv Detail & Related papers (2025-03-18T21:29:29Z)
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations [51.92795774118647]
We find that "verbal uncertainty" is governed by a single linear feature in the representation space of LLMs. We show that this has only moderate correlation with the actual "semantic uncertainty" of the model.
arXiv Detail & Related papers (2025-03-18T17:51:04Z)
Estimating LLM Uncertainty with Evidence [66.51144261657983]
We present Logits-induced token uncertainty (LogTokU) as a framework for estimating decoupled token uncertainty in Large Language Models. We employ evidence modeling to implement LogTokU and use the estimated uncertainty to guide downstream tasks.
arXiv Detail & Related papers (2025-02-01T03:18:02Z)
VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation [18.873512856021357]
We introduce VL-Uncertainty, the first uncertainty-based framework for detecting hallucinations in large vision-language models. We measure uncertainty by analyzing the prediction variance across semantically equivalent but perturbed prompts. When LVLMs are highly confident, they provide consistent responses to semantically equivalent queries. But, when uncertain, the responses of the target LVLM become more random.
arXiv Detail & Related papers (2024-11-18T04:06:04Z)
LoGU: Long-form Generation with Uncertainty Expressions [49.76417603761989]
We introduce the task of Long-form Generation with Uncertainty (LoGU). We identify two key challenges: Uncertainty Suppression and Uncertainty Misalignment. Our framework adopts a divide-and-conquer strategy, refining uncertainty based on atomic claims. Experiments on three long-form instruction following datasets show that our method significantly improves accuracy, reduces hallucinations, and maintains the comprehensiveness of responses.
arXiv Detail & Related papers (2024-10-18T09:15:35Z)
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty [10.154013836043816]
We propose a new Multi-Answer Question Answering dataset, MAQA, consisting of world knowledge, mathematical reasoning, and commonsense reasoning tasks. Our findings show that entropy and consistency-based methods estimate the model uncertainty well even under data uncertainty. We believe our observations will pave the way for future work on uncertainty quantification in realistic settings.
arXiv Detail & Related papers (2024-08-13T11:17:31Z)
Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness [106.52630978891054]
We present a taxonomy of uncertainty specific to vision-language AI systems. We also introduce a new metric, confidence-weighted accuracy, that is well correlated with both accuracy and calibration error.
arXiv Detail & Related papers (2024-07-02T04:23:54Z)
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important. We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z)
Semantic Density: Uncertainty Quantification for Large Language Models through Confidence Measurement in Semantic Space [14.715989394285238]
Existing Large Language Models (LLMs) do not have an inherent functionality to provide users with an uncertainty/confidence metric for each response they generate. A new framework is proposed in this paper to address these issues. Semantic density extracts uncertainty/confidence information for each response from a probability distribution perspective in semantic space.
arXiv Detail & Related papers (2024-05-22T17:13:49Z)
Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach [6.209293868095268]
We study the problem of uncertainty estimation and calibration for LLMs. We propose a supervised approach that leverages labeled datasets to estimate the uncertainty in LLMs' responses. Our method is easy to implement and adaptable to different levels of model accessibility including black box, grey box, and white box.
arXiv Detail & Related papers (2024-04-24T17:10:35Z)
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge [35.067234242461545]
Large language models (LLMs) express uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses. This work aims to systematically investigate LLMs' behaviors in such situations, emphasizing the trade-off between honesty and helpfulness.
arXiv Detail & Related papers (2023-11-16T10:02:40Z)
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability. In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling. Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination". We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.