Related papers: Cost-Effective Hallucination Detection for LLMs

Cost-Effective Hallucination Detection for LLMs

URL: http://arxiv.org/abs/2407.21424v2
Date: Fri, 9 Aug 2024 11:58:55 GMT
Title: Cost-Effective Hallucination Detection for LLMs
Authors: Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang,
Abstract summary: Large language models (LLMs) can be prone to hallucinations - generating unreliable outputs that are unfaithful to their inputs, external facts or internally inconsistent. Our pipeline for hallucination detection entails: first, producing a confidence score representing the likelihood that a generated answer is a hallucination; second, calibrating the score conditional on attributes of the inputs and candidate response; third, performing detection by thresholding the score.
Score: 11.58436181159839
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) can be prone to hallucinations - generating unreliable outputs that are unfaithful to their inputs, external facts or internally inconsistent. In this work, we address several challenges for post-hoc hallucination detection in production settings. Our pipeline for hallucination detection entails: first, producing a confidence score representing the likelihood that a generated answer is a hallucination; second, calibrating the score conditional on attributes of the inputs and candidate response; finally, performing detection by thresholding the calibrated score. We benchmark a variety of state-of-the-art scoring methods on different datasets, encompassing question answering, fact checking, and summarization tasks. We employ diverse LLMs to ensure a comprehensive assessment of performance. We show that calibrating individual scoring methods is critical for ensuring risk-aware downstream decision making. Based on findings that no individual score performs best in all situations, we propose a multi-scoring framework, which combines different scores and achieves top performance across all datasets. We further introduce cost-effective multi-scoring, which can match or even outperform more expensive detection methods, while significantly reducing computational overhead.

Related papers

HIDE and Seek: Detecting Hallucinations in Language Models via Decoupled Representations [17.673293240849787]
Contemporary Language Models (LMs) often generate content that is factually incorrect or unfaithful to the input context.<n>We propose a single-pass, training-free approach for effective Hallucination detectIon via Decoupled rEpresentations (HIDE)<n>Our results demonstrate that HIDE outperforms other single-pass methods in almost all settings.
arXiv Detail & Related papers (2025-06-21T16:02:49Z)
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers [0.0]
Hallucinations are a persistent problem with Large Language Models (LLMs) We propose a versatile framework for zero-resource hallucination detection that practitioners can apply to real-world use cases. To enhance flexibility, we introduce a tunable ensemble approach that incorporates any combination of the individual confidence scores.
arXiv Detail & Related papers (2025-04-27T14:24:45Z)
HalluCounter: Reference-free LLM Hallucination Detection in the Wild! [6.5037356041929675]
HalluCounter is a reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90% average confidence in hallucination detection across datasets.
arXiv Detail & Related papers (2025-03-06T16:59:18Z)
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models [74.40683913645731]
Zero-shot multi-label recognition (MLR) with Vision-Language Models (VLMs) faces significant challenges without training data, model tuning, or architectural modifications. Our work proposes a novel solution treating VLMs as black boxes, leveraging scores without training data or ground truth. Analysis of these prompt scores reveals VLM biases and AND''/OR' signal ambiguities, notably that maximum scores are surprisingly suboptimal compared to second-highest scores.
arXiv Detail & Related papers (2025-02-24T07:15:05Z)
Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models [20.175106988135454]
We introduce a novel Attention-Guided SElf-Reflection (AGSER) approach for zero-shot hallucination detection in Large Language Models (LLMs) The AGSER method utilizes attention contributions to categorize the input query into attentive and non-attentive queries. In addition to its efficacy in detecting hallucinations, AGSER notably reduces computational overhead, requiring only three passes through the LLM and utilizing two sets of tokens.
arXiv Detail & Related papers (2025-01-17T07:30:01Z)
Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models [7.350203999073509]
Recent studies on AI security have highlighted the vulnerability of Vision-Language Pre-training models to subtle yet intentionally designed perturbations in images and texts. To the best of our knowledge, it is the first work through multimodal decision boundaries to explore the creation of a universal, sample-agnostic perturbation that applies to any image.
arXiv Detail & Related papers (2024-08-06T06:25:39Z)
Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing. Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior to the input image. To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs [0.0]
Hallucinations pose a significant challenge to the reliability and alignment of Large Language Models (LLMs) This paper introduces an automated scalable framework that combines benchmarking LLMs' hallucination tendencies with efficient hallucination detection. The framework is domain-agnostic, allowing the use of any language model for benchmark creation or evaluation in any domain.
arXiv Detail & Related papers (2024-02-25T22:23:37Z)
Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks. We instruct an LLM to self-evaluate its answers. We benchmark a range of scoring methods based on self-evaluation.
arXiv Detail & Related papers (2023-12-14T19:09:22Z)
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus [99.33091772494751]
Large Language Models (LLMs) have gained significant popularity for their impressive performance across diverse fields. LLMs are prone to hallucinate untruthful or nonsensical outputs that fail to meet user expectations. We propose a novel reference-free, uncertainty-based method for detecting hallucinations in LLMs.
arXiv Detail & Related papers (2023-11-22T08:39:17Z)
Chainpoll: A high efficacy method for LLM hallucination detection [0.0]
We introduce ChainPoll, an innovative hallucination detection method that excels compared to its counterparts. We also unveil RealHall, a refined collection of benchmark datasets to assess hallucination detection metrics from recent studies.
arXiv Detail & Related papers (2023-10-22T14:45:14Z)
FactCHD: Benchmarking Fact-Conflicting Hallucination Detection [64.4610684475899]
FactCHD is a benchmark designed for the detection of fact-conflicting hallucinations from LLMs. FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation. We introduce Truth-Triangulator that synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2.
arXiv Detail & Related papers (2023-10-18T16:27:49Z)
A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection [63.56136319976554]
Large Language Models (LLMs) generate hallucinations, which can cause significant damage when deployed for mission-critical tasks. We propose a self-check approach based on reverse validation to detect factual errors automatically in a zero-resource fashion. We empirically evaluate our method and existing zero-resource detection methods on two datasets.
arXiv Detail & Related papers (2023-10-10T10:14:59Z)
Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input) We also introduce a method for learning to detect hallucinations using pretrained language models fine tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.