HalluShift: Measuring Distribution Shifts towards Hallucination Detection in LLMs
- URL: http://arxiv.org/abs/2504.09482v1
- Date: Sun, 13 Apr 2025 08:35:22 GMT
- Title: HalluShift: Measuring Distribution Shifts towards Hallucination Detection in LLMs
- Authors: Sharanya Dasgupta, Sujoy Nath, Arkaprabha Basu, Pourya Shamsolmoali, Swagatam Das
- Abstract summary: Large Language Models (LLMs) have recently garnered widespread attention due to their adeptness at generating innovative responses to the given prompts. In this work, we hypothesize that hallucinations stem from the internal dynamics of LLMs. We introduce an innovative approach, HalluShift, designed to analyze the distribution shifts in the internal state space.
- Score: 14.005452985740849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have recently garnered widespread attention due to their adeptness at generating innovative responses to the given prompts across a multitude of domains. However, LLMs often suffer from the inherent limitation of hallucinations and generate incorrect information while maintaining well-structured and coherent responses. In this work, we hypothesize that hallucinations stem from the internal dynamics of LLMs. Our observations indicate that, during passage generation, LLMs tend to deviate from factual accuracy in subtle parts of responses, eventually shifting toward misinformation. This phenomenon bears a resemblance to human cognition, where individuals may hallucinate while maintaining logical coherence, embedding uncertainty within minor segments of their speech. To investigate this further, we introduce an innovative approach, HalluShift, designed to analyze the distribution shifts in the internal state space and token probabilities of the LLM-generated responses. Our method attains superior performance compared to existing baselines across various benchmark datasets. Our codebase is available at https://github.com/sharanya-dasgupta001/hallushift.
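The abstract describes HalluShift only at a high level: it analyzes distribution shifts in the internal state space together with token probabilities. The sketch below is a hypothetical illustration of that general idea, not the authors' implementation. It assumes that each layer's hidden state can be softmax-normalized into a distribution and that the shift between consecutive layers can be measured with Jensen-Shannon divergence; the function names, the choice of divergence, and the toy inputs are all assumptions made for this example.

```python
import math

def softmax(values):
    """Turn a vector of raw activations into a probability distribution."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def layer_shift_features(hidden_states):
    """JS divergence between softmax-normalized states of consecutive layers.

    Assumption: one activation vector per layer; the metric is illustrative.
    """
    dists = [softmax(h) for h in hidden_states]
    return [js_divergence(a, b) for a, b in zip(dists, dists[1:])]

def token_probability_features(token_probs):
    """Simple summaries of the probabilities assigned to generated tokens."""
    return {
        "min_prob": min(token_probs),
        "mean_prob": sum(token_probs) / len(token_probs),
    }

# Toy inputs, made up for illustration (not taken from any model).
hidden_states = [
    [0.1, 0.2, 0.3, 0.4],
    [0.1, 0.2, 0.3, 0.4],   # identical to the previous layer: no shift
    [2.0, -1.0, 0.5, 0.0],  # very different: a large shift
]
token_probs = [0.9, 0.8, 0.2, 0.7]

shifts = layer_shift_features(hidden_states)
probs = token_probability_features(token_probs)

# One feature vector per response; a downstream classifier (not shown)
# could be trained on such vectors to flag likely hallucinations.
features = shifts + [probs["min_prob"], probs["mean_prob"]]
```

In this toy setup, a feature vector combines per-layer shift scores with token-probability summaries. The linked codebase is the place to check which measures and classifier the authors actually use.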
Related papers
- Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models [0.0]
Hallucinations in large language models (LLMs) present a growing challenge across real-world applications.
We propose a prompt-based framework to systematically trigger and quantify hallucination.
arXiv Detail & Related papers (2025-05-01T14:33:47Z)
- Can LLMs Detect Intrinsic Hallucinations in Paraphrasing and Machine Translation? [7.416552590139255]
We evaluate a suite of open-access LLMs on their ability to detect intrinsic hallucinations in two conditional generation tasks.
We study how model performance varies across tasks and language.
We find that performance varies across models but is consistent across prompts.
arXiv Detail & Related papers (2025-04-29T12:30:05Z)
- Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs [62.9348974370985]
We propose attention reallocation (AttnReal) to mitigate hallucinations with nearly zero extra cost.
Our approach is motivated by the key observation that an MLLM's unreasonable attention distribution causes features to be dominated by historical output tokens.
Based on the observations, AttnReal recycles excessive attention from output tokens and reallocates it to visual tokens, which reduces MLLM's reliance on language priors.
arXiv Detail & Related papers (2025-03-11T11:52:37Z)
- Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning [151.4060202671114]
Multimodal large language models (MLLMs) have shown unprecedented capabilities in advancing vision-language tasks.
This paper introduces a novel bottom-up reasoning framework to address hallucinations in MLLMs.
Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge.
arXiv Detail & Related papers (2024-12-15T09:10:46Z)
- Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
They generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z)
- LLM Internal States Reveal Hallucination Risk Faced With a Query [62.29558761326031]
Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries.
This paper investigates whether Large Language Models can estimate their own hallucination risk before response generation.
By a probing estimator, we leverage LLM self-assessment, achieving an average hallucination estimation accuracy of 84.32% at run time.
arXiv Detail & Related papers (2024-07-03T17:08:52Z)
- Exploring and Evaluating Hallucinations in LLM-Powered Code Generation [14.438161741833687]
Large Language Models (LLMs) produce outputs that deviate from users' intent, exhibit internal inconsistencies, or misalign with factual knowledge.
Existing work mainly focuses on investigating hallucination in the domain of natural language generation.
We conduct a thematic analysis of the LLM-generated code to summarize and categorize the hallucinations present in it.
We propose HalluCode, a benchmark for evaluating the performance of code LLMs in recognizing hallucinations.
arXiv Detail & Related papers (2024-04-01T07:31:45Z)
- Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models [68.91592125175787]
Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs).
We present Rowen, a novel approach that enhances LLMs with a selective retrieval augmentation process tailored to address hallucinations.
arXiv Detail & Related papers (2024-02-16T11:55:40Z)
- A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions [40.79317187623401]
The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP).
LLMs are prone to hallucination, generating plausible yet nonfactual content.
This phenomenon raises significant concerns over the reliability of LLMs in real-world information retrieval systems.
arXiv Detail & Related papers (2023-11-09T09:25:37Z)
- Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.