The Curious Case of Hallucinatory (Un)answerability: Finding Truths in
the Hidden States of Over-Confident Large Language Models
- URL: http://arxiv.org/abs/2310.11877v2
- Date: Sun, 12 Nov 2023 13:14:36 GMT
- Title: The Curious Case of Hallucinatory (Un)answerability: Finding Truths in
the Hidden States of Over-Confident Large Language Models
- Authors: Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, Shauli
Ravfogel
- Abstract summary: We study the behavior of large language models (LLMs) when presented with (un)answerable queries.
Our results show strong indications that such models encode the answerability of an input query, with the representation of the first decoded token often being a strong indicator.
- Score: 46.990141872509476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have been shown to possess impressive
capabilities, while also raising crucial concerns about the faithfulness of
their responses. A primary issue arising in this context is the management of
(un)answerable queries by LLMs, which often results in hallucinatory behavior
due to overconfidence. In this paper, we explore the behavior of LLMs when
presented with (un)answerable queries. We ask: do models represent the fact
that the question is (un)answerable when generating a hallucinatory answer? Our
results show strong indications that such models encode the answerability of an
input query, with the representation of the first decoded token often being a
strong indicator. These findings shed new light on the spatial organization
within the latent representations of LLMs, unveiling previously unexplored
facets of these models. Moreover, they pave the way for the development of
improved decoding techniques with better adherence to factual generation,
particularly in scenarios where query (un)answerability is a concern.
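The abstract's central claim suggests a simple probing setup: take the hidden state from which the model decodes its first answer token and train a lightweight classifier to predict whether the query was answerable. The sketch below is an illustrative reading of that idea, not the authors' exact pipeline; the model choice (gpt2 as a small stand-in), the toy labelled queries, and the use of a logistic-regression probe on the last layer are all assumptions made for the example.

```python
# Illustrative sketch (assumptions noted above): probe the hidden state that
# produces the first decoded token for (un)answerability.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper studies larger instruction-tuned LLMs
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def first_token_hidden_state(question: str) -> torch.Tensor:
    """Last-layer hidden state at the position that decodes the first answer token."""
    inputs = tok(question, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=1,
            output_hidden_states=True,
            return_dict_in_generate=True,
        )
    # out.hidden_states[0] is the first generation step: a tuple of per-layer
    # tensors over the prompt; take the last layer, last position.
    return out.hidden_states[0][-1][0, -1]

# Toy labelled queries (1 = answerable, 0 = unanswerable); real experiments use QA datasets.
queries = [
    ("What is the capital of France?", 1),
    ("Who wrote 'Pride and Prejudice'?", 1),
    ("What colour is the number seven?", 0),
    ("What did Napoleon tweet in 1805?", 0),
]
X = torch.stack([first_token_hidden_state(q) for q, _ in queries]).numpy()
y = [label for _, label in queries]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```

The snippet only demonstrates the mechanics of such a probe; the paper's finding is that, in its experimental setting, the first decoded token's representation is often a strong indicator of answerability.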
Related papers
- Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective [5.769786334333616]
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) applications, including automated text generation, question answering, and more.
They face a significant challenge: hallucinations, where models produce plausible-sounding but factually incorrect responses.
This paper discusses these open challenges covering state-of-the-art datasets and benchmarks as well as methods for knowledge integration and evaluating hallucinations.
arXiv Detail & Related papers (2024-11-21T16:09:05Z)
- Disentangling Memory and Reasoning Ability in Large Language Models [97.26827060106581]
We propose a new inference paradigm that decomposes the complex inference process into two distinct actions: memory recall and reasoning.
Our experiment results show that this decomposition improves model performance and enhances the interpretability of the inference process.
arXiv Detail & Related papers (2024-11-20T17:55:38Z)
- Information Anxiety in Large Language Models [21.574677910096735]
Large Language Models (LLMs) have demonstrated strong performance as knowledge repositories.
We extend this line of investigation with a comprehensive analysis of the internal reasoning and retrieval mechanisms of LLMs.
Our work focuses on three critical dimensions - the impact of entity popularity, the models' sensitivity to lexical variations in query formulation, and the progression of hidden state representations.
arXiv Detail & Related papers (2024-11-16T14:28:33Z)
- How Susceptible are LLMs to Influence in Prompts? [6.644673474240519]
Large Language Models (LLMs) are highly sensitive to prompts, including additional context provided therein.
We study how an LLM's response to multiple-choice questions changes when the prompt includes a prediction and explanation from another model.
Our findings reveal that models are strongly influenced and, when explanations are provided, are swayed regardless of the explanations' quality.
arXiv Detail & Related papers (2024-08-17T17:40:52Z)
- Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models [55.332004960574004]
Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established.
This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt.
We propose a prompt-response concept model that explains how LLMs generate responses and clarifies the relationship between prompts and response uncertainty.
arXiv Detail & Related papers (2024-07-20T11:19:58Z)
- Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
However, they sometimes generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z)
- You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments [37.03210795084276]
We examine whether the current format of prompting Large Language Models elicits responses in a consistent and robust manner.
Our experiments on 17 different LLMs reveal that even simple perturbations significantly degrade a model's question-answering ability.
Our results suggest that the currently widespread practice of prompting is insufficient to accurately and reliably capture model perceptions.
arXiv Detail & Related papers (2023-11-16T09:50:53Z)
- Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations better reveal how comprehensively language models understand the questions they are asked.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
- Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z)