The Curious Case of Hallucinatory (Un)answerability: Finding Truths in
the Hidden States of Over-Confident Large Language Models
- URL: http://arxiv.org/abs/2310.11877v2
- Date: Sun, 12 Nov 2023 13:14:36 GMT
- Title: The Curious Case of Hallucinatory (Un)answerability: Finding Truths in
the Hidden States of Over-Confident Large Language Models
- Authors: Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, Shauli
Ravfogel
- Abstract summary: We study the behavior of large language models (LLMs) when presented with (un)answerable queries.
Our results show strong indications that such models encode the answerability of an input query, with the representation of the first decoded token often being a strong indicator.
- Score: 46.990141872509476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have been shown to possess impressive
capabilities, while also raising crucial concerns about the faithfulness of
their responses. A primary issue arising in this context is the management of
(un)answerable queries by LLMs, which often results in hallucinatory behavior
due to overconfidence. In this paper, we explore the behavior of LLMs when
presented with (un)answerable queries. We ask: do models represent the fact
that the question is (un)answerable when generating a hallucinatory answer? Our
results show strong indications that such models encode the answerability of an
input query, with the representation of the first decoded token often being a
strong indicator. These findings shed new light on the spatial organization
within the latent representations of LLMs, unveiling previously unexplored
facets of these models. Moreover, they pave the way for the development of
improved decoding techniques with better adherence to factual generation,
particularly in scenarios where query (un)answerability is a concern.
Related papers
- How Susceptible are LLMs to Influence in Prompts? [6.644673474240519]
Large Language Models (LLMs) are highly sensitive to prompts, including additional context provided therein.
We study how an LLM's response to multiple-choice questions changes when the prompt includes a prediction and explanation from another model.
Our findings reveal that models are strongly influenced, and when explanations are provided they are swayed irrespective of the quality of the explanation.
arXiv Detail & Related papers (2024-08-17T17:40:52Z) - Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models [55.332004960574004]
Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established.
This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt.
We propose a prompt-response concept model that explains how LLMs generate responses and helps understand the relationship between prompts and response uncertainty.
arXiv Detail & Related papers (2024-07-20T11:19:58Z) - Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models [70.19081534515371]
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks.
They generate unfaithful or inconsistent content that deviates from the input source, leading to severe consequences.
We propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers.
arXiv Detail & Related papers (2024-07-04T18:47:42Z) - Evaluating Consistency and Reasoning Capabilities of Large Language Models [0.0]
Large Language Models (LLMs) are extensively used today across various sectors, including academia, research, business, and finance.
Despite their widespread adoption, these models often produce incorrect and misleading information, exhibiting a tendency to hallucinate.
This paper aims to evaluate and compare the consistency and reasoning capabilities of both public and proprietary LLMs.
arXiv Detail & Related papers (2024-04-25T10:03:14Z) - You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments [37.03210795084276]
We examine whether the current format of prompting Large Language Models elicits responses in a consistent and robust manner.
Our experiments on 17 different LLMs reveal that even simple perturbations significantly downgrade a model's question-answering ability.
Our results suggest that the currently widespread practice of prompting is insufficient to accurately and reliably capture model perceptions.
arXiv Detail & Related papers (2023-11-16T09:50:53Z) - Towards Mitigating Hallucination in Large Language Models via
Self-Reflection [63.2543947174318]
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks.
This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets.
arXiv Detail & Related papers (2023-10-10T03:05:44Z) - Probing the Multi-turn Planning Capabilities of LLMs via 20 Question
Games [14.063311955315077]
Large language models (LLMs) are effective at answering questions that are clearly asked.
When faced with ambiguous queries they can act unpredictably and produce incorrect outputs.
This underscores the need for the development of intelligent agents capable of asking clarification questions to resolve ambiguities effectively.
arXiv Detail & Related papers (2023-10-02T16:55:37Z) - Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z) - Siren's Song in the AI Ocean: A Survey on Hallucination in Large
Language Models [116.01843550398183]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks.
LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
arXiv Detail & Related papers (2023-09-03T16:56:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.