Are Reasoning Models More Prone to Hallucination?
- URL: http://arxiv.org/abs/2505.23646v1
- Date: Thu, 29 May 2025 16:53:41 GMT
- Title: Are Reasoning Models More Prone to Hallucination?
- Authors: Zijun Yao, Yantao Liu, Yanxu Chen, Jianhui Chen, Junfeng Fang, Lei Hou, Juanzi Li, Tat-Seng Chua
- Abstract summary: Recently evolved large reasoning models (LRMs) show powerful performance in solving complex tasks with long chain-of-thought (CoT) reasoning capability. Are reasoning models more prone to hallucination? This paper addresses the question from three perspectives.
- Score: 70.04436965009072
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently evolved large reasoning models (LRMs) show powerful performance in solving complex tasks with long chain-of-thought (CoT) reasoning capability. As these LRMs are mostly developed by post-training on formal reasoning tasks, whether their reasoning capability generalizes to help reduce hallucination in fact-seeking tasks remains unclear and debated. For instance, DeepSeek-R1 reports increased performance on SimpleQA, a fact-seeking benchmark, while OpenAI-o3 exhibits even more severe hallucination. This discrepancy naturally raises the following research question: are reasoning models more prone to hallucination? This paper addresses the question from three perspectives. (1) We first conduct a holistic evaluation of hallucination in LRMs. Our analysis reveals that LRMs which undergo the full post-training pipeline, with cold-start supervised fine-tuning (SFT) and verifiable-reward RL, generally exhibit alleviated hallucination. In contrast, both distillation alone and RL training without cold-start fine-tuning introduce more nuanced hallucinations. (2) To explore why different post-training pipelines affect hallucination in LRMs differently, we conduct behavior analysis. We characterize two critical cognitive behaviors that directly affect the factuality of an LRM: Flaw Repetition, where surface-level reasoning attempts repeatedly follow the same underlying flawed logic, and Think-Answer Mismatch, where the final answer fails to faithfully match the preceding CoT process. (3) Further, we investigate the mechanism behind the hallucination of LRMs from the perspective of model uncertainty. We find that increased hallucination in LRMs is usually associated with misalignment between model uncertainty and factual accuracy. Our work provides an initial understanding of hallucination in LRMs.
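The third finding above ties increased hallucination to a gap between a model's uncertainty and its factual accuracy. As a minimal, hypothetical illustration of how such misalignment can be quantified (this is a standard calibration metric, not necessarily the paper's exact measure), the sketch below computes the expected calibration error (ECE) from per-answer confidence scores and correctness labels; all names and the toy data are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error over equal-width confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Gap between average accuracy and average confidence in this bin,
            # weighted by the fraction of samples that fall in the bin.
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)

# Toy usage: model confidences vs. whether each fact-seeking answer was correct.
confs = [0.95, 0.80, 0.99, 0.60, 0.90]
labels = [1, 1, 0, 0, 1]
print(expected_calibration_error(confs, labels, n_bins=5))
```

A higher ECE means the model's stated or internal confidence tracks its factual accuracy poorly, which is the kind of misalignment the abstract associates with hallucination-prone LRMs.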
Related papers
- Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models [12.270274049887298]
Reasoning traces can be redundant or logically inconsistent, making them a new source of hallucination. Existing hallucination detection methods focus primarily on answer-level uncertainty. We propose RACE, a novel framework specifically tailored for hallucination detection in LRMs.
arXiv Detail & Related papers (2025-06-05T09:54:04Z) - MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM [58.2298313720146]
Multimodal hallucinations are multi-sourced and arise from diverse causes. Existing benchmarks fail to adequately distinguish between perception-induced hallucinations and reasoning-induced hallucinations.
arXiv Detail & Related papers (2025-05-30T05:54:36Z) - Auditing Meta-Cognitive Hallucinations in Reasoning Large Language Models [8.97308732968526]
We study the causality of hallucinations under constrained knowledge domains by auditing the Chain-of-Thought trajectory. Our analysis reveals that in long-CoT settings, RLLMs can iteratively reinforce biases and errors through flawed reflective reasoning. Surprisingly, even direct interventions at the origin of hallucinations often fail to reverse their effects.
arXiv Detail & Related papers (2025-05-19T14:11:09Z) - Detection and Mitigation of Hallucination in Large Reasoning Models: A Mechanistic Perspective [11.013059864022667]
Reasoning hallucinations are logically coherent but factually incorrect reasoning traces. These errors are embedded within structured reasoning, making them more difficult to detect and potentially more harmful. We propose the Reasoning Score, which quantifies the depth of reasoning by measuring the divergence between logits. We also introduce GRPO-R, an enhanced reinforcement learning algorithm that incorporates step-level deep reasoning rewards via potential-based shaping.
arXiv Detail & Related papers (2025-05-19T09:16:40Z) - Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation [49.885797244626694]
Hallucination in large multimodal models (LMMs) yields responses that appear correct but are actually incorrect. This paper aims to study the hallucination problem of LMMs in the video modality, which is dynamic and more challenging compared to static modalities like images and text.
arXiv Detail & Related papers (2025-03-25T13:12:17Z) - FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning [10.709365940160685]
Existing approaches primarily detect the presence of hallucinations but lack a nuanced understanding of their types and manifestations.
We introduce a comprehensive taxonomy that categorizes the common hallucinations in mathematical reasoning tasks into six types.
We then propose FG-PRM, an augmented model designed to detect and mitigate hallucinations in a fine-grained, step-level manner.
arXiv Detail & Related papers (2024-10-08T19:25:26Z) - Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models [13.48296910438554]
We introduce Reefknot, a comprehensive benchmark targeting relation hallucinations, comprising over 20,000 real-world samples. We provide a systematic definition of relation hallucinations, integrating perceptive and cognitive perspectives, and construct a relation-based corpus using the Visual Genome scene graph dataset. We propose a novel confidence-based mitigation strategy, which reduces the hallucination rate by an average of 9.75% across three datasets, including Reefknot.
arXiv Detail & Related papers (2024-08-18T10:07:02Z) - On Large Language Models' Hallucination with Regard to Known Facts [74.96789694959894]
Large language models are successful in answering factoid questions but are also prone to hallucination.
We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics.
Our study sheds light on the reasons for LLMs' hallucinations about facts they know and, more importantly, on accurately predicting when they are hallucinating.
arXiv Detail & Related papers (2024-03-29T06:48:30Z) - Alleviating Hallucinations of Large Language Models through Induced Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations; a rough sketch of the contrast-decoding idea appears after this list.
arXiv Detail & Related papers (2023-12-25T12:32:49Z) - Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? [73.454943870226]
This work studies a specific type of hallucination induced by semantic associations.
To quantify this phenomenon, we propose a novel probing method and benchmark called EureQA.
arXiv Detail & Related papers (2023-11-16T09:27:36Z)
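Several of the entries above rely on decoding-time or confidence-based mitigation. As a rough, hypothetical sketch of the contrast-style decoding idea behind ICD (not the authors' exact formulation), the snippet below suppresses tokens that a deliberately hallucination-prone "induced" model also favors; the function name, the toy logits, and the `alpha` hyperparameter are all assumptions for illustration.

```python
import numpy as np

def induce_then_contrast_step(base_logits, induced_logits, alpha=1.0):
    """One greedy decoding step in a contrast-style scheme.

    base_logits:    next-token logits from the model being corrected
    induced_logits: logits from a hallucination-prone ("induced") variant
    alpha:          strength of the penalty on the induced model's preferences
    """
    base_logits = np.asarray(base_logits, dtype=float)
    induced_logits = np.asarray(induced_logits, dtype=float)
    # Amplify the base model and subtract the induced model's preferences,
    # so tokens favored mainly by the hallucination-inducing signal are demoted.
    adjusted = (1.0 + alpha) * base_logits - alpha * induced_logits
    return int(np.argmax(adjusted))

# Illustrative usage over a toy vocabulary of 8 tokens.
rng = np.random.default_rng(0)
base = rng.normal(size=8)
induced = rng.normal(size=8)
print(induce_then_contrast_step(base, induced, alpha=0.5))
```

The design choice here is the usual contrastive-decoding one: the induced model acts as a negative reference, so only the difference between the two distributions, not the induced distribution itself, drives token selection.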