Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation
- URL: http://arxiv.org/abs/2511.02626v1
- Date: Tue, 04 Nov 2025 14:55:24 GMT
- Title: Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation
- Authors: Renfei Dang, Peng Hu, Changjiang Gao, Shujian Huang
- Abstract summary: Previous studies show that introducing new knowledge during large language model (LLM) fine-tuning can lead to the generation of erroneous output when tested on known information. We conduct a fine-grained analysis across multiple knowledge types and two task types, including knowledge question answering (QA) and knowledge reasoning tasks. We find that when fine-tuned on a dataset in which a specific knowledge type consists entirely of new knowledge, LLMs exhibit significantly increased hallucination tendencies. We propose KnownPatch, which patches a small number of known knowledge samples in the later stages of training, effectively alleviating new-knowledge-induced hallucinations.
- Score: 41.83870063693278
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous studies show that introducing new knowledge during large language model (LLM) fine-tuning can lead to the generation of erroneous output when tested on known information, thereby triggering factual hallucinations. However, existing studies have not deeply investigated the specific manifestations and underlying mechanisms of these hallucinations. Our work addresses this gap by designing a controlled dataset, Biography-Reasoning, and conducting a fine-grained analysis across multiple knowledge types and two task types, including knowledge question answering (QA) and knowledge reasoning tasks. We find that when fine-tuned on a dataset in which a specific knowledge type consists entirely of new knowledge, LLMs exhibit significantly increased hallucination tendencies. This suggests that the high unfamiliarity of a particular knowledge type, rather than the overall proportion of new knowledge, is the stronger driver of hallucinations, and these tendencies can even affect other knowledge types in QA tasks. To mitigate such factual hallucinations, we propose KnownPatch, which patches a small number of known-knowledge samples into the later stages of training, effectively alleviating new-knowledge-induced hallucinations. Through attention analysis, we find that learning new knowledge reduces the model's attention to key entities in the question, causing excessive focus on the surrounding context, which may increase the risk of hallucination. Moreover, this attention pattern can propagate to similar contexts, facilitating the spread of hallucinations to textually similar questions. Our method effectively mitigates the disruption that learning new knowledge causes to the model's attention on key entities, and it improves performance.
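The abstract only states that KnownPatch patches a small number of known-knowledge samples into the later stages of training. The sketch below is a minimal, hypothetical data-schedule illustration of that idea, not the authors' implementation; the function name, epoch count, and the `patch_fraction` / `patch_ratio` parameters are assumptions introduced for illustration.

```python
# Hypothetical sketch of a KnownPatch-style training schedule: train mostly on
# new-knowledge data, then mix a small set of known-knowledge samples into the
# final epochs. All hyperparameters here are illustrative, not from the paper.
import random

def build_training_schedule(new_samples, known_samples,
                            epochs=3, patch_fraction=0.2, patch_ratio=0.1,
                            seed=0):
    """Return a list of per-epoch training sets.

    Early epochs use only the new-knowledge data; the final `patch_fraction`
    of epochs additionally include roughly `patch_ratio` * len(new_samples)
    known-knowledge samples.
    """
    rng = random.Random(seed)
    schedule = []
    patch_start = int(epochs * (1 - patch_fraction))
    for epoch in range(epochs):
        data = list(new_samples)
        if epoch >= patch_start:
            # Patch in a small number of known-knowledge samples late in training.
            k = max(1, int(len(new_samples) * patch_ratio))
            data.extend(rng.sample(known_samples, min(k, len(known_samples))))
        rng.shuffle(data)
        schedule.append(data)
    return schedule

if __name__ == "__main__":
    new = [f"new_fact_{i}" for i in range(100)]
    known = [f"known_fact_{i}" for i in range(50)]
    for i, epoch_data in enumerate(build_training_schedule(new, known)):
        print(f"epoch {i}: {len(epoch_data)} samples")
```

Under these assumed settings, only the last epoch receives the known-knowledge patch, which mirrors the paper's description of intervening "in the later stages of training" rather than mixing known data throughout.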
Related papers
- HACK: Hallucinations Along Certainty and Knowledge Axes [66.66625343090743]
We propose a framework for categorizing hallucinations along two axes: knowledge and certainty. We identify a particularly concerning subset of hallucinations where models hallucinate with certainty despite having the correct knowledge internally.
arXiv Detail & Related papers (2025-10-28T09:34:31Z) - Pierce the Mists, Greet the Sky: Decipher Knowledge Overshadowing via Knowledge Circuit Analysis [41.744536797387376]
Large Language Models (LLMs) are hampered by hallucinations. Knowledge overshadowing occurs when one piece of activated knowledge inadvertently masks another relevant piece. We introduce PhantomCircuit, a framework designed to comprehensively analyze and detect knowledge overshadowing.
arXiv Detail & Related papers (2025-05-20T14:20:30Z) - KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models [17.435794516702256]
Large language models (LLMs) have significantly advanced the development of natural language processing (NLP). Model hallucinations remain a major challenge in natural language generation (NLG) tasks due to their complex causes. This work introduces a new paradigm for mitigating specific hallucination issues in generative models, enhancing their robustness and reliability in real-world applications.
arXiv Detail & Related papers (2025-03-25T09:18:27Z) - The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination [85.18584652829799]
We introduce a novel framework to quantify factual hallucinations by modeling knowledge overshadowing. We propose a new decoding strategy, CoDa, to mitigate hallucinations, which notably enhances model factuality on Overshadow (27.9%), MemoTrap (13.1%), and NQ-Swap (18.3%).
arXiv Detail & Related papers (2025-02-22T08:36:06Z) - Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning [151.4060202671114]
Multimodal large language models (MLLMs) have shown unprecedented capabilities in advancing vision-language tasks. This paper introduces a novel bottom-up reasoning framework to address hallucinations in MLLMs. Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge.
arXiv Detail & Related papers (2024-12-15T09:10:46Z) - Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? [33.702498916775426]
We study the impact of new knowledge on the capability of the fine-tuned model to utilize its pre-existing knowledge.
We demonstrate that large language models struggle to acquire new factual knowledge through fine-tuning.
As the examples with new knowledge are eventually learned, they linearly increase the model's tendency to hallucinate.
arXiv Detail & Related papers (2024-05-09T17:00:22Z) - Does the Generator Mind its Contexts? An Analysis of Generative Model Faithfulness under Context Transfer [42.081311699224585]
The present study introduces the knowledge-augmented generator, which is specifically designed to produce information that remains grounded in contextual knowledge.
Our objective is to explore the existence of hallucinations arising from parametric memory when contextual knowledge undergoes changes.
arXiv Detail & Related papers (2024-02-22T12:26:07Z) - Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z) - Towards Mitigating Hallucination in Large Language Models via Self-Reflection [63.2543947174318]
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks.
This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets.
arXiv Detail & Related papers (2023-10-10T03:05:44Z)