Fugu-MT Paper Translation (Abstract): Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation

Paper overview: Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation

arxiv url: http://arxiv.org/abs/2511.02626v1
Date: Tue, 04 Nov 2025 14:55:24 GMT
Status: Translation complete
Last updated in system: 2025-11-05 18:47:06.08297
Title: Understanding New-Knowledge-Induced Factual Hallucinations in LLMs: Analysis, Solution, and Interpretation
Title (reference translation): Understanding Factual Hallucinations Induced by New Knowledge in LLMs: Analysis, Solution, and Interpretation
Authors: Renfei Dang, Peng Hu, Changjiang Gao, Shujian Huang
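Abstract summary: Previous studies show that introducing new knowledge during fine-tuning of large language models (LLMs) can lead to erroneous output when the model is tested on known information. We conduct a fine-grained analysis across multiple knowledge types and two task types: knowledge question answering (QA) and knowledge reasoning. When fine-tuned on a dataset in which a specific knowledge type consists entirely of new knowledge, LLMs show a significantly increased tendency to hallucinate. We propose KnownPatch, which patches in a small number of known-knowledge samples in the later stages of training and effectively alleviates new-knowledge-induced hallucinations.
Reference score (attention metric computed independently by this site): 41.83870063693278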
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Previous studies show that introducing new knowledge during large language model (LLM) fine-tuning can lead to the generation of erroneous output when tested on known information, thereby triggering factual hallucinations. However, existing studies have not deeply investigated the specific manifestations and underlying mechanisms of these hallucinations. Our work addresses this gap by designing a controlled dataset, Biography-Reasoning, and conducting a fine-grained analysis across multiple knowledge types and two task types, including knowledge question answering (QA) and knowledge reasoning tasks. We find that when fine-tuned on a dataset in which a specific knowledge type consists entirely of new knowledge, LLMs exhibit significantly increased hallucination tendencies. This suggests that the high unfamiliarity of a particular knowledge type, rather than the overall proportion of new knowledge, is a stronger driver of hallucinations, and these tendencies can even affect other knowledge types in QA tasks. To mitigate such factual hallucinations, we propose KnownPatch, which patches a small number of known knowledge samples in the later stages of training, effectively alleviating new-knowledge-induced hallucinations. Through attention analysis, we find that learning new knowledge reduces the model's attention to key entities in the question, thus causing excessive focus on the surrounding context, which may increase the risk of hallucination. Moreover, the attention pattern can propagate to similar contexts, facilitating the spread of hallucinations to textually similar questions. Our method effectively mitigates the disruption that learning new knowledge causes to the model's attention on key entities, and this is accompanied by improved performance.
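Abstract (reference translation): Previous studies have shown that introducing new knowledge while fine-tuning large language models (LLMs) can produce erroneous output when the model is tested on known information, which can trigger factual hallucinations. Existing studies, however, have not examined in depth how these hallucinations manifest or what mechanisms underlie them. Our work addresses this gap by designing a controlled dataset, Biography-Reasoning, and by conducting a fine-grained analysis across multiple knowledge types and two task types: knowledge question answering (QA) and knowledge reasoning. We find that when an LLM is fine-tuned on a dataset in which a specific knowledge type consists entirely of new knowledge, its tendency to hallucinate increases significantly. This suggests that high unfamiliarity within a particular knowledge type, rather than the overall proportion of new knowledge, is the stronger driver of hallucinations, and that these tendencies can even affect other knowledge types in QA tasks. To mitigate such factual hallucinations, we propose KnownPatch, which patches in a small number of known-knowledge samples in the later stages of training and effectively alleviates new-knowledge-induced hallucinations. Through attention analysis, we find that learning new knowledge reduces the model's attention to the key entities in a question and causes excessive focus on the surrounding context, which may increase the risk of hallucination. Moreover, this attention pattern can propagate to similar contexts, spreading hallucinations to textually similar questions. Our method effectively mitigates the disruption that learning new knowledge causes to the model's attention on key entities, and this is accompanied by improved performance.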
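
Illustrative sketch (not from the paper): the abstract says only that KnownPatch patches a small number of known-knowledge samples into the later stages of training; it does not state how samples are chosen or scheduled. The Python below is one possible reading, under stated assumptions: training data is an ordered list of samples, the "later stage" begins at a fixed fraction of the new-knowledge data, and the patch size is a fixed fraction of that data. The function name `build_schedule` and the parameters `patch_start` and `patch_fraction` are hypothetical and chosen here for illustration.

```python
import random


def build_schedule(new_samples, known_samples,
                   patch_start=0.8, patch_fraction=0.1, seed=0):
    """Order samples so known-knowledge items appear only late in training.

    Assumptions (hypothetical, not taken from the paper):
      - patch_start: share of the new-knowledge data seen before patching begins
      - patch_fraction: number of known samples, as a share of the new data
    """
    rng = random.Random(seed)

    # Shuffle a copy of the new-knowledge samples.
    new_samples = list(new_samples)
    rng.shuffle(new_samples)

    # Split into an early phase (new knowledge only) and a late phase.
    cut = round(len(new_samples) * patch_start)
    early, late = new_samples[:cut], new_samples[cut:]

    # Draw a small set of known-knowledge samples for the late phase.
    known_samples = list(known_samples)
    n_patch = min(len(known_samples),
                  max(1, round(len(new_samples) * patch_fraction)))
    patch = rng.sample(known_samples, n_patch)

    # Mix the patch into the late phase only.
    late = late + patch
    rng.shuffle(late)
    return early + late


if __name__ == "__main__":
    new = [("new", i) for i in range(100)]
    known = [("known", i) for i in range(50)]
    schedule = build_schedule(new, known)
    late_known = sum(1 for kind, _ in schedule[80:] if kind == "known")
    print(len(schedule), late_known)
```

With the defaults above, the first 80% of the new-knowledge samples are seen without any known-knowledge samples, and the patch is confined to the final part of the schedule. Whether the paper uses a comparable split point or patch size cannot be determined from the abstract.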

Related papers