Fugu-MT Paper Translation (Abstract): D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs

Paper overview: D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs

arxiv url: http://arxiv.org/abs/2509.11569v1
Date: Mon, 15 Sep 2025 04:28:38 GMT
Status: Translation complete
System update date: 2025-09-16 17:26:23.144879
Title: D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis in LLMs
Title (reference translation): D$^2$HScore: Reasoning-Aware Hallucination Detection via Semantic Breadth and Depth Analysis of LLMs
Authors: Yue Ding, Xiaofang Zhu, Tianze Xia, Junfei Wu, Xinlong Chen, Qiang Liu, Liang Wang
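Abstract summary: This work revisits hallucination detection from the perspective of model architecture and generation dynamics. It proposes D$^2$HScore (Dispersion and Drift-based Hallucination Score). Experiments on five open-source LLMs and five widely used benchmarks show that D$^2$HScore consistently outperforms existing training-free baselines.
Reference score (independently computed interest level): 15.665202830841046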
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although large language models (LLMs) have achieved remarkable success, their practical application is often hindered by the generation of non-factual content, known as "hallucination". Ensuring the reliability of LLM outputs is a critical challenge, particularly in high-stakes domains such as finance, security, and healthcare. In this work, we revisit hallucination detection from the perspective of model architecture and generation dynamics. Leveraging the multi-layer structure and autoregressive decoding process of LLMs, we decompose hallucination signals into two complementary dimensions: the semantic breadth of token representations within each layer, and the semantic depth of core concepts as they evolve across layers. Based on this insight, we propose D$^2$HScore (Dispersion and Drift-based Hallucination Score), a training-free and label-free framework that jointly measures: (1) Intra-Layer Dispersion, which quantifies the semantic diversity of token representations within each layer; and (2) Inter-Layer Drift, which tracks the progressive transformation of key token representations across layers. To ensure that drift reflects the evolution of meaningful semantics rather than noisy or redundant tokens, we guide token selection using attention signals. By capturing both the horizontal and vertical dynamics of representation during inference, D$^2$HScore provides an interpretable and lightweight proxy for hallucination detection. Extensive experiments across five open-source LLMs and five widely used benchmarks demonstrate that D$^2$HScore consistently outperforms existing training-free baselines.
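Abstract (reference translation): Although large language models (LLMs) have achieved remarkable success, their practical application is often hindered by the generation of non-factual content, called "hallucination". Ensuring the reliability of LLM outputs is an important challenge, particularly in high-stakes domains such as finance, security, and healthcare. In this work, we revisit hallucination detection from the perspective of model architecture and generation dynamics. Leveraging the multi-layer structure and autoregressive decoding process of LLMs, we decompose hallucination signals into two complementary dimensions. Based on this insight, we propose a training-free and label-free framework that (1) quantifies the semantic diversity of token representations within each layer and (2) tracks the progressive transformation of key token representations across layers. To ensure that drift reflects the evolution of meaningful semantics rather than noisy or redundant tokens, we guide token selection using attention signals. By capturing both the horizontal and vertical dynamics of representation during inference, D$^2$HScore provides an interpretable and lightweight proxy for hallucination detection. Extensive experiments with five open-source LLMs and five widely used benchmarks show that D$^2$HScore consistently outperforms existing training-free baselines.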
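
The two measurements named in the abstract, intra-layer dispersion and inter-layer drift over attention-selected tokens, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: mean pairwise cosine distance as the dispersion measure, cosine distance between consecutive layers as the drift measure, top-k attention weight as the token selection rule, and all function names. It is not the paper's implementation, and it returns the two features separately because the abstract does not say how they are combined.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return 1.0 - dot / (nu * nv)

def intra_layer_dispersion(layer):
    """Assumed measure: mean pairwise cosine distance among one layer's token vectors."""
    n = len(layer)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += cosine_distance(layer[i], layer[j])
            pairs += 1
    return total / pairs

def select_key_tokens(attention, k):
    """Assumed rule: indices of the k tokens with the highest attention weight."""
    order = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)
    return sorted(order[:k])

def inter_layer_drift(hidden, key_idx):
    """Assumed measure: mean cosine distance between consecutive layers for key tokens."""
    if len(hidden) < 2 or not key_idx:
        return 0.0
    total, count = 0.0, 0
    for lower, upper in zip(hidden, hidden[1:]):
        for t in key_idx:
            total += cosine_distance(lower[t], upper[t])
            count += 1
    return total / count

def dispersion_drift_features(hidden, attention, k=2):
    """Return (dispersion averaged over layers, drift of attention-selected tokens)."""
    dispersion = sum(intra_layer_dispersion(layer) for layer in hidden) / len(hidden)
    drift = inter_layer_drift(hidden, select_key_tokens(attention, k))
    return dispersion, drift

# Toy input (hypothetical): hidden[layer][token] is a 2-d vector, 2 layers x 3 tokens.
hidden = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
]
attention = [0.1, 0.2, 0.7]  # one weight per token (hypothetical values)
dispersion, drift = dispersion_drift_features(hidden, attention, k=2)
print(round(dispersion, 3), round(drift, 3))  # prints 0.667 0.5
```

With a real model, `hidden` and `attention` would come from the per-layer hidden states and attention weights recorded during inference; how the paper aggregates attention into per-token weights is not stated in the abstract.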

Related papers