Fugu-MT Paper Translation (Abstract): Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models

Paper overview: Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models

arxiv url: http://arxiv.org/abs/2506.04832v1
Date: Thu, 05 Jun 2025 09:54:04 GMT
Status: Translation complete
Last updated in system: 2025-06-06 21:53:49.645048
Title: Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models
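Title (reference translation): Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models
Authors: Changyue Wang, Weihang Su, Qingyao Ai, Yiqun Liu
Abstract summary: Reasoning traces can be redundant or logically inconsistent, which makes them a new source of hallucination. Existing hallucination detection methods focus mainly on answer-level uncertainty. The paper proposes RACE, a new framework tailored for hallucination detection in LRMs.
Reference score (independently computed attention score): 12.270274049887298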
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Reasoning Models (LRMs) extend large language models with explicit, multi-step reasoning traces to enhance transparency and performance on complex tasks. However, these reasoning traces can be redundant or logically inconsistent, making them a new source of hallucination that is difficult to detect. Existing hallucination detection methods focus primarily on answer-level uncertainty and often fail to detect hallucinations or logical inconsistencies arising from the model's reasoning trace. This oversight is particularly problematic for LRMs, where the explicit thinking trace is not only an important support to the model's decision-making process but also a key source of potential hallucination. To this end, we propose RACE (Reasoning and Answer Consistency Evaluation), a novel framework specifically tailored for hallucination detection in LRMs. RACE operates by extracting essential reasoning steps and computing four diagnostic signals: inter-sample consistency of reasoning traces, entropy-based answer uncertainty, semantic alignment between reasoning and answers, and internal coherence of reasoning. This joint analysis enables fine-grained hallucination detection even when the final answer appears correct. Experiments across datasets and different LLMs demonstrate that RACE outperforms existing hallucination detection baselines, offering a robust and generalizable solution for evaluating LRMs. Our code is available at: https://github.com/bebr2/RACE.
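Abstract (reference translation): Large Reasoning Models (LRMs) extend large language models with explicit, multi-step reasoning traces to improve transparency and performance on complex tasks. However, these reasoning traces can be redundant or logically inconsistent, which makes them a new and hard-to-detect source of hallucination. Existing hallucination detection methods focus mainly on answer-level uncertainty and often fail to detect hallucinations or logical inconsistencies that arise from the model's reasoning trace. This is particularly problematic for LRMs, where the explicit thinking trace is not only an important support for the model's decision-making process but also a key source of potential hallucination. To address this, the authors propose RACE (Reasoning and Answer Consistency Evaluation), a new framework tailored for hallucination detection in LRMs. RACE extracts the essential reasoning steps and computes four diagnostic signals: inter-sample consistency of reasoning traces, entropy-based answer uncertainty, semantic alignment between reasoning and answers, and internal coherence of reasoning. This joint analysis enables fine-grained hallucination detection even when the final answer appears correct. Experiments across datasets and different LLMs show that RACE outperforms existing hallucination detection baselines and offers a robust, generalizable solution for evaluating LRMs. The code is available at https://github.com/bebr2/RACE.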
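Of the four signals named in the abstract, entropy-based answer uncertainty is the simplest to illustrate. The sketch below is not the RACE implementation. It only shows the general idea of sampling several answers to the same question and measuring how spread out they are. The function name `answer_entropy` and the grouping of answers by normalized exact match are assumptions made for this example, since the abstract does not say how answers are compared.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in nats) of the empirical distribution of sampled answers.

    Hypothetical helper for illustration only; not taken from the RACE code.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Assumption: two answers count as the same if they match after
    # trimming whitespace and lowercasing. A real system would likely
    # need a semantic notion of equivalence instead.
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    total = len(normalized)
    # Sum of -p * log(p) over the distinct answers.
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


# Samples that all agree give zero entropy; disagreement raises it.
agreeing = answer_entropy(["Paris", "paris ", "Paris"])
split = answer_entropy(["Paris", "Lyon"])
```

Higher entropy means the sampled answers disagree more, which is commonly read as higher uncertainty. On its own this signal looks only at answers, which is the limitation the abstract attributes to existing methods and the reason RACE combines it with three reasoning-level signals.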

Related papers