Fugu-MT Paper Translation (Abstract): Only Say What You Know: Calibration-Aware Generation for Long-Form Factuality

Paper abstract: Only Say What You Know: Calibration-Aware Generation for Long-Form Factuality

arxiv url: http://arxiv.org/abs/2605.01749v1
Date: Sun, 03 May 2026 07:07:09 GMT
Status: Translation complete
System update date: 2026-05-05 20:33:49.921515
Title: Only Say What You Know: Calibration-Aware Generation for Long-Form Factuality
Title (reference translation): Calibration-Aware Generation
Authors: Wen Luo, Guangyue Peng, Liang Wang, Nan Yang, Wei Li, Yuhan Song, Shaohang Wei, Feifan Song, Furu Wei, Houfeng Wang
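Abstract summary: Large reasoning models achieve strong performance on complex tasks but remain prone to hallucinations. The paper proposes an Exploration-Commitment Decoupling paradigm that disentangles knowledge exploration from final commitment. Calibration-Aware Generation (CAG) is a framework that equips models with end-to-end, calibration-aware generation capabilities.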
Reference score (independently computed attention score): 64.83776302639727
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Reasoning Models achieve strong performance on complex tasks but remain prone to hallucinations, particularly in long-form generation where errors compound across reasoning steps. Existing approaches to improving factuality, including abstention and factuality-driven optimization, follow a coupled exploration-commitment paradigm, in which intermediate reasoning is unconditionally propagated to the final output, limiting fine-grained control over information selection and integration. In this paper, we propose an Exploration-Commitment Decoupling paradigm that disentangles knowledge exploration from final commitment, enabling models to explore with awareness while answering cautiously. We instantiate the paradigm with Calibration-Aware Generation (CAG), a framework that equips models with end-to-end, calibration-aware generation capabilities, by augmenting intermediate reasoning with calibrated reliability estimates and prioritizing reliable content in final outputs. Across five long-form factuality benchmarks and multiple model families, CAG improves factuality by up to 13%, while reducing decoding time by up to 37%. Overall, our work highlights decoupling as a principled approach for more reliable long-form generation, offering directions for trustworthy and self-aware generative systems.
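Abstract (reference translation): Large reasoning models achieve strong performance on complex tasks but tend to hallucinate, particularly in long-form generation, where errors compound across reasoning steps. Existing approaches to improving factuality follow a coupled exploration-commitment paradigm, in which intermediate reasoning is propagated unconditionally to the final output, limiting fine-grained control over information selection and integration. This paper proposes an Exploration-Commitment Decoupling paradigm that separates knowledge exploration from final commitment, so that models can explore with awareness while answering cautiously. The paradigm is instantiated with Calibration-Aware Generation (CAG), a framework that provides end-to-end, calibration-aware generation capabilities. Across five long-form factuality benchmarks and multiple model families, CAG improves factuality while reducing decoding time by up to 37%. Overall, the work highlights decoupling as a principled approach to more reliable long-form generation and offers directions for trustworthy, self-aware generative systems.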
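
As a rough illustration of the decoupling idea described in the abstract (not the authors' implementation), the Python sketch below separates an exploration phase, which produces candidate claims with reliability estimates, from a commitment phase that keeps only the reliable ones. The `Claim` type, the `commit` function, the 0.7 threshold, and the example scores are all hypothetical; the abstract does not specify how CAG computes or uses its estimates.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    confidence: float  # hypothetical reliability estimate in [0, 1]


def commit(claims, threshold=0.7):
    """Commitment phase: keep claims at or above the threshold, most reliable first.

    The threshold value is an assumption made for this illustration only.
    """
    kept = [c for c in claims if c.confidence >= threshold]
    return sorted(kept, key=lambda c: c.confidence, reverse=True)


# Exploration phase (mocked): candidate claims with made-up reliability scores.
explored = [
    Claim("claim A", 0.95),
    Claim("claim B", 0.40),
    Claim("claim C", 0.80),
]

final = commit(explored)
print([c.text for c in final])
```

In this toy setup the low-confidence claim is explored but never committed to the final answer, which is the kind of selective propagation the abstract contrasts with the coupled paradigm.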

Related papers