Fugu-MT Paper Translation (Summary): Hallucination Detection-Guided Preference Optimization for Clinical Summarization

Paper summary: Hallucination Detection-Guided Preference Optimization for Clinical Summarization

arxiv url: http://arxiv.org/abs/2605.28910v2
Date: Mon, 01 Jun 2026 13:38:20 GMT
Status: translation complete
Last updated in system: 2026-06-02 18:24:16.640603
Title: Hallucination Detection-Guided Preference Optimization for Clinical Summarization
Title (reference translation): Preference Optimization for Clinical Summarization Guided by Hallucination Detection
Authors: Shamanth Kuthpadi Seethakantha, Dung Ngoc Thai, Vara Prasad Gudi, Simran Tiwari, Rami Matar, Avijit Mitra, Wenlong Zhao, Andrew McCallum, Wael Salloum
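Abstract summary: Large language models (LLMs) show promise on summarization tasks but often produce hallucinations. The paper introduces \itermodelfull (\itermodel), an inference-time method that uses hallucination detectors to guide iterative revisions. Building on this, it proposes \itermodel for Preference Learning (\model), which converts detector-guided refinement trajectories into preference pairs for model finetuning.
Reference score (independently computed attention measure): 16.75280184390529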
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) have shown promise on summarization tasks, but they often produce hallucinations: unsupported or incorrect statements that limit their reliability in specialized healthcare applications. We introduce \itermodelfull (\itermodel), an inference-time method that leverages hallucination detectors to guide iterative summary revisions toward factual corrections. Building on this, we propose \itermodel for Preference Learning (\model), which converts detector-guided refinement trajectories into preference pairs for model finetuning. Extensive experiments show that our methods substantially reduce hallucinations for Llama and Gemma models when summarizing real-world clinical notes from MIMIC-IV. For example, with Llama-3.1-8B-Instruct, \itermodel reduces hallucinations by 24% and \model reduces them by 48%. Importantly, both methods preserve summary fluency, coherence, and relevance according to human expert and LLM-Jury evaluations. Together, these results demonstrate that detection-informed refinement and preference learning offer an automated solution for improving factual faithfulness in clinical summarization.
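Abstract (reference translation): Large language models (LLMs) are promising for summarization tasks but frequently produce hallucinations. This paper introduces \itermodelfull (\itermodel), an inference-time method that uses hallucination detectors to steer iterative revisions toward factual corrections. Building on this, it proposes \itermodel for Preference Learning (\model), which converts detector-guided refinement trajectories into preference pairs for model finetuning. Extensive experiments show that the methods markedly reduce hallucinations for Llama and Gemma models when summarizing real-world clinical notes from MIMIC-IV. For example, with Llama-3.1-8B-Instruct, \itermodel reduces hallucinations by 24% and \model by 48%. Importantly, both methods preserve summary fluency, coherence, and relevance according to human expert and LLM-Jury evaluations. These results indicate that detection-informed refinement and preference learning provide an automated way to improve factual faithfulness in clinical summarization.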
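The two ideas in the abstract, detector-guided iterative revision and turning the resulting trajectory into preference pairs, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names `refine` and `to_preference_pairs`, the toy substring-based detector and sentence-dropping reviser, the iteration cap, and the pairing rule (a later summary is preferred over an earlier one only when the detector flags strictly fewer spans in it). In practice the detector and reviser would be learned models.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not taken from the paper):
# a detector returns the flagged spans of a summary given its source,
# a reviser returns a revised summary given the flagged spans.
Detector = Callable[[str, str], List[str]]
Reviser = Callable[[str, str, List[str]], str]


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter, adequate only for this toy example."""
    return [s.strip() for s in text.split(".") if s.strip()]


def toy_detect(source: str, summary: str) -> List[str]:
    """Toy detector: flag summary sentences that do not appear in the source."""
    return [s for s in split_sentences(summary) if s.lower() not in source.lower()]


def toy_revise(source: str, summary: str, flagged: List[str]) -> str:
    """Toy reviser: drop the first flagged sentence from the summary."""
    kept = [s for s in split_sentences(summary) if s != flagged[0]]
    return ". ".join(kept) + "." if kept else ""


def refine(source: str, draft: str, detect: Detector, revise: Reviser,
           max_iters: int = 3) -> List[str]:
    """Revise until nothing is flagged or max_iters is hit; return all versions."""
    trajectory = [draft]
    for _ in range(max_iters):
        flagged = detect(source, trajectory[-1])
        if not flagged:
            break
        trajectory.append(revise(source, trajectory[-1], flagged))
    return trajectory


def to_preference_pairs(source: str, trajectory: List[str],
                        detect: Detector) -> List[Dict[str, str]]:
    """Assumed pairing rule: a later version is 'chosen' over an earlier one
    only if the detector flags strictly fewer spans in it."""
    counts = [len(detect(source, s)) for s in trajectory]
    pairs = []
    for i in range(len(trajectory)):
        for j in range(i + 1, len(trajectory)):
            if counts[j] < counts[i]:
                pairs.append({"prompt": source,
                              "chosen": trajectory[j],
                              "rejected": trajectory[i]})
    return pairs


# Made-up example text, not drawn from any dataset.
source = "Patient admitted with chest pain. ECG was normal. Discharged on aspirin."
draft = "Patient admitted with chest pain. Patient has diabetes. Discharged on warfarin."

trajectory = refine(source, draft, toy_detect, toy_revise)
pairs = to_preference_pairs(source, trajectory, toy_detect)
for pair in pairs:
    print(pair["rejected"], "->", pair["chosen"])
```

The `prompt`/`chosen`/`rejected` layout is a common convention for preference-optimization datasets; whether the paper uses this layout or this pairing rule is not stated in the abstract.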

Related papers