Fugu-MT Paper Translation (Summary): Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning

Paper summary: Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning

arxiv url: http://arxiv.org/abs/2509.18316v1
Date: Mon, 22 Sep 2025 18:39:09 GMT
Status: Translation complete
Last updated in system: 2025-09-24 20:41:27.529076
Title: Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning
Title (reference translation): Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning
Authors: Saksham Khatwani, He Cheng, Majid Afshar, Dmitriy Dligach, Yanjun Gao
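Abstract summary: Large language models (LLMs) show promise for diagnostic reasoning but often lack reliable, knowledge-grounded inference. This work treats the LLM as a reward model of KG reasoning paths, in which the model learns to judge whether a candidate path leads to the correct diagnosis for a given patient input. It provides the first systematic assessment of "reward model style" reasoning over clinical KGs.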
Reference score (attention score computed independently by this site): 8.35131510062609
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language models (LLMs) show promise for diagnostic reasoning but often lack reliable, knowledge-grounded inference. Knowledge graphs (KGs), such as the Unified Medical Language System (UMLS), offer structured biomedical knowledge that can support trustworthy reasoning. Prior approaches typically integrate KGs via retrieval-augmented generation or fine-tuning, inserting KG content into prompts rather than enabling structured reasoning. We explore an alternative paradigm: treating the LLM as a reward model of KG reasoning paths, where the model learns to judge whether a candidate path leads to the correct diagnosis for a given patient input. This approach is inspired by recent work that leverages reward training to enhance model reasoning abilities, and is grounded in computational theory, which suggests that verifying a solution is often easier than generating one from scratch. It also parallels physicians' diagnostic assessment, where they judge which sequences of findings and intermediate conditions most plausibly support a diagnosis. First, we systematically evaluate five task formulations for knowledge path judging and eight training paradigms. Second, we test whether the path-judging abilities generalize to downstream diagnostic tasks, including diagnosis summarization and medical question answering. Experiments with three open-source instruct-tuned LLMs reveal both promise and brittleness: while specific reward optimization and distillation lead to strong path-judging performance, the transferability to downstream tasks remains weak. Our findings provide the first systematic assessment of "reward model style" reasoning over clinical KGs, offering insights into how structured, reward-based supervision influences diagnostic reasoning in GenAI systems for healthcare.
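Abstract (reference translation): Large language models (LLMs) show promise for diagnostic reasoning but often lack reliable, knowledge-grounded inference. Knowledge graphs (KGs) such as the Unified Medical Language System (UMLS) offer structured biomedical knowledge that can support trustworthy reasoning. Prior approaches typically integrate KGs through retrieval-augmented generation or fine-tuning, inserting KG content into prompts rather than enabling structured reasoning. We treat the LLM as a reward model of KG reasoning paths, in which the model learns to judge whether a candidate path leads to the correct diagnosis for a given patient input. This approach is inspired by recent work that uses reward training to enhance model reasoning abilities, and is grounded in computational theory. It also parallels physicians' diagnostic assessment, in which they judge which sequences of findings and intermediate conditions most plausibly support a diagnosis. First, we systematically evaluate five task formulations for knowledge path judging and eight training paradigms. Second, we test whether path-judging abilities generalize to downstream diagnostic tasks, including diagnosis summarization and medical question answering. Specific reward optimization and distillation lead to strong path-judging performance, but transferability to downstream tasks remains weak. Our findings provide the first systematic assessment of "reward model style" reasoning over clinical KGs, offering insights into how structured, reward-based supervision influences diagnostic reasoning in GenAI systems for healthcare.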
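
The path-judging setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `PathExample`, `render_path`, `toy_judge`, and `path_judging_accuracy` are hypothetical, the relation labels are invented rather than taken from UMLS, and the string-matching judge is only a stand-in for an LLM reward model that would return a score for a (patient input, path) pair.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One KG edge: (head concept, relation, tail concept).
# Relation names here are illustrative, not actual UMLS relations.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class PathExample:
    patient_input: str   # free-text patient description
    path: List[Triple]   # candidate KG reasoning path
    label: int           # 1 if the path leads to the correct diagnosis, else 0


def render_path(path: List[Triple]) -> str:
    """Serialize a path as 'a --[rel]--> b --[rel]--> c' for use in a prompt."""
    if not path:
        return ""
    parts = [path[0][0]]
    for _head, relation, tail in path:
        parts.append(f"--[{relation}]--> {tail}")
    return " ".join(parts)


def toy_judge(patient_input: str, path: List[Triple]) -> float:
    """Stand-in for an LLM reward model (assumption, not the paper's method).

    Returns 1.0 if the path's starting concept is mentioned in the
    patient input, otherwise 0.0.
    """
    if not path:
        return 0.0
    return 1.0 if path[0][0].lower() in patient_input.lower() else 0.0


def path_judging_accuracy(
    judge: Callable[[str, List[Triple]], float],
    examples: List[PathExample],
    threshold: float = 0.5,
) -> float:
    """Fraction of examples where the thresholded score matches the label."""
    if not examples:
        return 0.0
    correct = sum(
        int(judge(ex.patient_input, ex.path) >= threshold) == ex.label
        for ex in examples
    )
    return correct / len(examples)


NOTE = "Patient reports fever and productive cough."
EXAMPLES = [
    PathExample(NOTE, [("fever", "may_be_symptom_of", "pneumonia")], 1),
    PathExample(NOTE, [("rash", "may_be_symptom_of", "measles")], 0),
]

if __name__ == "__main__":
    for ex in EXAMPLES:
        print(render_path(ex.path), "->", toy_judge(ex.patient_input, ex.path))
    print("accuracy:", path_judging_accuracy(toy_judge, EXAMPLES))
```

Framing the judge as a plain callable keeps the evaluation loop independent of how the score is produced, so a trained reward model could be substituted for `toy_judge` without changing the accuracy computation.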

Related papers

End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning [52.12425911708585]
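Deep-DxSearch is an agentic RAG system trained end-to-end with reinforcement learning (RL). For Deep-DxSearch, the authors construct a large-scale medical retrieval corpus containing patient records and reliable medical knowledge sources. Experiments show that the end-to-end RL training framework consistently outperforms prompt-engineering and training-free RAG approaches.
Paper reference translation (metadata) (2025-08-21T17:42:47Z)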
Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
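Large language models (LLMs) in medicine have achieved impressive capabilities, but a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. It proposes a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
Paper reference translation (metadata) (2025-08-01T14:41:31Z)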
KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs [39.47350988195002]
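Large language models (LLMs) show promise in leveraging language abilities and biomedical knowledge for diagnosis prediction. The authors propose KERAP, a knowledge graph (KG)-enhanced reasoning approach that improves LLM-based diagnosis prediction through a multi-agent architecture. The framework consists of a linkage agent for mapping, a retrieval agent for structured knowledge extraction, and a prediction agent that iteratively refines diagnosis predictions.
Paper reference translation (metadata) (2025-07-03T16:35:11Z)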
Bridging Stepwise Lab-Informed Pretraining and Knowledge-Guided Learning for Diagnostic Reasoning [20.369746122143063]
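This paper proposes a dual-source framework that combines two complementary sources of information. For external knowledge, the authors construct a diagnosis knowledge graph (KG) that encodes hierarchical and semantic relations enriched by large language models. They also propose a lab-informed proxy task that guides the model to follow a stepwise reasoning process based on lab test signals.
Paper reference translation (metadata) (2024-10-25T20:25:22Z)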
Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
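The authors introduce a novel task called Medical Report Grounding (MRG). MRG aims to identify diagnostic phrases and their corresponding grounding boxes directly from medical reports in an end-to-end manner. They propose uMedGround, a robust and reliable framework that uses a multimodal large language model to predict diagnostic phrases.
Paper reference translation (metadata) (2024-04-10T07:41:35Z)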
Towards the Identifiability and Explainability for Personalized Learner Modeling: An Inductive Paradigm [36.60917255464867]
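This paper proposes an identifiable cognitive diagnosis framework (ID-CDF) based on a novel response-proficiency-response paradigm inspired by encoder-decoder models. The results suggest that ID-CDF can effectively address these issues without compromising diagnostic accuracy.
Paper reference translation (metadata) (2023-09-01T07:18:02Z)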
Leveraging Medical Knowledge Graphs Into Large Language Models for Diagnosis Prediction: Design and Application Study [6.10474409373543]
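The authors propose an innovative approach for enhancing the proficiency of large language models (LLMs) in automated diagnosis. The KG is derived from the National Library of Medicine's Unified Medical Language System (UMLS). The approach offers an explainable diagnostic pathway, bringing AI-augmented diagnostic decision support systems closer to realization.
Paper reference translation (metadata) (2023-08-28T06:05:18Z)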
NeuralSympCheck: A Symptom Checking and Disease Diagnostic Neural Model with Logic Regularization [59.15047491202254]
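Symptom checking systems query patients about their symptoms and provide a fast and affordable medical assessment. This paper proposes a new approach based on supervised learning of neural models with logic regularization. The results show that the approach outperforms the best existing methods in diagnostic accuracy when the number of diagnoses and symptoms is large.
Paper reference translation (metadata) (2022-06-02T07:57:17Z)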

The related papers list is generated automatically from the titles and abstracts of papers on this site.

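This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use (including all information and translations).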