Fugu-MT Paper Translation (Abstract): Are LLMs Ready to Assist Physicians? PhysAssistBench for Interactive Doctor-Patient-EHR Assistance

Paper abstract: Are LLMs Ready to Assist Physicians? PhysAssistBench for Interactive Doctor-Patient-EHR Assistance

arxiv url: http://arxiv.org/abs/2606.18613v2
Date: Thu, 18 Jun 2026 12:19:46 GMT
Status: Translation complete
Last updated in system: 2026-06-19 13:55:51.806333
Title: Are LLMs Ready to Assist Physicians? PhysAssistBench for Interactive Doctor-Patient-EHR Assistance
Title (reference translation): Are LLMs Ready to Assist Physicians? PhysAssistBench for Interactive Doctor-Patient-EHR Assistance
Authors: Tianming Du, Peijie Yu, Sihan Shang, Danli Shi, My Linh Nguyen, Shengbo Gao, Guangyuan Li, Yinghong Yu, Yan Jiang, Qianlong Zhao, Behzad Bozorgtabar, Shaoxiong Ji, Jiazhen Pan, Daniel Rueckert, Jiancheng Yang
Abstract summary: PhysAssistBench is a benchmark for interactive doctor-patient-EHR assistance. Built from real MIMIC-IV cases, it uses a scalable pipeline to construct agentic patients. It provides a curated evaluation set of 1,296 manually reviewed and physician-validated turns.
Reference score (independently computed attention measure): 35.00060523691334
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The most plausible near-term role of medical LLMs is to assist rather than replace physicians, yet current evaluations often test isolated capabilities: clinical knowledge, EHR system interaction, or patient communication. Physician assistance instead requires coordinating these capabilities within the same interaction, where physicians issue underspecified requests, patients describe symptoms ambiguously, and EHR systems demand precise tool use. We introduce PhysAssistBench, a benchmark for interactive doctor-patient-EHR assistance. Built from real MIMIC-IV cases, PhysAssistBench uses a scalable pipeline to construct agentic patients: interactive, record-grounded agents that turn static EHR records into multi-turn clinical scenarios while preserving clinical factuality. PhysAssistBench provides a curated bilingual evaluation set of 1,296 manually reviewed and physician-validated turns. Experiments with leading LLMs show that current models remain unreliable in this setting, which exposes a key bottleneck for clinical LLMs: reliable assistance requires coordination across knowledge, communication, and systems, not isolated gains in any of them.
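Abstract (reference translation): The most plausible near-term role of medical LLMs is to assist physicians rather than replace them, yet current evaluations often test isolated capabilities such as clinical knowledge, EHR system interaction, or patient communication. Physician assistance instead requires coordinating these capabilities within a single interaction, in which physicians issue underspecified requests, patients describe symptoms ambiguously, and EHR systems demand precise tool use. PhysAssistBench is a benchmark for interactive doctor-patient-EHR assistance. Built from real MIMIC-IV cases, it uses a scalable pipeline to construct agentic patients. It provides a curated bilingual evaluation set of 1,296 manually reviewed and physician-validated turns. Reliable assistance requires coordination across knowledge, communication, and systems, not isolated gains in any one of them.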

Related papers