Fugu-MT Paper Translation (Abstract): DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

Paper overview: DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

arxiv url: http://arxiv.org/abs/2604.15456v1
Date: Thu, 16 Apr 2026 18:17:24 GMT
Status: Translation complete
Last updated in system: 2026-04-20 22:00:19.618844
Title: DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI
Title (reference translation): DeepER-Med: Advancing deep evidence-based research in medicine through agentic AI
Authors: Zhizheng Wang, Chih-Hsuan Wei, Joey Chan, Robert Leaman, Chi-Ping Day, Chuan Wu, Mark A Knepper, Antolin Serrano Farias, Jordina Rincon-Torroella, Hasan Slika, Betty Tyler, Ryan Huu-Tuan Nguyen, Asmita Indurkar, Mélanie Hébert, Shubo Tian, Lauren He, Noor Naffakh, Aseem Aseem, Nicholas Wan, Emily Y Chew, Tiarnan D L Keenan, Zhiyong Lu
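Abstract summary: We introduce DeepER-Med, a framework for deep medical research built on an agentic AI system. DeepER-Med frames deep medical research as an explicit and inspectable workflow of evidence-based generation. It consistently outperforms widely used production-grade platforms across multiple criteria. In human clinician assessment, DeepER-Med's conclusions align with clinical recommendations in seven of the eight cases.
Reference score (attention level computed independently by the site): 10.310030966524161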
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Trustworthiness and transparency are essential for the clinical adoption of artificial intelligence (AI) in healthcare and biomedical research. Recent deep research systems aim to accelerate evidence-grounded scientific discovery by integrating AI agents with multi-hop information retrieval, reasoning, and synthesis. However, most existing systems lack explicit and inspectable criteria for evidence appraisal, creating a risk of compounding errors and making it difficult for researchers and clinicians to assess the reliability of their outputs. In parallel, current benchmarking approaches rarely evaluate performance on complex, real-world medical questions. Here, we introduce DeepER-Med, a Deep Evidence-based Research framework for Medicine with an agentic AI system. DeepER-Med frames deep medical research as an explicit and inspectable workflow of evidence-based generation, consisting of three modules: research planning, agentic collaboration, and evidence synthesis. To support realistic evaluation, we also present DeepER-MedQA, an evidence-grounded dataset comprising 100 expert-level research questions derived from authentic medical research scenarios and curated by a multidisciplinary panel of 11 biomedical experts. Expert manual evaluation demonstrates that DeepER-Med consistently outperforms widely used production-grade platforms across multiple criteria, including the generation of novel scientific insights. We further demonstrate the practical utility of DeepER-Med through eight real-world clinical cases. Human clinician assessment indicates that DeepER-Med's conclusions align with clinical recommendations in seven cases, highlighting its potential for medical research and decision support.
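Abstract (reference translation): Trustworthiness and transparency are essential for the clinical adoption of artificial intelligence (AI) in healthcare and biomedical research. Recent deep research systems aim to accelerate evidence-based scientific discovery by integrating AI agents with multi-hop information retrieval, reasoning, and synthesis. However, most existing systems lack explicit, inspectable criteria for evidence appraisal, which risks compounding errors and makes it hard for researchers and clinicians to judge the reliability of the outputs. In parallel, current benchmarking approaches rarely evaluate performance on complex, real-world medical questions. This paper introduces DeepER-Med, a deep evidence-based research framework for medicine built on an agentic AI system. DeepER-Med treats deep medical research as an explicit, inspectable workflow of evidence-based generation made up of three modules: research planning, agentic collaboration, and evidence synthesis. To support realistic evaluation, the authors also present DeepER-MedQA, an evidence-grounded dataset of 100 expert-level research questions derived from authentic medical research scenarios and curated by a multidisciplinary panel of 11 biomedical experts. In manual evaluation by experts, DeepER-Med consistently outperforms widely used production-grade platforms across multiple criteria, including the generation of novel scientific insights. The practical utility of DeepER-Med is further demonstrated on eight real-world clinical cases. In assessment by human clinicians, DeepER-Med's conclusions align with clinical recommendations in seven of those cases, highlighting its potential for medical research and decision support.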

Related papers