Fugu-MT Paper Translation (Summary): A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

Paper summary: A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

arxiv url: http://arxiv.org/abs/2604.18570v2
Date: Tue, 21 Apr 2026 21:55:35 GMT
Status: Translation complete
Last updated in system: 2026-04-23 15:36:10.440969
Title: A multimodal and temporal foundation model for virtual patient representations at healthcare system scale
Title (reference translation): A multimodal and temporal foundation model for virtual patient representations at healthcare system scale
Authors: Andrew Zhang, Tong Ding, Sophia J. Wagner, Caiwei Tian, Ming Y. Lu, Rowland Pettit, Joshua E. Lewis, Alexandre Misrahi, Dandan Mo, Long Phi Le, Faisal Mahmood
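Abstract summary: We introduce Apollo, a multimodal temporal foundation model trained and evaluated on over three decades of longitudinal hospital records. Apollo learns a unified representation space integrating over 100 thousand unique medical events in our clinical vocabulary. Model predictions align with clinically interpretable multimodal biomarkers.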
Reference score (independently computed attention metric): 36.24986458898116
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Modern medicine generates vast multimodal data across siloed systems, yet no existing model integrates the full breadth and temporal depth of the clinical record into a unified patient representation. We introduce Apollo, a multimodal temporal foundation model trained and evaluated on over three decades of longitudinal hospital records from a major US hospital system, composed of 25 billion records from 7.2 million patients, representing 28 distinct medical modalities and 12 major medical specialties. Apollo learns a unified representation space integrating over 100 thousand unique medical events in our clinical vocabulary as well as images and clinical text. This "atlas of medical concepts" forms a computational substrate for modeling entire patient care journeys comprised of sequences of structured and unstructured events, which are compressed by Apollo into virtual patient representations. To assess the potential of these whole-patient representations, we created 322 prognosis and retrieval tasks from a held-out test set of 1.4 million patients. We demonstrate the generalized clinical forecasting potential of Apollo embeddings, including predicting new disease onset risk up to five years in advance (95 tasks), disease progression (78 tasks), treatment response (59 tasks), risk of treatment-related adverse events (17 tasks), and hospital operations endpoints (12 tasks). Using feature attribution techniques, we show that model predictions align with clinically-interpretable multimodal biomarkers. We evaluate semantic similarity search on 61 retrieval tasks, and moreover demonstrate the potential of Apollo as a multimodal medical search engine using text and image queries. Together, these modeling capabilities establish the foundation for computable medicine, where the full context of patient care becomes accessible to computational reasoning.
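Abstract (reference translation): Modern medicine generates vast amounts of multimodal data across siloed systems, but no existing model integrates the full breadth and temporal depth of the clinical record into a unified patient representation. We introduce Apollo, a multimodal temporal foundation model trained and evaluated on more than three decades of longitudinal hospital records from a major US hospital system. Apollo learns a unified representation space that integrates over 100 thousand unique medical events in our clinical vocabulary, as well as images and clinical text. This "atlas of medical concepts" forms a computational substrate for modeling entire patient care journeys, which consist of sequences of structured and unstructured events and which Apollo compresses into virtual patient representations. To assess the potential of these whole-patient representations, we created 322 prognosis and retrieval tasks from a held-out test set of 1.4 million patients. We demonstrate the general clinical forecasting potential of Apollo embeddings, including prediction of new disease onset risk up to five years in advance (95 tasks), disease progression (78 tasks), treatment response (59 tasks), risk of treatment-related adverse events (17 tasks), and hospital operations endpoints (12 tasks). Using feature attribution techniques, we show that model predictions align with clinically interpretable multimodal biomarkers. We evaluate semantic similarity search on 61 retrieval tasks and demonstrate Apollo's potential as a multimodal medical search engine driven by text and image queries. Together, these modeling capabilities establish the foundation for computable medicine, in which the full context of patient care becomes accessible to computational reasoning.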

Related papers