Fugu-MT Paper Translation (Abstract): Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment

Paper overview: Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment

arxiv url: http://arxiv.org/abs/2605.15168v1
Date: Thu, 14 May 2026 17:55:27 GMT
Status: Translation complete
Last updated in system: 2026-05-15 21:45:35.003075
Title: Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment
Authors: Sayantan Kumar, Shahriar Noroozizadeh, Juyong Kim, Jeremy C. Weiss
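Abstract summary: This paper proposes a retrieval-augmented multimodal alignment framework to improve the temporal precision of absolute clinical timelines extracted from text. The proposed method formulates timeline reconstruction as a graph-based multistep process.
Reference score (independently computed attention score): 4.42383617731229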
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Reconstructing precise clinical timelines is essential for modeling patient trajectories and forecasting risk in complex, heterogeneous conditions like sepsis. While unstructured clinical narratives offer semantically rich and contextually complete descriptions of a patient's course, they often lack temporal precision and contain ambiguous event timing. Conversely, structured electronic health record (EHR) data provides precise temporal anchors but misses a substantial portion of clinically meaningful events. We introduce a retrieval-augmented multimodal alignment framework that bridges this gap to improve the temporal precision of absolute clinical timelines extracted from text. Our approach formulates timeline reconstruction as a graph-based multistep process: it first extracts central anchor events from narratives to build an initial temporal scaffold, places non-central events relative to this backbone, and then calibrates the timeline using retrieved structured EHR rows as external temporal evidence. Evaluated using instruction-tuned large language models on the i2m4 benchmark spanning MIMIC-III and MIMIC-IV, our multimodal pipeline consistently improves absolute timestamp accuracy (AULTC) and improves temporal concordance across nearly all evaluated models over unimodal text-only reconstruction, without compromising event match rates. Furthermore, our empirical gap analysis reveals that 34.8% of text-derived events are entirely absent from tabular records, demonstrating that aligning these modalities can produce a more temporally faithful and clinically informative reconstruction of patient trajectories than either source alone.
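The three stages named in the abstract (anchor scaffold, relative placement, calibration against retrieved EHR rows) can be pictured with a toy sketch. This is not the authors' implementation: the abstract does not give the graph formulation, the retriever, or the prompts, so everything below is an assumption made for illustration. The names `Event`, `EhrRow`, `token_overlap`, `build_scaffold`, `place_relative`, `calibrate`, and `reconstruct`, the Jaccard token overlap used as a stand-in retriever, the `min_sim` threshold, and all event names and times are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    name: str
    hours: Optional[float] = None   # text-estimated absolute time (hours from admission)
    anchor: Optional[str] = None    # anchor event this event is placed against
    offset: float = 0.0             # hours relative to that anchor


@dataclass
class EhrRow:
    label: str
    hours: float                    # precise timestamp from the structured record


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens; a stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_scaffold(anchors):
    """Stage 1: central anchor events form the initial temporal backbone."""
    return {e.name: e.hours for e in anchors}


def place_relative(events, scaffold):
    """Stage 2: place non-central events at their anchor's time plus an offset."""
    timeline = dict(scaffold)
    for e in events:
        timeline[e.name] = scaffold[e.anchor] + e.offset
    return timeline


def calibrate(timeline, rows, min_sim=0.5):
    """Stage 3: replace a text estimate with the best-matching EHR row's time.

    Events with no sufficiently similar row keep their text-derived time,
    which mirrors the point that some text events have no tabular counterpart.
    """
    calibrated = {}
    for name, t in timeline.items():
        best = max(rows, key=lambda r: token_overlap(name, r.label), default=None)
        if best is not None and token_overlap(name, best.label) >= min_sim:
            calibrated[name] = best.hours
        else:
            calibrated[name] = t
    return calibrated


def reconstruct(anchors, others, rows):
    return calibrate(place_relative(others, build_scaffold(anchors)), rows)


# Made-up example data, not taken from MIMIC or from the paper.
anchors = [Event("icu admission", hours=0.0), Event("intubation", hours=6.0)]
others = [
    Event("fever", anchor="icu admission", offset=-12.0),
    Event("vasopressors started", anchor="intubation", offset=2.0),
]
rows = [
    EhrRow("intubation", 5.25),
    EhrRow("vasopressors started norepinephrine", 9.5),
]
timeline = reconstruct(anchors, others, rows)
```

In this toy run, "intubation" and "vasopressors started" are snapped to their matching rows (5.25 and 9.5 hours), while "fever" and "icu admission" have no matching row and keep their text-derived times. The sketch does not propagate a calibrated anchor time to the events placed against it; a fuller design would have to decide how to handle that.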
