Related papers: TRACE: Temporal Reasoning via Agentic Context Evolution for Streaming Electronic Health Records (EHRs)

URL: http://arxiv.org/abs/2602.12833v1
Date: Fri, 13 Feb 2026 11:39:19 GMT
Title: TRACE: Temporal Reasoning via Agentic Context Evolution for Streaming Electronic Health Records (EHRs)
Authors: Zhan Qu, Michael Färber
Abstract summary: Large Language Models (LLMs) encode extensive medical knowledge but struggle to apply it reliably to longitudinal patient trajectories. We introduce TRACE, a framework that enables temporal clinical reasoning with frozen LLMs. TRACE is evaluated on longitudinal clinical event streams from MIMIC-IV.
Score: 7.2159153945746795
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Models (LLMs) encode extensive medical knowledge but struggle to apply it reliably to longitudinal patient trajectories, where evolving clinical states, irregular timing, and heterogeneous events degrade performance over time. Existing adaptation strategies rely on fine-tuning or retrieval-based augmentation, which introduce computational overhead, privacy constraints, or instability under long contexts. We introduce TRACE (Temporal Reasoning via Agentic Context Evolution), a framework that enables temporal clinical reasoning with frozen LLMs by explicitly structuring and maintaining context rather than extending context windows or updating parameters. TRACE operates over a dual-memory architecture consisting of a static Global Protocol encoding institutional clinical rules and a dynamic Individual Protocol tracking patient-specific state. Four agentic components, Router, Reasoner, Auditor, and Steward, coordinate over this structured memory to support temporal inference and state evolution. The framework maintains bounded inference cost via structured state compression and selectively audits safety-critical clinical decisions. Evaluated on longitudinal clinical event streams from MIMIC-IV, TRACE significantly improves next-event prediction accuracy, protocol adherence, and clinical safety over long-context and retrieval-augmented baselines, while producing interpretable and auditable reasoning traces.
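
Illustrative sketch (not from the paper): the abstract names a static Global Protocol, a dynamic Individual Protocol, and four components (Router, Reasoner, Auditor, Steward), but it does not specify their interfaces. The Python below is one possible reading of that data flow, under loudly stated assumptions: every class, function, field, and event key is hypothetical, the Reasoner is a plain string stand-in for a frozen LLM call, the Auditor is a trivial placeholder check, and "structured state compression" is approximated by keeping only the most recent entries.

```python
from dataclasses import dataclass, field

# All names here are hypothetical; the abstract does not define these interfaces.


@dataclass(frozen=True)
class GlobalProtocol:
    """Static memory: institutional rules that never change during a stream."""
    rules: tuple[str, ...]
    safety_critical: frozenset[str]  # event types assumed to require an audit


@dataclass
class IndividualProtocol:
    """Dynamic memory: patient-specific state, capped at max_items entries."""
    summary: list[str] = field(default_factory=list)
    max_items: int = 3


def router(event: dict, gp: GlobalProtocol) -> bool:
    """Decide whether this event is routed through the Auditor."""
    return event["type"] in gp.safety_critical


def reasoner(event: dict, gp: GlobalProtocol, ip: IndividualProtocol) -> str:
    """Stand-in for a frozen LLM call that reads both memories."""
    context = list(gp.rules) + ip.summary
    return f"proposal for {event['type']} using {len(context)} context items"


def auditor(proposal: str, gp: GlobalProtocol) -> bool:
    """Placeholder check; a real audit would test the proposal against gp.rules."""
    return bool(proposal)


def steward(event: dict, ip: IndividualProtocol) -> None:
    """Record the event, then drop the oldest entries so the state stays bounded."""
    ip.summary.append(f"{event['t']}: {event['type']}")
    excess = len(ip.summary) - ip.max_items
    if excess > 0:
        del ip.summary[:excess]


def step(event: dict, gp: GlobalProtocol, ip: IndividualProtocol) -> dict:
    """Process one streamed event: reason, selectively audit, update state."""
    proposal = reasoner(event, gp, ip)
    audited = router(event, gp)
    approved = auditor(proposal, gp) if audited else True
    steward(event, ip)
    return {"proposal": proposal, "audited": audited, "approved": approved}


if __name__ == "__main__":
    gp = GlobalProtocol(
        rules=("check allergies before prescribing",),
        safety_critical=frozenset({"medication"}),
    )
    ip = IndividualProtocol()
    for t, kind in enumerate(["admission", "lab", "medication", "lab", "discharge"]):
        print(step({"t": t, "type": kind}, gp, ip))
    print(ip.summary)
```

Because the Individual Protocol is capped at max_items entries, the context passed to the Reasoner does not grow with stream length, which is the sense in which per-event cost stays bounded in this sketch.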

Related papers

A Multi-Agent Framework for Interpreting Multivariate Physiological Time Series [9.72130666902599]
We present Vivaldi, a role-structured multi-agent system that explains multivariate physiological time series. Our experiments show that agentic pipelines substantially benefit non-thinking and medically fine-tuned models. We find that explicit tool-based computation is decisive for codifiable clinical metrics, whereas subjective targets, such as pain scores and length of stay, show limited or inconsistent changes.
arXiv Detail & Related papers (2026-03-04T14:55:46Z)
Continuous Telemonitoring of Heart Failure using Personalised Speech Dynamics [17.682803546212824]
We propose a Longitudinal Intra-Patient Tracking (LIPT) scheme to capture the trajectory of relative symptomatic changes within individuals. Central to this framework is a Sequential Personalised (PSE) which transforms longitudinal speech recordings into context-aware latent representations. Experimental results from a cohort of 225 patients demonstrate that the LIPT paradigm significantly outperforms the classic cross-sectional approaches.
arXiv Detail & Related papers (2026-02-23T10:19:17Z)
AgentsEval: Clinically Faithful Evaluation of Medical Imaging Reports via Multi-Agent Reasoning [73.50200033931148]
We introduce AgentsEval, a multi-agent stream reasoning framework that emulates the collaborative diagnostic workflow of radiologists. By dividing the evaluation process into interpretable steps including criteria definition, evidence extraction, alignment, and consistency scoring, AgentsEval provides explicit reasoning traces and structured clinical feedback. Experimental results demonstrate that AgentsEval delivers clinically aligned, semantically faithful, and interpretable evaluations that remain robust under paraphrastic, semantic, and stylistic perturbations.
arXiv Detail & Related papers (2026-01-23T11:59:13Z)
Benchmarking Egocentric Clinical Intent Understanding Capability for Medical Multimodal Large Language Models [48.95516224614331]
We introduce MedGaze-Bench, the first benchmark leveraging clinician gaze as a Cognitive Cursor to assess intent understanding across surgery, emergency simulation, and diagnostic interpretation. Our benchmark addresses three fundamental challenges: visual homogeneity of anatomical structures, strict temporal-causal dependencies in clinical, and implicit adherence to safety protocols.
arXiv Detail & Related papers (2026-01-11T02:20:40Z)
EHRSummarizer: A Privacy-Aware, FHIR-Native Architecture for Structured Clinical Summarization of Electronic Health Records [0.0]
EHRSummarizer produces structured summaries to support structured chart review. The system can be configured for data minimization, stateless processing, and flexible deployment.
arXiv Detail & Related papers (2026-01-04T21:10:42Z)
Scan-do Attitude: Towards Autonomous CT Protocol Management using a Large Language Model Agent [39.72587188702086]
A Large Language Model (LLM)-based agent framework is proposed to assist with the interpretation and execution of protocol configuration requests. The agent combines in-context learning, instruction-following, and structured tool-calling abilities to identify relevant protocol elements and apply accurate modifications.
arXiv Detail & Related papers (2025-09-24T16:04:11Z)
Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot [44.336297829718795]
We introduce CATE-B, an open-source co-pilot system that uses large language models (LLMs) within an agentic framework to guide users through treatment effect estimation. CATE-B assists in (i) constructing a structural causal model via causal discovery and LLM-based edge orientation, (ii) identifying robust adjustment sets through a novel Minimal Uncertainty Adjustment Set criterion, and (iii) selecting appropriate regression methods tailored to the causal structure and dataset characteristics.
arXiv Detail & Related papers (2025-08-14T12:20:51Z)
METHOD: Modular Efficient Transformer for Health Outcome Discovery [0.25112747242081457]
This paper introduces METHOD, a novel transformer architecture specifically designed to address the challenges of clinical sequence modelling in electronic health records. METHOD integrates three key innovations: (1) a patient-aware attention mechanism that prevents information leakage whilst enabling efficient batch processing; (2) an adaptive sliding window attention scheme that captures multi-scale temporal dependencies; and (3) a U-Net inspired architecture with dynamic skip connections for effective long sequence processing. Evaluations on the MIMIC-IV database demonstrate that METHOD consistently outperforms the state-of-the-art ETHOS model
arXiv Detail & Related papers (2025-05-16T15:52:56Z)
Contrastive Representation Learning Helps Cross-institutional Knowledge Transfer: A Study in Pediatric Ventilation Management [7.066702592883538]
We present a systematic framework for cross-institutional knowledge transfer in clinical time series. We investigate how different data regimes and fine-tuning strategies affect knowledge transfer across institutional boundaries. Our work provides insights for developing more generalizable clinical decision support systems while enabling smaller specialized units to leverage knowledge from larger centers.
arXiv Detail & Related papers (2025-01-23T11:55:13Z)
Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives [83.15653194899126]
Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management. Current VSN-based NCD detection methods primarily focus on linguistic microstructures closely tied to bottom-up, stimulus-driven cognitive processes. We propose two novel macrostructural approaches: a Dynamic Topic Model (DTM) to track topic evolution over time, and a Text-Image Temporal Alignment Network (TITAN) to measure cross-modal consistency between narrative and visual stimuli.
arXiv Detail & Related papers (2025-01-07T12:16:26Z)
Beyond One-Time Validation: A Framework for Adaptive Validation of Prognostic and Diagnostic AI-based Medical Devices [55.319842359034546]
Existing approaches often fall short in addressing the complexity of practically deploying these devices. The presented framework emphasizes the importance of repeating validation and fine-tuning during deployment. It is positioned within the current US and EU regulatory landscapes.
arXiv Detail & Related papers (2024-09-07T11:13:52Z)
Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*. Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit. Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference [50.029659413650194]
Existing methods either require expensive feature engineering or are incapable of modeling the global dependencies among the events. In this paper, we propose a novel method, Clinical Temporal ReLation Extraction with Probabilistic Soft Logic Regularization and Global Inference.
arXiv Detail & Related papers (2020-12-16T08:23:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.