Related papers: LLM-Assisted Emergency Triage Benchmark: Bridging Hospital-Rich and MCI-Like Field Simulation

URL: http://arxiv.org/abs/2509.26351v1
Date: Tue, 30 Sep 2025 14:54:58 GMT
Title: LLM-Assisted Emergency Triage Benchmark: Bridging Hospital-Rich and MCI-Like Field Simulation
Authors: Joshua Sebastian, Karma Tobden, KMA Solaiman
Abstract summary: We introduce an open, LLM-assisted emergency triage benchmark for deterioration prediction. The benchmark defines two regimes: (i) a hospital-rich setting with vitals, labs, notes, chief complaints, and structured observations, and (ii) an MCI-like field simulation limited to vitals, observations, and notes.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Research on emergency and mass casualty incident (MCI) triage has been limited by the absence of openly usable, reproducible benchmarks. Yet these scenarios demand rapid identification of the patients most in need, where accurate deterioration prediction can guide timely interventions. While the MIMIC-IV-ED database is openly available to credentialed researchers, transforming it into a triage-focused benchmark requires extensive preprocessing, feature harmonization, and schema alignment -- barriers that restrict accessibility to only highly technical users. We address these gaps by first introducing an open, LLM-assisted emergency triage benchmark for deterioration prediction (ICU transfer, in-hospital mortality). The benchmark then defines two regimes: (i) a hospital-rich setting with vitals, labs, notes, chief complaints, and structured observations, and (ii) an MCI-like field simulation limited to vitals, observations, and notes. Large language models (LLMs) contributed directly to dataset construction by (i) harmonizing noisy fields such as AVPU and breathing devices, (ii) prioritizing clinically relevant vitals and labs, and (iii) guiding schema alignment and efficient merging of disparate tables. We further provide baseline models and SHAP-based interpretability analyses, illustrating predictive gaps between regimes and the features most critical for triage. Together, these contributions make triage prediction research more reproducible and accessible -- a step toward dataset democratization in clinical AI.
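The two regimes described in the abstract can be read as two views over the same record, differing only in which feature groups are visible. The following is a minimal sketch under that reading, not the authors' implementation: the group names, the `select_features` helper, and the example values are all hypothetical, since the abstract lists the feature groups but not the benchmark's column schema.

```python
# Hypothetical feature-group names; the abstract names the groups
# but does not give the benchmark's actual column schema.
HOSPITAL_RICH = {"vitals", "labs", "notes", "chief_complaints", "observations"}
MCI_LIKE = {"vitals", "observations", "notes"}

REGIMES = {"hospital_rich": HOSPITAL_RICH, "mci_like": MCI_LIKE}


def select_features(record: dict, regime: str) -> dict:
    """Keep only the feature groups that the given regime allows."""
    allowed = REGIMES[regime]
    return {group: value for group, value in record.items() if group in allowed}


# Made-up example record, for illustration only.
example = {
    "vitals": {"heart_rate": 112, "spo2": 91},
    "labs": {"lactate": 3.1},
    "notes": "short of breath",
    "chief_complaints": "dyspnea",
    "observations": {"avpu": "V"},
}

# The field view drops labs and chief complaints; the hospital view keeps everything.
field_view = select_features(example, "mci_like")
hospital_view = select_features(example, "hospital_rich")
```

Training the same baseline on each view would be one way to measure the predictive gap between regimes that the abstract mentions.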

Related papers

Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing [0.0]
We present a leakage-aware benchmarking framework for early deterioration prediction. We compare hospital-rich triage with a vitals-only, MCI-like setting, restricting inputs to information available within the first hour of presentation.
arXiv Detail & Related papers (2026-02-09T09:32:49Z)
Benchmarking Egocentric Clinical Intent Understanding Capability for Medical Multimodal Large Language Models [48.95516224614331]
We introduce MedGaze-Bench, the first benchmark leveraging clinician gaze as a Cognitive Cursor to assess intent understanding across surgery, emergency simulation, and diagnostic interpretation. Our benchmark addresses three fundamental challenges: visual homogeneity of anatomical structures, strict temporal-causal dependencies in clinical, and implicit adherence to safety protocols.
arXiv Detail & Related papers (2026-01-11T02:20:40Z)
MedAlign: A Synergistic Framework of Multimodal Preference Optimization and Federated Meta-Cognitive Reasoning [52.064286116035134]
We develop MedAlign, a framework to ensure visually accurate LVLM responses for Medical Visual Question Answering (Med-VQA). We first propose a multimodal Direct Preference Optimization (mDPO) objective to align preference learning with visual context. We then design a Retrieval-Aware Mixture-of-Experts (RA-MoE) architecture that utilizes image and text similarity to route queries to a specialized and context-augmented LVLM.
arXiv Detail & Related papers (2025-10-24T02:11:05Z)
OncoReason: Structuring Clinical Reasoning in LLMs for Robust and Interpretable Survival Prediction [2.904892426557913]
Large language models (LLMs) have shown strong performance in biomedical NLP. We present a unified, multi-task learning framework that aligns autoregressive LLMs with clinical reasoning for outcome prediction. Our findings underscore the importance of reasoning-aware alignment in multi-task clinical modeling.
arXiv Detail & Related papers (2025-10-20T13:35:12Z)
Diffusion with a Linguistic Compass: Steering the Generation of Clinically Plausible Future sMRI Representations for Early MCI Conversion Prediction [13.937881108738042]
We propose a diffusion-based framework that synthesizes clinically plausible future sMRI representations directly from baseline data. Experiments on ADNI and AIBL cohorts show that MCI-Diff outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-06-05T07:01:05Z)
Enhancing In-Hospital Mortality Prediction Using Multi-Representational Learning with LLM-Generated Expert Summaries [3.5508427067904864]
In-hospital mortality (IHM) prediction for ICU patients is critical for timely interventions and efficient resource allocation. This study integrates structured physiological data and clinical notes with Large Language Model (LLM)-generated expert summaries to improve IHM prediction accuracy.
arXiv Detail & Related papers (2024-11-25T16:36:38Z)
Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data [7.815738943706123]
Large Language Models (LLMs) are traditionally tailored for natural language processing. This research investigates the adaptability of LLMs, like GPT-4, to EHR data. In response to the longitudinal, sparse, and knowledge-infused nature of EHR data, our prompting approach involves taking into account specific characteristics.
arXiv Detail & Related papers (2024-01-25T20:14:50Z)
XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data. We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions. Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z)
Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven clinical decision support. Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit. Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery. Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching. We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders. We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response. We formalize the prognosis modeling as a multi-modal asynchronous time series classification task. Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey. An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys. We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.