Related papers: A Narrative-Driven Computational Framework for Clinician Burnout Surveillance

URL: http://arxiv.org/abs/2509.04497v1
Date: Mon, 01 Sep 2025 19:05:26 GMT
Title: A Narrative-Driven Computational Framework for Clinician Burnout Surveillance
Authors: Syed Ahmad Chan Bukhari, Fazel Keshtkar, Alyssa Meczkowska
Abstract summary: Clinician burnout poses a substantial threat to patient safety, particularly in high-acuity intensive care units (ICUs). In this study, we analyze 10,000 ICU discharge summaries from MIMIC-IV, a publicly available database derived from the electronic health records of Beth Israel Deaconess Medical Center. The dataset encompasses diverse patient data, including vital signs, medical orders, diagnoses, procedures, treatments, and deidentified free-text clinical notes.
Score: 0.5281694565226512
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Clinician burnout poses a substantial threat to patient safety, particularly in high-acuity intensive care units (ICUs). Existing research predominantly relies on retrospective survey tools or broad electronic health record (EHR) metadata, often overlooking the valuable narrative information embedded in clinical notes. In this study, we analyze 10,000 ICU discharge summaries from MIMIC-IV, a publicly available database derived from the electronic health records of Beth Israel Deaconess Medical Center. The dataset encompasses diverse patient data, including vital signs, medical orders, diagnoses, procedures, treatments, and deidentified free-text clinical notes. We introduce a hybrid pipeline that combines BioBERT sentiment embeddings fine-tuned for clinical narratives, a lexical stress lexicon tailored for clinician burnout surveillance, and five-topic latent Dirichlet allocation (LDA) with workload proxies. A provider-level logistic regression classifier achieves a precision of 0.80, a recall of 0.89, and an F1 score of 0.84 on a stratified hold-out set, surpassing metadata-only baselines by at least 0.17 in F1 score. Specialty-specific analysis indicates elevated burnout risk among providers in Radiology, Psychiatry, and Neurology. Our findings demonstrate that ICU clinical narratives contain actionable signals for proactive well-being monitoring.
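
The reported precision, recall, and F1 score can be checked against each other, since F1 is the harmonic mean of precision and recall. The following is a minimal sketch of that check, not code from the paper; the function name `f1_from_precision_recall` is an illustrative assumption.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if precision + recall == 0:
        # Both metrics are zero, so F1 is defined as zero here.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract above.
reported_precision = 0.80
reported_recall = 0.89

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 2))  # 0.84, matching the reported F1 score
```

Under this definition, precision 0.80 and recall 0.89 give roughly 0.843, which rounds to the reported 0.84.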

Related papers

SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus [39.664918145306366]
Spine disorders affect 619 million people globally and are a leading cause of disability. We introduce SpineMed, an ecosystem co-designed with practicing spine surgeons. It features SpineMed-450k, the first large-scale dataset explicitly designed for vertebral-level reasoning.
arXiv Detail & Related papers (2025-10-03T16:32:02Z)
STROKEVISION-BENCH: A Multimodal Video And 2D Pose Benchmark For Tracking Stroke Recovery [41.140934816875806]
We introduce StrokeVision-Bench, the first-ever dedicated dataset of stroke patients performing clinically structured block transfer tasks. StrokeVision-Bench comprises 1,000 annotated videos categorized into four clinically meaningful action classes. We benchmark several state-of-the-art video action recognition and skeleton-based action classification methods to establish performance baselines.
arXiv Detail & Related papers (2025-09-02T18:48:37Z)
LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs [4.262074310505135]
This paper explores prompt-based medical entity recognition using large language models (LLMs). GPT-4o with prompt ensemble achieved the highest classification performance with an F1-score of 0.95 and recall of 0.98. The ensemble method improved reliability by aggregating outputs through embedding-based similarity and majority voting.
arXiv Detail & Related papers (2025-05-13T16:11:29Z)
Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases [48.87360916431396]
We introduce MedR-Bench, a benchmarking dataset of 1,453 structured patient cases, annotated with reasoning references. We propose a framework encompassing three critical stages: examination recommendation, diagnostic decision-making, and treatment planning, simulating the entire patient care journey. Using this benchmark, we evaluate five state-of-the-art reasoning LLMs, including DeepSeek-R1, OpenAI-o3-mini, and Gemini-2.0-Flash Thinking.
arXiv Detail & Related papers (2025-03-06T18:35:39Z)
SemioLLM: Evaluating Large Language Models for Diagnostic Reasoning from Unstructured Clinical Narratives in Epilepsy [45.2233252981348]
Large Language Models (LLMs) have been shown to encode clinical knowledge. We present SemioLLM, an evaluation framework that benchmarks 6 state-of-the-art models. We show that most LLMs are able to accurately and confidently generate probabilistic predictions of seizure onset zones in the brain.
arXiv Detail & Related papers (2024-07-03T11:02:12Z)
Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data. We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes. In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z)
Foresight -- Deep Generative Modelling of Patient Timelines using Electronic Health Records [46.024501445093755]
Temporal modelling of medical history can be used to forecast and simulate future events, estimate risk, suggest alternative diagnoses or forecast complications. We present Foresight, a novel GPT3-based pipeline that uses NER+L tools (i.e. MedCAT) to convert document text into structured, coded concepts.
arXiv Detail & Related papers (2022-12-13T19:06:00Z)
A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction [8.625186194860696]
We provide a novel multimodal transformer to fuse clinical notes and structured EHR data for better prediction of in-hospital mortality. To improve interpretability, we propose an integrated gradients (IG) method to select important words in clinical notes. We also investigate the significance of domain adaptive pretraining and task adaptive fine-tuning on the Clinical BERT.
arXiv Detail & Related papers (2022-08-09T03:49:52Z)
Machine Learning to Support Triage of Children at Risk for Epileptic Seizures in the Pediatric Intensive Care Unit [5.708335717084799]
Epileptic seizures are relatively common in critically ill children admitted to the pediatric intensive care unit (PICU). Children deemed at risk for seizures within the PICU are monitored using continuous electroencephalogram (cEEG). This research aims to develop a computer-aided tool to improve seizure risk assessment in critically ill children.
arXiv Detail & Related papers (2022-05-11T10:24:58Z)
Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing [9.77063694539068]
Recently, some states within the United States of America require patients to have open access to their clinical notes. This research investigates methods for identifying security/privacy risks within clinical notes.
arXiv Detail & Related papers (2022-03-24T00:36:59Z)
Collaborative residual learners for automatic icd10 prediction using prescribed medications [45.82374977939355]
We propose a novel collaborative residual learning based model to automatically predict ICD10 codes employing only prescriptions data. For multi-label classification, we obtain average precision of 0.71 and 0.57 and F1-scores of 0.57 and 0.38, along with accuracy of 0.73 and 0.44 in predicting principal diagnosis, for inpatient and outpatient datasets respectively.
arXiv Detail & Related papers (2020-12-16T07:07:27Z)
Ensemble model for pre-discharge icd10 coding prediction [45.82374977939355]
We propose an ensemble model incorporating multiple clinical data sources for accurate code predictions. For multi-label classification, we obtain average precision of 0.73 and 0.58 and F1-scores of 0.56 and 0.35, along with accuracy of 0.71 and 0.4 in predicting principal diagnosis, for inpatient and outpatient datasets respectively.
arXiv Detail & Related papers (2020-12-16T07:02:56Z)
Predicting Clinical Trial Results by Implicit Evidence Integration [40.80948875051806]
We introduce a novel Clinical Trial Result Prediction (CTRP) task. In the CTRP framework, a model takes a PICO-formatted clinical trial proposal with its background as input and predicts the result. We exploit large-scale unstructured sentences from medical literature that implicitly contain PICOs and results as evidence.
arXiv Detail & Related papers (2020-10-12T12:25:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.