Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records
- URL: http://arxiv.org/abs/2408.05074v5
- Date: Wed, 11 Dec 2024 10:14:32 GMT
- Title: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records
- Authors: Sangjoon Park, Chan Woo Wee, Seo Hee Choi, Kyung Hwan Kim, Jee Suk Chang, Hong In Yoon, Ik Jae Lee, Yong Bae Kim, Jaeho Cho, Ki Chang Keum, Chang Geol Lee, Hwa Kyung Byun, Woong Sub Koom,
- Abstract summary: This study developed and validated the RTSurv framework to structure unstructured electronic health records alongside structured clinical data.
Using unstructured data from 34,276 patients and an external cohort of 852, the framework successfully transformed unstructured information into structured formats.
- Score: 2.608410928225647
- License:
- Abstract: Accurate survival prediction in radiotherapy (RT) is critical for optimizing treatment decisions. This study developed and validated the RT-Surv framework, which integrates general-domain, open-source large language models (LLMs) to structure unstructured electronic health records alongside structured clinical data. Using data from 34,276 patients and an external cohort of 852, the framework successfully transformed unstructured clinical information into structured formats. Incorporating LLM-structured clinical features improved the concordance index from 0.779 to 0.842 during external validation, demonstrating a significant performance enhancement. Key LLM-structured features, such as disease extent, general condition, and RT purpose, showed high predictive importance and aligned closely with statistically significant predictors identified through conventional statistical analyses, thereby improving model interpretability. Furthermore, the framework enhanced risk stratification, enabling more distinct differentiation among low-, intermediate-, and high-risk groups (p < 0.001) using LLM-structured clinical features. These findings highlight the potential of LLMs to convert unstructured data into actionable insights, improving predictive modeling and patient outcomes in clinics.
Related papers
- Enhancing In-Hospital Mortality Prediction Using Multi-Representational Learning with LLM-Generated Expert Summaries [3.5508427067904864]
In-hospital mortality (IHM) prediction for ICU patients is critical for timely interventions and efficient resource allocation.
This study integrates structured physiological data and clinical notes with Large Language Model (LLM)-generated expert summaries to improve IHM prediction accuracy.
arXiv Detail & Related papers (2024-11-25T16:36:38Z) - Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - DALL-M: Context-Aware Clinical Data Augmentation with LLMs [13.827368628263997]
Radiologists often find chest X-rays insufficient for diagnosing underlying diseases.
We present a novel framework to enhance the clinical context through augmentation techniques with clinical data.
We introduce a pioneering approach to clinical data augmentation that employs large language models to generate patient contextual synthetic data.
arXiv Detail & Related papers (2024-07-11T07:01:50Z) - XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z) - ChatRadio-Valuer: A Chat Large Language Model for Generalizable
Radiology Report Generation Based on Multi-institution and Multi-system Data [115.0747462486285]
ChatRadio-Valuer is a tailored model for automatic radiology report generation that learns generalizable representations.
The clinical dataset utilized in this study encompasses a remarkable total of textbf332,673 observations.
ChatRadio-Valuer consistently outperforms state-of-the-art models, especially ChatGPT (GPT-3.5-Turbo) and GPT-4 et al.
arXiv Detail & Related papers (2023-10-08T17:23:17Z) - TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic
Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z) - Large Language Models for Healthcare Data Augmentation: An Example on
Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM)
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data
for Interpretable In-Hospital Mortality Prediction [8.625186194860696]
We provide a novel multimodal transformer to fuse clinical notes and structured EHR data for better prediction of in-hospital mortality.
To improve interpretability, we propose an integrated gradients (IG) method to select important words in clinical notes.
We also investigate the significance of domain adaptive pretraining and task adaptive fine-tuning on the Clinical BERT.
arXiv Detail & Related papers (2022-08-09T03:49:52Z) - Early Prediction of Mortality in Critical Care Setting in Sepsis
Patients Using Structured Features and Unstructured Clinical Notes [4.387308555401595]
Using the MIMIC-III database, we integrated demographic data, physiological measurements and clinical notes.
We built and applied several machine learning models to predict the risk of hospital mortality and 30-day mortality in sepsis patients.
arXiv Detail & Related papers (2021-11-09T19:57:05Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - Predicting Clinical Trial Results by Implicit Evidence Integration [40.80948875051806]
We introduce a novel Clinical Trial Result Prediction (CTRP) task.
In the CTRP framework, a model takes a PICO-formatted clinical trial proposal with its background as input and predicts the result.
We exploit large-scale unstructured sentences from medical literature that implicitly contain PICOs and results as evidence.
arXiv Detail & Related papers (2020-10-12T12:25:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.