Related papers: Large Language Models are Powerful EHR Encoders

Large Language Models are Powerful EHR Encoders

URL: http://arxiv.org/abs/2502.17403v2
Date: Tue, 04 Mar 2025 16:36:52 GMT
Title: Large Language Models are Powerful EHR Encoders
Authors: Stefan Hegselmann, Georg von Arnim, Tillmann Rheude, Noel Kronenberg, David Sontag, Gerhard Hindricks, Roland Eils, Benjamin Wild,
Abstract summary: Domain-specific EHR foundation models have demonstrated promising improvements in predictive accuracy and generalization.<n>We explore the possibility of using general-purpose Large Language Models (LLMs) based embedding methods as EHR encoders.<n>We evaluate two state-of-the-art LLM-embedding models, GTE-Qwen2-7B-Instruct and LLM2Vec-Llama3.1-8B-Instruct, across 15 diverse clinical prediction tasks.
Score: 4.520903886487343
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Electronic Health Records (EHRs) offer rich potential for clinical prediction, yet their inherent complexity and heterogeneity pose significant challenges for traditional machine learning approaches. Domain-specific EHR foundation models trained on large collections of unlabeled EHR data have demonstrated promising improvements in predictive accuracy and generalization; however, their training is constrained by limited access to diverse, high-quality datasets and inconsistencies in coding standards and healthcare practices. In this study, we explore the possibility of using general-purpose Large Language Models (LLMs) based embedding methods as EHR encoders. By serializing patient records into structured Markdown text, transforming codes into human-readable descriptors, we leverage the extensive generalization capabilities of LLMs pretrained on vast public corpora, thereby bypassing the need for proprietary medical datasets. We systematically evaluate two state-of-the-art LLM-embedding models, GTE-Qwen2-7B-Instruct and LLM2Vec-Llama3.1-8B-Instruct, across 15 diverse clinical prediction tasks from the EHRSHOT benchmark, comparing their performance to an EHRspecific foundation model, CLIMBR-T-Base, and traditional machine learning baselines. Our results demonstrate that LLM-based embeddings frequently match or exceed the performance of specialized models, even in few-shot settings, and that their effectiveness scales with the size of the underlying LLM and the available context window. Overall, our findings demonstrate that repurposing LLMs for EHR encoding offers a scalable and effective approach for clinical prediction, capable of overcoming the limitations of traditional EHR modeling and facilitating more interoperable and generalizable healthcare applications.

Related papers

Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank.<n>It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z)
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation [89.3260120072177]
We propose a novel Historical-Constrained Large Language Models (HC-LLM) framework for Radiology report generation. Our approach extracts both time-shared and time-specific features from longitudinal chest X-rays and diagnostic reports to capture disease progression. Notably, our approach performs well even without historical data during testing and can be easily adapted to other multimodal large models.
arXiv Detail & Related papers (2024-12-15T06:04:16Z)
AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
We introduce AutoElicit to extract knowledge from large language models and construct priors for predictive models.<n>We show these priors are informative and can be refined using natural language.<n>We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning.
arXiv Detail & Related papers (2024-11-26T10:13:39Z)
Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning. Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
Is larger always better? Evaluating and prompting large language models for non-generative medical tasks [11.799956298563844]
This study benchmarks various models, including GPT-based LLMs, BERT-based models, and traditional clinical predictive models. We focused on tasks such as readmission and prediction, disease hierarchy reconstruction, and biomedical sentence matching. Results indicated that LLMs exhibited robust zero-shot predictive capabilities on structured EHR data when using well-designed prompting strategies. For unstructured medical texts, LLMs did not outperform finetuned BERT models, which excelled in both supervised and unsupervised tasks.
arXiv Detail & Related papers (2024-07-26T06:09:10Z)
Developing Healthcare Language Model Embedding Spaces [0.20971479389679337]
Pre-trained Large Language Models (LLMs) often struggle on out-of-domain datasets like healthcare focused text. Three methods are assessed: traditional masked language modeling, Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR) and a novel pre-training objective utilizing metadata categories from the healthcare settings. Contrastively trained models outperform other approaches on the classification tasks, delivering strong performance from limited labeled data and with fewer model parameter updates required.
arXiv Detail & Related papers (2024-03-28T19:31:32Z)
REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models [19.62552013839689]
Existing models often lack the medical context relevent to clinical tasks, prompting the incorporation of external knowledge. We propose REALM, a Retrieval-Augmented Generation (RAG) driven framework to enhance multimodal EHR representations. Our experiments on MIMIC-III mortality and readmission tasks showcase the superior performance of our REALM framework over baselines.
arXiv Detail & Related papers (2024-02-10T18:27:28Z)
Emergency Department Decision Support using Clinical Pseudo-notes [0.4487265603408873]
We introduce the Multiple Embedding Model for EHR (MEME) MEME serializes multimodal EHR data into text using pseudo-notes, mimicking clinical text generation. We demonstrate the effectiveness of MEME by applying it to several decision support tasks within the Emergency Department across multiple hospital systems.
arXiv Detail & Related papers (2024-01-31T20:31:56Z)
Prompting Large Language Models for Zero-Shot Clinical Prediction with Structured Longitudinal Electronic Health Record Data [7.815738943706123]
Large Language Models (LLMs) are traditionally tailored for natural language processing. This research investigates the adaptability of LLMs, like GPT-4, to EHR data. In response to the longitudinal, sparse, and knowledge-infused nature of EHR data, our prompting approach involves taking into account specific characteristics.
arXiv Detail & Related papers (2024-01-25T20:14:50Z)
Clinical Risk Prediction Using Language Models: Benefits And Considerations [23.781690889237794]
This study focuses on using structured descriptions within vocabularies to make predictions exclusively based on that information. We find that employing LMs to represent structured EHRs leads to improved or at least comparable performance in diverse risk prediction tasks.
arXiv Detail & Related papers (2023-11-29T04:32:19Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports. We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM. We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians. Recent studies have achieved promising results in automatic impression generation using large-scale medical text data. These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM) Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD) The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities. We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data. Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.