Will Large Language Models Transform Clinical Prediction?
- URL: http://arxiv.org/abs/2505.18246v2
- Date: Thu, 06 Nov 2025 13:47:08 GMT
- Title: Will Large Language Models Transform Clinical Prediction?
- Authors: Yusuf Yildiz, Goran Nenadic, Meghna Jani, David A. Jenkins,
- Abstract summary: Large language models (LLMs) are attracting increasing interest in healthcare.<n>This commentary evaluates the potential of LLMs to improve clinical prediction models (CPMs) for diagnostic and prognostic tasks.
- Score: 6.239284099493876
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Objective: Large language models (LLMs) are attracting increasing interest in healthcare. This commentary evaluates the potential of LLMs to improve clinical prediction models (CPMs) for diagnostic and prognostic tasks, with a focus on their ability to process longitudinal electronic health record (EHR) data. Findings: LLMs show promise in handling multimodal and longitudinal EHR data and can support multi-outcome predictions for diverse health conditions. However, methodological, validation, infrastructural, and regulatory chal- lenges remain. These include inadequate methods for time-to-event modelling, poor calibration of predictions, limited external validation, and bias affecting underrepresented groups. High infrastructure costs and the absence of clear regulatory frameworks further prevent adoption. Implications: Further work and interdisciplinary collaboration are needed to support equitable and effective integra- tion into the clinical prediction. Developing temporally aware, fair, and explainable models should be a priority focus for transforming clinical prediction workflow.
Related papers
- Beyond Traditional Diagnostics: Transforming Patient-Side Information into Predictive Insights with Knowledge Graphs and Prototypes [55.310195121276074]
We propose a Knowledge graph-enhanced, Prototype-aware, and Interpretable (KPI) framework to predict diseases.<n>It integrates structured and trusted medical knowledge into a unified disease knowledge graph, constructs clinically meaningful disease prototypes, and employs contrastive learning to enhance predictive accuracy.<n>It provides clinically valid explanations that closely align with patient narratives, highlighting its practical value for patient-centered healthcare delivery.
arXiv Detail & Related papers (2025-12-09T05:37:54Z) - CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space [49.74032713886216]
CLARITY is a medical world model that forecasts disease evolution directly within a structured latent space.<n>It explicitly integrates time intervals (temporal context) and patient-specific data (clinical context) to model treatment-conditioned progression as a smooth, interpretable trajectory.
arXiv Detail & Related papers (2025-12-08T20:42:10Z) - Training and Evaluation of Guideline-Based Medical Reasoning in LLMs [7.814266948607376]
Machine learning for early prediction in medicine has recently shown breakthrough performance.<n>The goal of this paper is to teach LLMs to follow medical consensus guidelines step-by-step in their reasoning and prediction process.
arXiv Detail & Related papers (2025-12-03T14:39:02Z) - Beyond Generative AI: World Models for Clinical Prediction, Counterfactuals, and Planning [2.800335887892072]
World models learn predictive dynamics to enable multistep rollouts, counterfactual evaluation and planning.<n>Most reviewed systems achieve L1--L2, with fewer instances of L3 and rare L4 planning/control.<n>This review outlines a research agenda for clinically robust prediction-first world models.
arXiv Detail & Related papers (2025-11-20T13:11:45Z) - Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning.<n>This paper provides the first systematic review of this emerging field.<n>We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z) - Towards Interpretable Renal Health Decline Forecasting via Multi-LMM Collaborative Reasoning Framework [12.732588046754783]
We propose a collaborative framework that enhances the performance of open-source LMMs for eGFR forecasting.<n>It incorporates visual knowledge transfer, abductive reasoning, and a short-term memory mechanism to enhance prediction accuracy and interpretability.<n>Our method sheds new light on building AI systems for healthcare that combine predictive accuracy with clinically grounded interpretability.
arXiv Detail & Related papers (2025-07-30T08:11:06Z) - Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models [52.2001050216955]
Existing methods aim to enhance the performance of Medical Vision Language Model (MedVLM) by adjusting model structure, fine-tuning with high-quality data, or through preference fine-tuning.<n>We propose an expert-in-the-loop framework named Expert-Controlled-Free Guidance (Expert-CFG) to align MedVLM with clinical expertise without additional training.
arXiv Detail & Related papers (2025-07-12T09:03:30Z) - KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs [39.47350988195002]
Large language models (LLMs) have shown promise in leveraging language abilities and biomedical knowledge for diagnosis prediction.<n>We propose KERAP, a knowledge graph (KG)-enhanced reasoning approach that improves LLM-based diagnosis prediction through a multi-agent architecture.<n>Our framework consists of a linkage agent for mapping, a retrieval agent for structured knowledge extraction, and a prediction agent that iteratively refines diagnosis predictions.
arXiv Detail & Related papers (2025-07-03T16:35:11Z) - Hallucinations and Key Information Extraction in Medical Texts: A Comprehensive Assessment of Open-Source Large Language Models [3.1406146587437904]
Large language models (LLMs) have shown significant potential in automating and improving the accuracy of such summarizations.<n>We investigate the effectiveness of open-source LLMs in extracting key events from discharge reports.<n>We also assess the prevalence of various types of hallucinations in the summaries produced by these models.
arXiv Detail & Related papers (2025-04-27T00:39:12Z) - Structured Outputs Enable General-Purpose LLMs to be Medical Experts [50.02627258858336]
Large language models (LLMs) often struggle with open-ended medical questions.<n>We propose a novel approach utilizing structured medical reasoning.<n>Our approach achieves the highest Factuality Score of 85.8, surpassing fine-tuned models.
arXiv Detail & Related papers (2025-03-05T05:24:55Z) - Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.<n>Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - The Role of Language Models in Modern Healthcare: A Comprehensive Review [2.048226951354646]
The application of large language models (LLMs) in healthcare has gained significant attention.
This review examines the trajectory of language models from their early stages to the current state-of-the-art LLMs.
arXiv Detail & Related papers (2024-09-25T12:15:15Z) - ClinicRealm: Re-evaluating Large Language Models with Conventional Machine Learning for Non-Generative Clinical Prediction Tasks [37.544994716002016]
Large Language Models (LLMs) are increasingly deployed in medicine.<n>However, their utility in non-generative clinical prediction remains under-evaluated.<n>Our ClinicRealm study addresses this by benchmarking 15 GPT-style LLMs, 5 BERT-style models, and 11 traditional methods.
arXiv Detail & Related papers (2024-07-26T06:09:10Z) - Large language models in healthcare and medical domain: A review [4.456243157307507]
Large language models (LLMs) provide proficient responses to free-text queries.
This review explores the potential of LLMs to amplify the efficiency and effectiveness of diverse healthcare applications.
arXiv Detail & Related papers (2023-12-12T20:54:51Z) - Large Language Models Illuminate a Progressive Pathway to Artificial
Healthcare Assistant: A Review [16.008511195589925]
Large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning.
This paper provides a comprehensive review on the applications and implications of LLMs in medicine.
arXiv Detail & Related papers (2023-11-03T13:51:36Z) - Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z) - Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.