Related papers: Evaluating the Impact of Lab Test Results on Large Language Models Generated Differential Diagnoses from Clinical Case Vignettes

Evaluating Large Language Models for Multimodal Simulated Ophthalmic Decision-Making in Diabetic Retinopathy and Glaucoma Screening [37.69303106863453]
Large language models (LLMs) can simulate clinical reasoning based on natural language prompts, but their utility in ophthalmology is largely unexplored. This study evaluated GPT-4's ability to interpret structured textual descriptions of retinal fundus photographs. We conducted a retrospective diagnostic validation study using 300 annotated fundus images.
arXiv Detail & Related papers (2025-07-02T01:35:59Z)
Universal Laboratory Model: prognosis of abnormal clinical outcomes based on routine tests [0.0]
Combining routine biochemical panels with the Common Blood Count (CBC) test presents a set of test-value pairs that varies from patient to patient, or, in common settings, a table with missing values. We apply this method to clinical laboratory data to predict high uric acid, glucose, cholesterol, and low ferritin levels. We achieve an improvement of up to 8% AUC for joint predictions of high uric acid, glucose, cholesterol, and low ferritin levels.
arXiv Detail & Related papers (2025-06-18T10:10:02Z)
MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks [47.486705282473984]
Large language models (LLMs) achieve near-perfect scores on medical exams. These evaluations inadequately reflect the complexity and diversity of real-world clinical practice. We introduce MedHELM, an evaluation framework for assessing LLM performance on medical tasks.
arXiv Detail & Related papers (2025-05-26T22:55:49Z)
Predicting Length of Stay in Neurological ICU Patients Using Classical Machine Learning and Neural Network Models: A Benchmark Study on MIMIC-IV [49.1574468325115]
This study explores multiple ML approaches for predicting LOS in the ICU, specifically for patients with neurological diseases, based on the MIMIC-IV dataset. The evaluated models include classic ML algorithms (K-Nearest Neighbors, Random Forest, XGBoost and CatBoost) and Neural Networks (LSTM, BERT and Temporal Fusion Transformer).
arXiv Detail & Related papers (2025-05-23T14:06:42Z)
MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports [49.00805568780791]
We introduce MedCaseReasoning, the first open-access dataset for evaluating Large Language Models (LLMs) on their ability to align with clinician-authored diagnostic reasoning. The dataset includes 14,489 diagnostic question-and-answer cases, each paired with detailed reasoning statements. We evaluate state-of-the-art reasoning LLMs on MedCaseReasoning and find significant shortcomings in their diagnoses and reasoning.
arXiv Detail & Related papers (2025-05-16T22:34:36Z)
ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model [7.058358371583673]
We introduce ClinicalGPT-R1, a reasoning enhanced generalist large language model for disease diagnosis. Trained on a dataset of 20,000 real-world clinical records, ClinicalGPT-R1 leverages diverse training strategies to enhance diagnostic reasoning.
arXiv Detail & Related papers (2025-04-13T04:00:40Z)
Leveraging LLMs for Predicting Unknown Diagnoses from Clinical Notes [21.43498764977656]
Discharge summaries tend to provide more complete information, which can help infer accurate diagnoses. This study investigates whether large language models (LLMs) can predict implicitly mentioned diagnoses from clinical notes and link them to corresponding medications.
arXiv Detail & Related papers (2025-03-28T02:15:57Z)
Multimodal Lead-Specific Modeling of ECG for Low-Cost Pulmonary Hypertension Assessment [71.69065905466567]
Pulmonary hypertension (PH) is frequently underdiagnosed in low- and middle-income countries (LMICs) due to the scarcity of advanced diagnostic tools. We propose Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), a model pre-trained on large-population 12L-ECG data. LS-EMVAE makes better predictions on both 12L-ECG and 6L-ECG at inference, making it an equitable solution for areas with limited or no diagnostic tools.
arXiv Detail & Related papers (2025-03-03T16:16:38Z)
CardioLab: Laboratory Values Estimation and Monitoring from Electrocardiogram Signals -- A Multimodal Deep Learning Approach [1.068128849363198]
We utilize the MIMIC-IV dataset to develop multimodal deep-learning models to demonstrate the feasibility of estimating (real-time) and monitoring (predict at future intervals) laboratory value abnormalities. The models exhibit strong predictive performance with AUROC scores above 0.70 in a statistically significant manner for 23 laboratory values in the estimation setting and up to 26 values in the monitoring setting.
arXiv Detail & Related papers (2024-11-22T12:10:03Z)
Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding [44.01429184037945]
We introduce ALIGN, a novel compositional LLM-based system for automated, zero-shot medical coding. We evaluate ALIGN on harmonizing medication terms into Anatomical Therapeutic Chemical (ATC) and medical history terms into Medical Dictionary for Regulatory Activities (MedDRA) codes.
arXiv Detail & Related papers (2024-11-20T09:59:12Z)
Fine-Tuning In-House Large Language Models to Infer Differential Diagnosis from Radiology Reports [1.5972172622800358]
This study introduces a pipeline for developing in-house LLMs tailored to identify differential diagnoses from radiology reports. Evaluated on a set of 1,067 reports annotated by clinicians, the proposed model achieves an average F1 score of 92.1%, which is on par with GPT-4.
arXiv Detail & Related papers (2024-10-11T20:16:25Z)
Lab-AI -- Retrieval-Augmented Language Model for Personalized Lab Test Interpretation in Clinical Medicine [8.888389873289913]
Most patient portals use universal normal ranges, ignoring factors like age and gender. This study introduces Lab-AI, an interactive system that offers personalized normal ranges using Retrieval-Augmented Generation (RAG) from credible health sources.
arXiv Detail & Related papers (2024-09-16T20:36:17Z)
Methodology and Real-World Applications of Dynamic Uncertain Causality Graph for Clinical Diagnosis with Explainability and Invariance [41.373856519548404]
The Dynamic Uncertain Causality Graph (DUCG) approach is causality-driven, explainable, and invariant across different application scenarios. 46 DUCG models covering 54 chief complaints were constructed. Over one million real diagnosis cases have been performed, with only 17 incorrect diagnoses identified.
arXiv Detail & Related papers (2024-06-09T11:37:45Z)
Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias [5.421033429862095]
Cognitive biases in clinical decision-making significantly contribute to errors in diagnosis and suboptimal patient outcomes. This study explores the role of large language models in mitigating these biases through the utilization of a multi-agent framework.
arXiv Detail & Related papers (2024-01-26T01:35:50Z)
Electromyography Signal Classification Using Deep Learning [0.0]
We have implemented a deep learning model with L2 regularization and trained it on Electromyography (EMG) data. The data comprise EMG signals collected from a control group, myopathy patients and ALS patients. The model was able to distinguish the normal cases (control group) from the others at a precision of 100 percent and to classify myopathy and ALS with high accuracy of 97.4 and 98.2 percent, respectively.
arXiv Detail & Related papers (2023-05-06T10:44:38Z)
Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists to predict the histological score available on a small annex dataset. We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis. This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
Deep learning-based COVID-19 pneumonia classification using chest CT images: model generalizability [54.86482395312936]
Deep learning (DL) classification models were trained to identify COVID-19-positive patients on 3D computed tomography (CT) datasets from different countries. We trained nine identical DL-based classification models by using combinations of the datasets with a 72% train, 8% validation, and 20% test data split. The models trained on multiple datasets and evaluated on a test set from one of the datasets used for training performed better.
arXiv Detail & Related papers (2021-02-18T21:14:52Z)
HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment. In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
Collaborative residual learners for automatic icd10 prediction using prescribed medications [45.82374977939355]
We propose a novel collaborative residual learning based model to automatically predict ICD10 codes employing only prescription data. For the inpatient and outpatient datasets respectively, we obtain multi-label classification average precision of 0.71 and 0.57, F1-scores of 0.57 and 0.38, and accuracy of 0.73 and 0.44 in predicting the principal diagnosis.
arXiv Detail & Related papers (2020-12-16T07:07:27Z)
Identification of Ischemic Heart Disease by using machine learning technique based on parameters measuring Heart Rate Variability [50.591267188664666]
In this study, 18 non-invasive features (age, gender, left ventricular ejection fraction and 15 obtained from HRV) of 243 subjects were used to train and validate a series of artificial neural networks (ANNs). The best result was obtained using 7 input parameters and 7 hidden nodes, with accuracies of 98.9% and 82% on the training and validation datasets, respectively.
arXiv Detail & Related papers (2020-10-29T19:14:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.