How many patients could we save with LLM priors?
- URL: http://arxiv.org/abs/2509.04250v1
- Date: Thu, 04 Sep 2025 14:23:35 GMT
- Title: How many patients could we save with LLM priors?
- Authors: Shota Arai, David Selby, Andrew Vargo, Sebastian Vollmer
- Abstract summary: We present a novel framework for hierarchical Bayesian modeling of adverse events in multi-center clinical trials. Our methodology directly obtains parametric priors from a pre-trained large language model (LLM). This methodology paves the way for more efficient and expert-informed clinical trial design.
- Score: 1.8421433205488897
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imagine a world where clinical trials need far fewer patients to achieve the same statistical power, thanks to the knowledge encoded in large language models (LLMs). We present a novel framework for hierarchical Bayesian modeling of adverse events in multi-center clinical trials, leveraging LLM-informed prior distributions. Unlike data augmentation approaches that generate synthetic data points, our methodology directly obtains parametric priors from the model. Our approach systematically elicits informative priors for hyperparameters in hierarchical Bayesian models using a pre-trained LLM, enabling the incorporation of external clinical expertise directly into Bayesian safety modeling. Through comprehensive temperature sensitivity analysis and rigorous cross-validation on real-world clinical trial data, we demonstrate that LLM-derived priors consistently improve predictive performance compared to traditional meta-analytical approaches. This methodology paves the way for more efficient and expert-informed clinical trial design, enabling substantial reductions in the number of patients required to achieve robust safety assessment and with the potential to transform drug safety monitoring and regulatory decision making.
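The core idea of the abstract — obtaining parametric priors from an LLM rather than synthetic data points, then plugging them into a Bayesian safety model — can be illustrated with a minimal sketch. This is not the paper's hierarchical multi-center model; it is a simplified single-arm Beta-Binomial conjugate update, and the LLM reply is mocked as a JSON string (in the paper's framework it would come from prompting a pre-trained LLM about expected adverse-event rates). The function names and the JSON schema are assumptions for illustration only.

```python
import json

def parse_llm_prior(llm_response: str) -> dict:
    """Parse a (mocked) LLM reply containing Beta prior hyperparameters."""
    prior = json.loads(llm_response)
    if prior["alpha"] <= 0 or prior["beta"] <= 0:
        raise ValueError("Beta hyperparameters must be positive")
    return prior

def posterior_adverse_event_rate(events: int, patients: int, prior: dict) -> float:
    """Conjugate Beta-Binomial update: posterior mean of the adverse-event rate."""
    alpha = prior["alpha"] + events
    beta = prior["beta"] + (patients - events)
    return alpha / (alpha + beta)

# Mocked LLM output: prior mean 0.05, worth ~40 "pseudo-patients" of evidence.
llm_response = '{"alpha": 2.0, "beta": 38.0}'
llm_prior = parse_llm_prior(llm_response)

# A small trial arm: 3 adverse events observed in 50 patients.
informative = posterior_adverse_event_rate(3, 50, llm_prior)
flat = posterior_adverse_event_rate(3, 50, {"alpha": 1.0, "beta": 1.0})
print(f"LLM-informed posterior mean: {informative:.4f}")   # shrunk toward prior
print(f"Uninformative posterior mean: {flat:.4f}")
```

The informative prior acts like extra patients already observed, which is exactly how a well-elicited prior can reduce the sample size needed for a given precision; the paper's contribution is doing this elicitation systematically, at the hyperparameter level of a hierarchical model, with temperature sensitivity analysis to check robustness of the elicited values.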
Related papers
- LiveClin: A Live Clinical Benchmark without Leakage [50.45415584327275]
LiveClin is a live benchmark designed to approximate real-world clinical practice.
We transform authentic patient cases into complex, multimodal evaluation scenarios that span the entire clinical pathway.
Our evaluation of 26 models on LiveClin reveals the profound difficulty of these real-world scenarios, with the top-performing model achieving a Case Accuracy of just 35.7%.
arXiv Detail & Related papers (2026-02-18T03:59:46Z) - A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine [59.78991974851707]
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis.
Most medical LLMs are trained on data from a single institution, which limits generalizability and safety in heterogeneous systems.
We introduce a model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications.
arXiv Detail & Related papers (2026-01-29T18:48:21Z) - Expert-guided Clinical Text Augmentation via Query-Based Model Collaboration [13.279553235224988]
Large language models (LLMs) have demonstrated strong generative capabilities for clinical text augmentation.
Their application in high-stakes domains like healthcare presents unique challenges due to the risk of generating clinically incorrect or misleading information.
We propose a novel query-based model collaboration framework that integrates expert-level domain knowledge to guide the augmentation process.
arXiv Detail & Related papers (2025-09-25T20:18:39Z) - Deep Learning-based Prediction of Clinical Trial Enrollment with Uncertainty Estimates [1.7099366779394252]
Accurately predicting patient enrollment, a key factor in trial success, is one of the primary challenges during the planning phase.
We propose a novel deep learning-based method to address this critical challenge.
We show that the proposed method can effectively predict the number of patients enrolled at a number of sites for a given clinical trial.
arXiv Detail & Related papers (2025-07-31T14:47:16Z) - AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
We introduce AutoElicit to extract knowledge from large language models and construct priors for predictive models.
We show these priors are informative and can be refined using natural language.
We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning.
arXiv Detail & Related papers (2024-11-26T10:13:39Z) - Stronger Baseline Models -- A Key Requirement for Aligning Machine Learning Research with Clinical Utility [0.0]
Well-known barriers exist when attempting to deploy Machine Learning models in high-stakes, clinical settings.
We show empirically that including stronger baseline models in evaluations has important downstream effects.
We propose some best practices that will enable practitioners to more effectively study and deploy ML models in clinical settings.
arXiv Detail & Related papers (2024-09-18T16:38:37Z) - Towards Automatic Evaluation for LLMs' Clinical Capabilities: Metric, Data, and Algorithm [15.627870862369784]
Large language models (LLMs) are gaining increasing interest as a way to improve clinical efficiency in medical diagnosis.
We propose an automatic evaluation paradigm tailored to assess the LLMs' capabilities in delivering clinical services.
arXiv Detail & Related papers (2024-03-25T06:17:54Z) - Large Language Model Distilling Medication Recommendation Model [58.94186280631342]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs).
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z) - Improving Clinical Decision Support through Interpretable Machine Learning and Error Handling in Electronic Health Records [6.594072648536156]
Trust-MAPS translates clinical domain knowledge into high-dimensional, mixed-integer programming models.
Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks, but also lend interpretability to ML models.
arXiv Detail & Related papers (2023-08-21T15:14:49Z) - Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response.
We formalize the prognosis modeling as a multi-modal asynchronous time series classification task.
Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.