AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents
- URL: http://arxiv.org/abs/2506.04293v1
- Date: Wed, 04 Jun 2025 11:50:55 GMT
- Title: AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents
- Authors: Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew W. Lo
- Abstract summary: AutoCT is a novel framework that combines the reasoning capabilities of large language models with the explainability of classical machine learning. We show that AutoCT performs on par with or better than SOTA methods on clinical trial prediction tasks within only a limited number of self-refinement iterations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clinical trials are critical for advancing medical treatments but remain prohibitively expensive and time-consuming. Accurate prediction of clinical trial outcomes can significantly reduce research and development costs and accelerate drug discovery. While recent deep learning models have shown promise by leveraging unstructured data, their black-box nature, lack of interpretability, and vulnerability to label leakage limit their practical use in high-stakes biomedical contexts. In this work, we propose AutoCT, a novel framework that combines the reasoning capabilities of large language models with the explainability of classical machine learning. AutoCT autonomously generates, evaluates, and refines tabular features based on public information without human input. Our method uses Monte Carlo Tree Search to iteratively optimize predictive performance. Experimental results show that AutoCT performs on par with or better than SOTA methods on clinical trial prediction tasks within only a limited number of self-refinement iterations, establishing a new paradigm for scalable, interpretable, and cost-efficient clinical trial prediction.
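The MCTS-driven feature refinement described in the abstract can be illustrated with a generic sketch (not the paper's implementation): candidate feature subsets form tree nodes, a toy scoring function stands in for cross-validated predictive performance, and UCB-style selection balances exploring new feature combinations against exploiting high-scoring ones. All names below, including the feature names and scoring function, are illustrative assumptions.

```python
import math

def uct_feature_search(features, score_fn, iterations=300, c=1.4):
    """Toy Monte Carlo Tree Search over feature subsets.

    Nodes are frozensets of chosen features; each child adds one feature.
    score_fn(subset) stands in for a cross-validated model score.
    """
    stats = {}  # subset -> [visit count, total reward]

    def ucb(parent, child):
        visits, total = stats[child]
        if visits == 0:
            return float("inf")  # always try unvisited children first
        return total / visits + c * math.sqrt(math.log(stats[parent][0]) / visits)

    root = frozenset()
    stats[root] = [0, 0.0]
    best, best_score = root, score_fn(root)

    for _ in range(iterations):
        node, path = root, [root]
        # Selection / expansion: descend until an unvisited node or a leaf.
        while True:
            children = [node | {f} for f in features if f not in node]
            if not children:
                break  # full feature set: nothing left to add
            for child in children:
                stats.setdefault(child, [0, 0.0])
            node = max(children, key=lambda ch: ucb(path[-1], ch))
            path.append(node)
            if stats[node][0] == 0:
                break  # stop at the first unvisited node (expansion step)
        # Simulation: evaluate the subset; backpropagation: update the path.
        reward = score_fn(node)
        if reward > best_score:
            best, best_score = node, reward
        for visited in path:
            stats[visited][0] += 1
            stats[visited][1] += reward
    return best, best_score

# Toy stand-in for predictive performance: two informative features,
# with a small penalty for carrying uninformative extras.
def toy_score(subset):
    informative = {"sponsor_track_record", "trial_phase"}
    return len(subset & informative) - 0.2 * len(subset - informative)
```

Running the search over a four-feature toy space converges on the two informative features; in the real framework, the score function would be a classical ML model evaluated on LLM-generated tabular features.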
Related papers
- AutoML-Med: A Framework for Automated Machine Learning in Medical Tabular Data
This paper introduces AutoML-Med, an Automated Machine Learning tool specifically designed to address the challenges of medical tabular data. AutoML-Med's architecture incorporates Latin Hypercube Sampling (LHS) for exploring preprocessing methods, trains models using selected metrics, and utilizes the Partial Rank Correlation Coefficient (PRCC) for fine-tuned optimization. Experimental results demonstrate AutoML-Med's effectiveness in two different clinical settings, achieving higher balanced accuracy and sensitivity, which are crucial for identifying at-risk patients.
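Latin Hypercube Sampling, which AutoML-Med reportedly uses to explore preprocessing configurations, can be sketched with the standard textbook construction (a generic illustration, not AutoML-Med's code):

```python
import random

def latin_hypercube(n_samples, n_dims, rng=None):
    """Latin Hypercube Sampling on the unit cube [0, 1)^d.

    Each dimension is cut into n_samples equal strata; exactly one point
    lands in each stratum, and each dimension's strata are shuffled
    independently so coordinates are decorrelated across dimensions.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # One jittered point per stratum: (stratum index + U[0,1)) / n.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    # Transpose per-dimension columns into per-sample points.
    return [list(point) for point in zip(*columns)]
```

Mapping each unit-interval coordinate onto a preprocessing choice or hyperparameter range then yields a space-filling design: every stratum of every dimension is sampled exactly once, unlike plain random search.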
arXiv Detail & Related papers (2025-08-04T17:13:45Z)
- Deep Learning-based Prediction of Clinical Trial Enrollment with Uncertainty Estimates
Accurately predicting patient enrollment, a key factor in trial success, is one of the primary challenges during the planning phase. We propose a novel deep learning-based method to address this critical challenge. We show that the proposed method can effectively predict the number of patients enrolled across sites for a given clinical trial.
arXiv Detail & Related papers (2025-07-31T14:47:16Z)
- Advancing clinical trial outcomes using deep learning and predictive modelling: bridging precision medicine and patient-centered care
Deep learning and predictive modelling have emerged as transformative tools for optimizing clinical trial design, patient recruitment, and real-time monitoring. This study explores the application of deep learning techniques, such as convolutional neural networks (CNNs) and transformer-based models, to stratify patients. Predictive modelling approaches, including survival analysis and time-series forecasting, are employed to predict trial outcomes, enhancing efficiency and reducing trial failure rates.
arXiv Detail & Related papers (2024-12-09T23:20:08Z)
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language model (LLM) reasoning. Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- ClinicRealm: Re-evaluating Large Language Models with Conventional Machine Learning for Non-Generative Clinical Prediction Tasks
Large Language Models (LLMs) are increasingly deployed in medicine. However, their utility in non-generative clinical prediction remains under-evaluated. Our ClinicRealm study addresses this by benchmarking 9 GPT-based LLMs, 5 BERT-based models, and 7 traditional methods on unstructured clinical notes and structured Electronic Health Records.
arXiv Detail & Related papers (2024-07-26T06:09:10Z)
- TrialDura: Hierarchical Attention Transformer for Interpretable Clinical Trial Duration Prediction
We propose TrialDura, a machine learning-based method that estimates the duration of clinical trials using multimodal data.
We encode the trial's multimodal data into Bio-BERT embeddings, specifically tuned for biomedical contexts, to provide a deeper and more relevant semantic understanding.
Our proposed model demonstrated superior performance, with a mean absolute error (MAE) of 1.04 years and a root mean square error (RMSE) of 1.39 years, compared to the other models.
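The MAE and RMSE figures quoted above are the standard regression error metrics; a minimal reference implementation over duration predictions in years:

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: the average size of the duration miss."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large misses more than MAE."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

Because RMSE squares each residual before averaging, RMSE >= MAE always holds, and a wide gap between the two (as in the 1.39 vs 1.04 years reported) indicates that a minority of trials are missed by a large margin.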
arXiv Detail & Related papers (2024-04-20T02:12:59Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient-trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
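The provenance-then-check pattern described here can be sketched generically (the helper callables are hypothetical placeholders standing in for LLM calls, not the paper's API): an extractor returns each value together with a supporting evidence span, and a result is kept only when that evidence actually occurs in the source and a verification check passes.

```python
def self_verify(text, extract, verify):
    """Generic extract-then-verify loop over clinical text.

    extract(text) -> {field: (value, evidence_span)}; verify(...) re-checks
    a value against its claimed evidence. A field survives only when the
    evidence span actually occurs in the source AND the check passes, so
    every returned value carries grounded provenance.
    """
    verified = {}
    for field, (value, evidence) in extract(text).items():
        if evidence in text and verify(text, field, value, evidence):
            verified[field] = {"value": value, "evidence": evidence}
    return verified
```

In the framework described above, both callables would be prompts to the same LLM; the key property is that hallucinated extractions whose cited evidence is absent from the source are filtered out before they reach the user.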
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Rationale production to support clinical decision-making
We apply InfoCal to the task of predicting hospital readmission using hospital discharge notes.
We find that each presented model, combined with selected interpretability or feature-importance methods, yields varying results.
arXiv Detail & Related papers (2021-11-15T09:02:10Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.