The Framework That Survives Bad Models: Human-AI Collaboration For Clinical Trials
- URL: http://arxiv.org/abs/2510.06567v1
- Date: Wed, 08 Oct 2025 01:40:41 GMT
- Title: The Framework That Survives Bad Models: Human-AI Collaboration For Clinical Trials
- Authors: Yao Chen, David Ohlssen, Aimee Readie, Gregory Ligozio, Ruvie Martin, Thibaud Coroller
- Abstract summary: Using AI as a supporting reader (AI-SR) is the most suitable approach for clinical trials, as it meets all criteria across various model types, even with bad models. This method consistently provides reliable disease estimation, preserves clinical trial treatment effect estimates and conclusions, and retains these advantages when applied to different populations.
- Score: 2.6377299508948746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) holds great promise for supporting clinical trials, from patient recruitment and endpoint assessment to treatment response prediction. However, deploying AI without safeguards poses significant risks, particularly when evaluating patient endpoints that directly impact trial conclusions. We compared two AI frameworks against human-only assessment for medical image-based disease evaluation, measuring cost, accuracy, robustness, and generalization ability. To stress-test these frameworks, we injected bad models, ranging from random guesses to naive predictions, to ensure that observed treatment effects remain valid even under severe model degradation. We evaluated the frameworks using two randomized controlled trials with endpoints derived from spinal X-ray images. Our findings indicate that using AI as a supporting reader (AI-SR) is the most suitable approach for clinical trials, as it meets all criteria across various model types, even with bad models. This method consistently provides reliable disease estimation, preserves clinical trial treatment effect estimates and conclusions, and retains these advantages when applied to different populations.
Related papers
- A Causal Machine Learning Framework for Treatment Personalization in Clinical Trials: Application to Ulcerative Colitis [0.7799711162530713]
We present a modular causal machine learning framework that evaluates each question separately. We apply this framework to patient-level data from the UNIFI maintenance trial of ustekinumab in ulcerative colitis.
arXiv Detail & Related papers (2026-02-09T00:26:30Z)
- A systematic evaluation of uncertainty quantification techniques in deep learning: a case study in photoplethysmography signal analysis [1.6690512882610855]
Deep learning models can be used to continuously monitor physiological parameters outside of clinical settings. There is a risk of poor performance when they are deployed in practical measurement scenarios, leading to negative patient outcomes. Here we apply eight uncertainty quantification (UQ) techniques to models trained on two clinically relevant prediction tasks.
arXiv Detail & Related papers (2025-10-31T22:54:13Z)
- Assessing the robustness of heterogeneous treatment effects in survival analysis under informative censoring [50.164756034797136]
Dropout is common in clinical studies, with up to half of patients leaving early due to side effects or other reasons. When dropout is informative, it introduces censoring bias, because of which treatment effect estimates are also biased. We propose an assumption-lean framework to assess the robustness of conditional average treatment effect estimates in survival analysis when facing censoring bias.
arXiv Detail & Related papers (2025-10-15T10:51:17Z)
- Ethical considerations of use of hold-out sets in clinical prediction model management [0.4194295877935868]
We focus on the ethical principles of beneficence, non-maleficence, autonomy and justice.
We also discuss statistical issues arising from different hold-out set sampling methods.
arXiv Detail & Related papers (2024-06-05T11:42:46Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted? [2.0089256058364358]
The absence of transparency and explainability hinders the clinical adoption of machine learning (ML) algorithms.
This study evaluates two popular XAI methods used for explaining predictive models in the healthcare context.
arXiv Detail & Related papers (2023-06-21T02:29:30Z)
- Improving Image-Based Precision Medicine with Uncertainty-Aware Causal Models [3.5770353345663053]
We use Bayesian deep learning for estimating the posterior distribution over factual and counterfactual outcomes on several treatments.
We train and evaluate this model to predict future new and enlarging T2 lesion counts on a large, multi-center dataset of MR brain images of patients with multiple sclerosis.
arXiv Detail & Related papers (2023-05-05T20:08:40Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- Bayesian prognostic covariate adjustment [59.75318183140857]
Historical data about disease outcomes can be integrated into the analysis of clinical trials in many ways.
We build on existing literature that uses prognostic scores from a predictive model to increase the efficiency of treatment effect estimates.
arXiv Detail & Related papers (2020-12-24T05:19:03Z)
- MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response [58.0291320452122]
This paper aims at a unified deep learning approach to predict patient prognosis and therapy response.
We formalize the prognosis modeling as a multi-modal asynchronous time series classification task.
Our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
arXiv Detail & Related papers (2020-10-08T15:30:17Z)
- Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breaking down and health care structures collapsing.
This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients.
Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.