On Aligning Prediction Models with Clinical Experiential Learning: A Prostate Cancer Case Study
- URL: http://arxiv.org/abs/2509.04053v1
- Date: Thu, 04 Sep 2025 09:32:19 GMT
- Title: On Aligning Prediction Models with Clinical Experiential Learning: A Prostate Cancer Case Study
- Authors: Jacqueline J. Vallon, William Overman, Wanqiao Xu, Neil Panjwani, Xi Ling, Sushmita Vij, Hilary P. Bagshaw, John T. Leppert, Sumit Shah, Geoffrey Sonn, Sandy Srinivas, Erqi Pollom, Mark K. Buyyounouski, Mohsen Bayati
- Abstract summary: We present a framework for investigating this misalignment between model behavior and clinical experiential learning. We first identify and address these inconsistencies by incorporating clinical knowledge, collected via a survey, as constraints in the ML model. The approach shows that aligning the ML model with clinical experiential learning is possible without compromising performance.
- Score: 5.877851309205959
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past decade, the use of machine learning (ML) models in healthcare applications has rapidly increased. Despite high performance, modern ML models do not always capture patterns the end user requires. For example, a model may predict a relationship between cancer stage and survival that is not monotonically decreasing, with all other features held fixed. In this paper, we present a reproducible framework for investigating this misalignment between model behavior and clinical experiential learning, focusing on the effects of underspecification in modern ML pipelines. In a prostate cancer outcome prediction case study, we first identify and address these inconsistencies by incorporating clinical knowledge, collected via a survey, as constraints in the ML model, and subsequently analyze the impact on model performance and behavior across degrees of underspecification. The approach shows that aligning the ML model with clinical experiential learning is possible without compromising performance. Motivated by recent literature in generative AI, we further examine the feasibility of a feedback-driven alignment approach in non-generative clinical risk prediction models through a randomized experiment with clinicians. Our findings illustrate that, when clinicians' model preferences are elicited with our proposed methodology, the larger the difference in how the constrained and unconstrained models predict for a patient, the more apparent the difference is in clinical interpretation.
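The abstract's motivating example, a predicted survival probability that fails to decrease monotonically with cancer stage, can be illustrated with a post-hoc monotonicity projection. The sketch below is not the authors' method (the paper incorporates survey-elicited clinical knowledge as constraints during training); it is a minimal pure-Python Pool Adjacent Violators (isotonic regression) pass that projects per-stage predictions onto the nearest non-increasing sequence. All values are made up for illustration.

```python
def pav_increasing(y):
    """Pool Adjacent Violators: least-squares non-decreasing fit to y."""
    # Each block holds [mean value, weight, number of original points].
    blocks = []
    for yi in y:
        blocks.append([yi, 1.0, 1])
        # Merge adjacent blocks while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            w = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / w, w, n1 + n2])
    out = []
    for v, _, n in blocks:
        out.extend([v] * n)
    return out

def enforce_nonincreasing(preds):
    """Project raw per-stage predictions onto a non-increasing sequence."""
    return [-v for v in pav_increasing([-p for p in preds])]

# Hypothetical survival probabilities for stages I-IV: the unconstrained
# model predicts an implausible bump at stage III, which the projection
# smooths away by averaging the violating pair.
raw = [0.90, 0.75, 0.80, 0.40]
print(enforce_nonincreasing(raw))  # stages II and III are pooled to 0.775
```

In the paper's setting the constraint is imposed on the model itself rather than patched onto its outputs, but the projection makes the notion of "aligning with clinical experiential learning" concrete.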
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSIs) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores from these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z) - Clinical semantics for lung cancer prediction [1.6744500686720596]
Existing clinical prediction models often represent patient data using features that ignore semantic relationships between clinical concepts. This study integrates domain-specific semantic information by mapping the SNOMED medical term hierarchy into a low-dimensional hyperbolic space.
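The summary above rests on one geometric primitive: distances in the Poincare ball grow rapidly toward the boundary, which is what lets tree-shaped hierarchies such as the SNOMED is-a graph embed with low distortion in few dimensions. A minimal pure-Python sketch of the Poincare distance (the coordinates below are illustrative placeholders, not embeddings from the paper):

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the open unit ball:
    d(u, v) = acosh(1 + 2*|u-v|^2 / ((1-|u|^2)(1-|v|^2)))."""
    sq_norm = lambda x: sum(xi * xi for xi in x)
    diff = sq_norm([ui - vi for ui, vi in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)

# A general concept sits near the origin; deep, specific concepts sit
# near the boundary. Two deep concepts in sibling branches end up much
# farther apart than either is from the root, mimicking tree distance.
root   = [0.0, 0.0]    # e.g. a broad "disorder" concept
leaf_a = [0.90, 0.0]   # deep concept in one branch
leaf_b = [0.0, 0.90]   # deep concept in a sibling branch
print(poincare_distance(root, leaf_a))
print(poincare_distance(leaf_a, leaf_b))  # larger: path runs back toward the root
```

Learning the actual embedding requires Riemannian optimization over the ball, but the distance function alone shows why hyperbolic space suits hierarchies better than Euclidean space of the same dimension.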
arXiv Detail & Related papers (2025-08-20T11:29:47Z) - Evaluating Machine Learning Models against Clinical Protocols for Enhanced Interpretability and Continuity of Care [39.58317527488534]
In clinical practice, decision-making relies heavily on established protocols, often formalised as rules.
Despite the growing number of Machine Learning applications, their adoption into clinical practice remains limited.
We propose metrics to assess the accuracy of ML models with respect to the established protocol.
arXiv Detail & Related papers (2024-11-05T13:50:09Z) - CPLLM: Clinical Prediction with Large Language Models [0.07083082555458872]
We present a method that involves fine-tuning a pre-trained Large Language Model (LLM) for clinical disease and readmission prediction.
For diagnosis prediction, we predict whether patients will be diagnosed with a target disease during their next visit or in the subsequent diagnosis, leveraging their historical diagnosis records.
Our experiments have shown that our proposed method, CPLLM, surpasses all the tested models in terms of PR-AUC and ROC-AUC metrics.
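ROC-AUC, one of the two metrics reported above, has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one (the Mann-Whitney U statistic). A minimal pure-Python sketch, with made-up labels and scores for illustration:

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of positive/negative pairs ranked
    correctly; ties between scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
# 8 of the 9 positive/negative pairs are ordered correctly (the 0.4
# positive scores below the 0.7 negative), so the AUC is 8/9.
print(roc_auc(labels, scores))
```

PR-AUC is computed analogously from the precision-recall curve and is the more informative of the two when, as in readmission prediction, positives are rare.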
arXiv Detail & Related papers (2023-09-20T13:24:12Z) - CTP: A Causal Interpretable Model for Non-Communicable Disease Progression Prediction [12.282670150417953]
We propose a novel model called causal trajectory prediction (CTP) to tackle the limitation.
CTP combines trajectory prediction and causal discovery to enable accurate prediction of disease progression trajectories.
We evaluate the performance of the model using simulated and real medical datasets.
arXiv Detail & Related papers (2023-08-18T06:58:31Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z) - Clinical outcome prediction under hypothetical interventions -- a representation learning framework for counterfactual reasoning [31.97813934144506]
We introduce a new representation learning framework, which considers the provision of counterfactual explanations as an embedded property of the risk model.
Our results suggest that our proposed framework has the potential to help researchers and clinicians improve personalised care.
arXiv Detail & Related papers (2022-05-15T09:41:16Z) - What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models regarding changes of the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
arXiv Detail & Related papers (2021-11-30T15:52:04Z) - A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict the CC diagnosis of an HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression [71.7560927415706]
The latent hybridisation model (LHM) integrates a system of expert-designed ODEs with machine-learned Neural ODEs to fully describe the dynamics of the system.
We evaluate LHM on synthetic data as well as real-world intensive care data of COVID-19 patients.
arXiv Detail & Related papers (2021-06-05T11:42:45Z) - Bayesian prognostic covariate adjustment [59.75318183140857]
Historical data about disease outcomes can be integrated into the analysis of clinical trials in many ways.
We build on existing literature that uses prognostic scores from a predictive model to increase the efficiency of treatment effect estimates.
arXiv Detail & Related papers (2020-12-24T05:19:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.