Intervention Efficiency and Perturbation Validation Framework: Capacity-Aware and Robust Clinical Model Selection under the Rashomon Effect
- URL: http://arxiv.org/abs/2511.14317v2
- Date: Sun, 23 Nov 2025 17:59:46 GMT
- Title: Intervention Efficiency and Perturbation Validation Framework: Capacity-Aware and Robust Clinical Model Selection under the Rashomon Effect
- Authors: Yuwen Zhang, Viet Tran, Paul Weng
- Abstract summary: The coexistence of multiple models with comparable performance poses fundamental challenges for trustworthy deployment and evaluation. We propose two complementary tools for robust model assessment and selection: Intervention Efficiency (IE) and the Perturbation Validation Framework (PVF). IE is a capacity-aware metric that quantifies how efficiently a model identifies actionable true positives when only limited interventions are feasible. PVF introduces a structured approach to assess the stability of models under data perturbations, identifying models whose performance remains most invariant across noisy or shifted validation sets.
- Score: 8.16102315566872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In clinical machine learning, the coexistence of multiple models with comparable performance -- a manifestation of the Rashomon Effect -- poses fundamental challenges for trustworthy deployment and evaluation. Small, imbalanced, and noisy datasets, coupled with high-dimensional and weakly identified clinical features, amplify this multiplicity and make conventional validation schemes unreliable. As a result, selecting among equally performing models becomes uncertain, particularly when resource constraints and operational priorities are not considered by conventional metrics like F1 score. To address these issues, we propose two complementary tools for robust model assessment and selection: Intervention Efficiency (IE) and the Perturbation Validation Framework (PVF). IE is a capacity-aware metric that quantifies how efficiently a model identifies actionable true positives when only limited interventions are feasible, thereby linking predictive performance with clinical utility. PVF introduces a structured approach to assess the stability of models under data perturbations, identifying models whose performance remains most invariant across noisy or shifted validation sets. Empirical results on synthetic and real-world healthcare datasets show that using these tools facilitates the selection of models that generalize more robustly and align with capacity constraints, offering a new direction for tackling the Rashomon Effect in clinical settings.
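The abstract describes IE and PVF only informally. As a hedged illustration of the general ideas, not the authors' exact definitions, IE could be sketched as the fraction of true positives among the top-K risk-ranked cases a clinic has capacity to act on, and PVF-style stability as the spread of a metric re-evaluated across noise-perturbed copies of the validation scores. All function names, formulas, and parameters below are assumptions for illustration:

```python
import random

def intervention_efficiency(scores, labels, capacity):
    """Sketch of a capacity-aware metric: of the `capacity` highest-scoring
    cases we can intervene on, what fraction are true positives?
    (Illustrative only; the paper's exact IE definition may differ.)"""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top = ranked[:capacity]
    return sum(y for _, y in top) / capacity

def perturbation_stability(metric, scores, labels, n_perturb=20, noise=0.05, seed=0):
    """Sketch of PVF-style validation: re-evaluate a metric on noise-perturbed
    copies of the validation scores and report (mean, spread). A model whose
    spread is small is more invariant under perturbation."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_perturb):
        noisy = [s + rng.gauss(0.0, noise) for s in scores]
        values.append(metric(noisy, labels))
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    return mean, spread

# Toy example: a model ranks 6 patients, but there is capacity for 3 interventions.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(intervention_efficiency(scores, labels, capacity=3))  # 2 of top 3 are positive -> 2/3
```

Under this reading, two models with identical F1 can differ sharply in IE when capacity is tight, and PVF-style spread would break the remaining tie by preferring the model least sensitive to validation-set noise.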
Related papers
- Diagnostics for Individual-Level Prediction Instability in Machine Learning for Healthcare [0.0]
We propose an evaluation framework that quantifies individual-level prediction instability by using two complementary diagnostics. We apply these diagnostics to simulated data and the GUSTO-I clinical dataset.
arXiv Detail & Related papers (2026-02-27T03:42:28Z) - A Comparative Study of Controllability, Explainability, and Performance in Dysfluency Detection Models [6.837099592935974]
We compare four dysfluency modeling approaches: YOLO-Stutter, FluentNet, UDM, and SSDM. YOLO-Stutter and FluentNet provide efficiency and simplicity, but with limited transparency. UDM achieves the best balance of accuracy and clinical interpretability.
arXiv Detail & Related papers (2025-08-25T14:23:09Z) - Beyond the ATE: Interpretable Modelling of Treatment Effects over Dose and Time [46.2482873419289]
We propose a framework for modelling treatment effect trajectories as smooth surfaces over dose and time. Our approach decouples the estimation of trajectory shape from the specification of clinically relevant properties. We show that our method yields accurate, interpretable, and editable models of treatment dynamics.
arXiv Detail & Related papers (2025-07-09T20:33:33Z) - Improving Omics-Based Classification: The Role of Feature Selection and Synthetic Data Generation [0.18846515534317262]
This study presents a machine learning based classification framework that integrates feature selection with data augmentation techniques. We show that the proposed pipeline yields strong cross-validated performance on small datasets.
arXiv Detail & Related papers (2025-05-06T10:09:50Z) - MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability [11.997050225896679]
We propose a unified optimization framework that directly solves the primal constrained optimization problem to identify optimal subgroups. Our key innovation is a reformulation of the constrained primal problem as an unconstrained differentiable min-max objective, solved via a gradient descent-ascent algorithm. The framework is model-agnostic, compatible with a wide range of CATE estimators, and amenable to additional constraints like cost limits or fairness criteria.
arXiv Detail & Related papers (2025-04-29T16:25:23Z) - Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation [0.3362278589492841]
Existing model evaluation approaches often rely on real-world datasets, which are limited in availability, embed confounding biases, and lack the flexibility needed for systematic experimentation. We propose a novel structured synthetic data framework designed for controlled benchmarking of model robustness, fairness, and generalisability.
arXiv Detail & Related papers (2025-04-29T11:04:28Z) - TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - Decomposed Adversarial Learned Inference [118.27187231452852]
We propose a novel approach, Decomposed Adversarial Learned Inference (DALI).
DALI explicitly matches prior and conditional distributions in both data and code spaces.
We validate the effectiveness of DALI on the MNIST, CIFAR-10, and CelebA datasets.
arXiv Detail & Related papers (2020-04-21T20:00:35Z) - Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks [103.14809802212535]
We build on the generative adversarial networks (GANs) framework to address the problem of estimating the effect of continuous-valued interventions.
Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions.
To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator.
arXiv Detail & Related papers (2020-02-27T18:46:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.