Causal Machine Learning Methods for Estimating Personalised Treatment Effects -- Insights on validity from two large trials
- URL: http://arxiv.org/abs/2501.04061v1
- Date: Tue, 07 Jan 2025 09:44:05 GMT
- Title: Causal Machine Learning Methods for Estimating Personalised Treatment Effects -- Insights on validity from two large trials
- Authors: Hongruyu Chen, Helena Aebersold, Milo Alan Puhan, Miquel Serra-Burriel
- Abstract summary: Causal machine learning (ML) methods hold great promise for advancing precision medicine.
In this study, we assessed the internal and external validity of 17 mainstream causal ML methods.
- Abstract: Causal machine learning (ML) methods hold great promise for advancing precision medicine by estimating personalized treatment effects. However, their reliability remains largely unvalidated in empirical settings. In this study, we assessed the internal and external validity of 17 mainstream causal heterogeneity ML methods -- including metalearners, tree-based methods, and deep learning methods -- using data from two large randomized controlled trials: the International Stroke Trial (N=19,435) and the Chinese Acute Stroke Trial (N=21,106). Our findings reveal that none of the ML methods could be reliably validated, either internally or externally: all showed significant discrepancies between training and test data on the proposed evaluation metrics. The individualized treatment effects estimated from training data failed to generalize to the test data, even in the absence of distribution shifts. These results raise concerns about the current applicability of causal ML models in precision medicine and highlight the need for more robust validation techniques to ensure generalizability.
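As a rough illustration of the kind of train/test discrepancy the abstract describes, the sketch below fits a very simple metalearner (a T-learner whose base learners are per-stratum outcome means) on two independent samples from a toy simulated trial and compares the resulting effect estimates. This is not the authors' evaluation procedure or one of their metrics. The data-generating process, the constant true effect of 0.5, and all function names are assumptions made for the example.

```python
import random


def simulate_trial(n, seed):
    """Toy randomized trial (hypothetical): one categorical covariate with
    10 strata, 1:1 randomized treatment, and the same true effect (0.5)
    for every participant."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randrange(10)   # stratum label 0..9
        t = rng.randrange(2)    # randomized treatment indicator
        y = 0.1 * x + 0.5 * t + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def stratum_t_learner(rows):
    """T-learner with stratum means as base learners:
    CATE(x) = mean(Y | T=1, X=x) - mean(Y | T=0, X=x)."""
    totals = {}
    for x, t, y in rows:
        cell = totals.setdefault((x, t), [0.0, 0])
        cell[0] += y
        cell[1] += 1
    cate = {}
    for x in sorted({x for x, _, _ in rows}):
        if (x, 0) in totals and (x, 1) in totals:
            treated = totals[(x, 1)][0] / totals[(x, 1)][1]
            control = totals[(x, 0)][0] / totals[(x, 0)][1]
            cate[x] = treated - control
    return cate


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


# Estimate stratum-level effects separately on a "training" and a "test" sample.
train_cate = stratum_t_learner(simulate_trial(2000, seed=1))
test_cate = stratum_t_learner(simulate_trial(2000, seed=2))
shared = sorted(set(train_cate) & set(test_cate))
corr = pearson([train_cate[x] for x in shared], [test_cate[x] for x in shared])

for x in shared:
    print(f"stratum {x}: train={train_cate[x]:+.2f}  test={test_cate[x]:+.2f}")
print(f"train/test correlation of estimated effects: {corr:+.2f}")
```

Because the simulated effect is identical in every stratum, any spread in the estimated effects is sampling noise, so there is no real heterogeneity for the two samples to agree on. A real assessment would use the trial data and the evaluation metrics defined in the paper.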
Related papers
- ML-assisted Randomization Tests for Detecting Treatment Effects in A/B Experiments [3.79377147545355]
In this paper, we construct randomization tests for complex treatment effects.
A key feature of our approach is the use of flexible machine learning (ML) models.
This approach combines the predictive power of modern ML tools with the finite-sample validity of randomization procedures.
arXiv Detail & Related papers (2025-01-13T22:14:58Z)
- Measuring Variable Importance in Heterogeneous Treatment Effects with Confidence [33.12963161545068]
Causal machine learning holds promise for estimating individual treatment effects from complex data.
We propose PermuCATE, an algorithm based on the Conditional Permutation Importance (CPI) method.
We empirically demonstrate the benefits of PermuCATE in simulated and real-world health datasets.
arXiv Detail & Related papers (2024-08-23T11:44:07Z)
- Estimating Causal Effects with Double Machine Learning -- A Method Evaluation [5.904095466127043]
We review one of the most prominent methods, "double/debiased machine learning" (DML).
Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships.
When estimating the effects of air pollution on housing prices, we find that DML estimates are consistently larger than estimates of less flexible methods.
arXiv Detail & Related papers (2024-03-21T13:21:33Z)
- Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients [7.639610349097473]
We introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints.
We measure the distance of corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term "trust-scores".
We show an AUROC of 0.865 and a precision of 0.922, which surpass those of conventional ML models without such projections.
arXiv Detail & Related papers (2023-08-21T15:14:49Z)
- A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324]
We propose a double machine learning approach to combine experimental and observational studies.
Our framework tests for violations of external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
- B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects [2.294014185517203]
Conditional Average Treatment Effects (CATE) estimation is one of the main challenges in causal inference with observational data.
Nonparametric estimators called meta-learners have been developed to estimate the CATE; their main advantage is that they do not restrict the estimation to a specific supervised learning method.
This paper looks into meta-learners for estimating the heterogeneous effects of multi-valued treatments.
arXiv Detail & Related papers (2022-05-29T16:46:21Z)
- Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, to enable human experts to analyze the validity of policy evaluation estimates.
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.