Overlap-weighted orthogonal meta-learner for treatment effect estimation over time
- URL: http://arxiv.org/abs/2510.19643v1
- Date: Wed, 22 Oct 2025 14:47:57 GMT
- Title: Overlap-weighted orthogonal meta-learner for treatment effect estimation over time
- Authors: Konstantin Hess, Dennis Frauen, Mihaela van der Schaar, Stefan Feuerriegel,
- Abstract summary: We introduce a novel overlap-weighted meta-learner for estimating heterogeneous treatment effects (HTEs)<n>Our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions.<n>We show that our WO-learner is fully model-agnostic and can be applied to any machine learning model.
- Score: 90.46786193198744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating heterogeneous treatment effects (HTEs) in time-varying settings is particularly challenging, as the probability of observing certain treatment sequences decreases exponentially with longer prediction horizons. Thus, the observed data contain little support for many plausible treatment sequences, which creates severe overlap problems. Existing meta-learners for the time-varying setting typically assume adequate treatment overlap, and thus suffer from exploding estimation variance when the overlap is low. To address this problem, we introduce a novel overlap-weighted orthogonal (WO) meta-learner for estimating HTEs that targets regions in the observed data with high probability of receiving the interventional treatment sequences. This offers a fully data-driven approach through which our WO-learner can counteract instabilities as in existing meta-learners and thus obtain more reliable HTE estimates. Methodologically, we develop a novel Neyman-orthogonal population risk function that minimizes the overlap-weighted oracle risk. We show that our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions. Further, our WO-learner is fully model-agnostic and can be applied to any machine learning model. Through extensive experiments with both transformer and LSTM backbones, we demonstrate the benefits of our novel WO-learner.
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide training examples for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.<n>We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk.<n>We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Multi-CATE: Multi-Accurate Conditional Average Treatment Effect Estimation Robust to Unknown Covariate Shifts [12.289361708127876]
We use methodology for learning multi-accurate predictors to post-process CATE T-learners.
We show how this approach can combine (large) confounded observational and (smaller) randomized datasets.
arXiv Detail & Related papers (2024-05-28T14:12:25Z) - Neural parameter calibration and uncertainty quantification for epidemic
forecasting [0.0]
We apply a novel and powerful computational method to the problem of learning probability densities on contagion parameters.
Using a neural network, we calibrate an ODE model to data of the spread of COVID-19 in Berlin in 2020.
We show convergence of our method to the true posterior on a simplified SIR model of epidemics, and also demonstrate our method's learning capabilities on a reduced dataset.
arXiv Detail & Related papers (2023-12-05T21:34:59Z) - B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under
Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z) - Proximal Causal Learning of Conditional Average Treatment Effects [0.0]
We propose a tailored two-stage loss function for learning heterogeneous treatment effects.
Our proposed estimator can be implemented by off-the-shelf loss-minimizing machine learning methods.
arXiv Detail & Related papers (2023-01-26T02:56:36Z) - Comparison of meta-learners for estimating multi-valued treatment
heterogeneous effects [2.294014185517203]
Conditional Average Treatment Effects (CATE) estimation is one of the main challenges in causal inference with observational data.
Nonparametric estimators called meta-learners have been developed to estimate the CATE with the main advantage of not restraining the estimation to a specific supervised learning method.
This paper looks into meta-learners for estimating the heterogeneous effects of multi-valued treatments.
arXiv Detail & Related papers (2022-05-29T16:46:21Z) - SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event
Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance of guided gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Estimating heterogeneous survival treatment effect in observational data
using machine learning [9.951103976634407]
Methods for estimating heterogeneous treatment effect in observational data have largely focused on continuous or binary outcomes.
Using flexible machine learning methods in the counterfactual framework is a promising approach to address challenges due to complex individual characteristics.
arXiv Detail & Related papers (2020-08-17T01:02:14Z) - Clinical Risk Prediction with Temporal Probabilistic Asymmetric
Multi-Task Learning [80.66108902283388]
Multi-task learning methods should be used with caution for safety-critical applications, such as clinical risk prediction.
Existing asymmetric multi-task learning methods tackle this negative transfer problem by performing knowledge transfer from tasks with low loss to tasks with high loss.
We propose a novel temporal asymmetric multi-task learning model that performs knowledge transfer from certain tasks/timesteps to relevant uncertain tasks, based on feature-level uncertainty.
arXiv Detail & Related papers (2020-06-23T06:01:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.