CATE Estimation With Potential Outcome Imputation From Local Regression
- URL: http://arxiv.org/abs/2311.03630v2
- Date: Sat, 14 Jun 2025 01:32:21 GMT
- Title: CATE Estimation With Potential Outcome Imputation From Local Regression
- Authors: Ahmed Aloui, Juncheng Dong, Cat P. Le, Vahid Tarokh,
- Abstract summary: We propose a model-agnostic data augmentation method for Conditional Average Treatment Effect estimation.<n>Inspired by this idea, we propose a contrastive learning approach that reliably imputes missing potential outcomes.<n>We provide both theoretical guarantees and extensive numerical studies demonstrating the effectiveness of our approach.
- Score: 24.97657507206549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most significant challenges in Conditional Average Treatment Effect (CATE) estimation is the statistical discrepancy between distinct treatment groups. To address this issue, we propose a model-agnostic data augmentation method for CATE estimation. First, we derive regret bounds for general data augmentation methods suggesting that a small imputation error may be necessary for accurate CATE estimation. Inspired by this idea, we propose a contrastive learning approach that reliably imputes missing potential outcomes for a selected subset of individuals formed using a similarity measure. We augment the original dataset with these reliable imputations to reduce the discrepancy between different treatment groups while inducing minimal imputation error. The augmented dataset can subsequently be employed to train standard CATE estimation models. We provide both theoretical guarantees and extensive numerical studies demonstrating the effectiveness of our approach in improving the accuracy and robustness of numerous CATE estimation models.
Related papers
- Treatment Effect Estimation for Optimal Decision-Making [65.30942348196443]
We study optimal decision-making based on two-stage CATE estimators.<n>We propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance.
arXiv Detail & Related papers (2025-05-19T13:24:57Z) - PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units [7.8856737627153874]
We develop semiparametric efficient estimators for ATE in a setting where only a treatment group and an unlabeled group are observed.<n>Our results contribute to literature on causal inference with missing data and weakly supervised learning.
arXiv Detail & Related papers (2025-01-31T17:47:32Z) - Minimax Regret Estimation for Generalizing Heterogeneous Treatment Effects with Multisite Data [3.434624857389692]
We develop a robust CATE (conditional average treatment effect) estimation methodology with multisite data from heterogeneous populations.<n>We show that the resulting CATE model has an interpretable closed-form solution, expressed as a weighted average of site-specific CATE models.
arXiv Detail & Related papers (2024-12-15T10:00:07Z) - Dissecting Representation Misalignment in Contrastive Learning via Influence Function [15.28417468377201]
We introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss.
ECIF considers both positive and negative samples and provides a closed-form approximation of contrastive learning models.
Building upon ECIF, we develop a series of algorithms for data evaluation, misalignment detection, and misprediction trace-back tasks.
arXiv Detail & Related papers (2024-11-18T15:45:41Z) - Estimating Conditional Average Treatment Effects via Sufficient Representation Learning [31.822980052107496]
This paper proposes a novel neural network approach named textbfCrossNet to learn a sufficient representation for the features, based on which we then estimate the conditional average treatment effects (CATE)
Numerical simulations and empirical results demonstrate that our method outperforms the competitive approaches.
arXiv Detail & Related papers (2024-08-30T07:23:59Z) - Adaptive-TMLE for the Average Treatment Effect based on Randomized Controlled Trial Augmented with Real-World Data [0.0]
We consider the problem of estimating the average treatment effect (ATE) when both randomized control trial (RCT) data and external real-world data (RWD) are available.<n>We introduce an adaptive targeted maximum likelihood estimation framework to estimate them.
arXiv Detail & Related papers (2024-05-12T07:10:26Z) - Efficient adjustment for complex covariates: Gaining efficiency with
DOPE [56.537164957672715]
We propose a framework that accommodates adjustment for any subset of information expressed by the covariates.
Based on our theoretical results, we propose the Debiased Outcome-adapted Propensity Estorimator (DOPE) for efficient estimation of the average treatment effect (ATE)
Our results show that the DOPE provides an efficient and robust methodology for ATE estimation in various observational settings.
arXiv Detail & Related papers (2024-02-20T13:02:51Z) - Estimation of individual causal effects in network setup for multiple
treatments [4.53340898566495]
We study the problem of estimation of Individual Treatment Effects (ITE) in the context of multiple treatments and observational data.
We employ Graph Convolutional Networks (GCN) to learn a shared representation of the confounders.
Our approach utilizes separate neural networks to infer potential outcomes for each treatment.
arXiv Detail & Related papers (2023-12-18T06:07:45Z) - Targeted Machine Learning for Average Causal Effect Estimation Using the
Front-Door Functional [3.0232957374216953]
evaluating the average causal effect (ACE) of a treatment on an outcome often involves overcoming the challenges posed by confounding factors in observational studies.
Here, we introduce novel estimation strategies for the front-door criterion based on the targeted minimum loss-based estimation theory.
We demonstrate the applicability of these estimators to analyze the effect of early stage academic performance on future yearly income.
arXiv Detail & Related papers (2023-12-15T22:04:53Z) - B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under
Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z) - Generalization bounds and algorithms for estimating conditional average
treatment effect of dosage [13.867315751451494]
We investigate the task of estimating the conditional average causal effect of treatment-dosage pairs from a combination of observational data and assumptions on the causal relationships in the underlying system.
This has been a longstanding challenge for fields of study such as epidemiology or economics that require a treatment-dosage pair to make decisions.
We show empirically new state-of-the-art performance results across several benchmark datasets for this problem.
arXiv Detail & Related papers (2022-05-29T15:26:59Z) - Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z) - Doing Great at Estimating CATE? On the Neglected Assumptions in
Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading.
We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators.
We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - Multi-Source Causal Inference Using Control Variates [81.57072928775509]
We propose a general algorithm to estimate causal effects from emphmultiple data sources.
We show theoretically that this reduces the variance of the ATE estimate.
We apply this framework to inference from observational data under an outcome selection bias.
arXiv Detail & Related papers (2021-03-30T21:20:51Z) - Learning from Similarity-Confidence Data [94.94650350944377]
We investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data.
We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate.
arXiv Detail & Related papers (2021-02-13T07:31:16Z) - Weighting-Based Treatment Effect Estimation via Distribution Learning [14.438302755258547]
We develop a distribution learning-based weighting method for treatment effect estimation.
Our method outperforms several cutting-edge weighting-only benchmarking methods.
It maintains its advantage under a doubly-robust estimation framework.
arXiv Detail & Related papers (2020-12-26T20:15:44Z) - Increasing the efficiency of randomized trial estimates via linear
adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research.
Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z) - Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, i.e., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z) - Generalization Bounds and Representation Learning for Estimation of
Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.