A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding
- URL: http://arxiv.org/abs/2501.08888v2
- Date: Sun, 26 Jan 2025 02:30:20 GMT
- Title: A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding
- Authors: Chuan Zhou, Yaxuan Li, Chunyuan Zheng, Haiteng Zhang, Haoxuan Li, Mingming Gong,
- Abstract summary: Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics.
Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders.
Data collected from randomized controlled trials (RCT) do not suffer from confounding but are usually limited by a small sample size.
- Score: 44.874826691991565
- Abstract: Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders, whose existence cannot be tested from observational data and can invalidate any causal conclusion. In contrast, data collected from randomized controlled trials (RCT) do not suffer from confounding but are usually limited by a small sample size. To avoid overfitting caused by the small-scale RCT data, we propose a novel two-stage pretraining-finetuning (TSPF) framework with a partial parameter initialization strategy to estimate the CATE in the presence of hidden confounding. In the first stage, a foundational representation of covariates is trained to estimate counterfactual outcomes through large-scale observational data. In the second stage, we propose to train an augmented representation of the covariates, which is concatenated with the foundational representation obtained in the first stage to adjust for the hidden confounding. Rather than training a separate network from scratch, part of the prediction heads are initialized from the first stage. The superiority of our approach is validated on two datasets with extensive experiments.
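As a rough illustration of the two-stage idea described in the abstract, the sketch below (PyTorch) pretrains a foundational representation with per-treatment outcome heads on observational data, then fine-tunes an augmented, concatenated representation whose prediction heads are only partially initialized from stage one. The module names, layer sizes, and the two-head layout are illustrative assumptions, not the authors' exact TSPF architecture.
```python
import torch
import torch.nn as nn

D_X, D_REP, D_AUG = 25, 64, 16   # covariate / representation sizes (illustrative)

def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, d_out))

# ---- Stage 1: foundational representation + outcome heads, trained on observational data ----
phi = mlp(D_X, D_REP)                      # foundational representation of covariates
head0_obs = nn.Linear(D_REP, 1)            # potential-outcome head, t = 0
head1_obs = nn.Linear(D_REP, 1)            # potential-outcome head, t = 1
# ... fit (phi, head0_obs, head1_obs) by regressing factual outcomes on phi(x), split by treatment ...

# ---- Stage 2: augmented representation + partially initialized heads, fine-tuned on RCT data ----
psi = mlp(D_X, D_AUG)                      # augmented representation, adjusts for hidden confounding
head0 = nn.Linear(D_REP + D_AUG, 1)        # new heads take the concatenated representation
head1 = nn.Linear(D_REP + D_AUG, 1)

with torch.no_grad():                      # partial parameter initialization:
    for new, old in [(head0, head0_obs), (head1, head1_obs)]:
        new.weight[:, :D_REP] = old.weight # copy stage-1 weights for the foundational block;
        new.bias.copy_(old.bias)           # columns for the augmented block keep their random init

def predict_cate(x):
    z = torch.cat([phi(x), psi(x)], dim=-1)   # concatenated representation
    return head1(z) - head0(z)                # estimated CATE

x_rct = torch.randn(8, D_X)                   # toy RCT covariates
print(predict_cate(x_rct).shape)              # torch.Size([8, 1])
```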
Related papers
- Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences for a target experiment with unlabeled factual outcomes, which are retrieved by a predictive model fine-tuned on a similar, labeled experiment.
First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population.
We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- Combining Incomplete Observational and Randomized Data for Heterogeneous Treatment Effects [10.9134216137537]
Existing methods for integrating observational data with randomized data require complete observational data.
We propose a resilient approach to Combine Incomplete Observational data and randomized data for HTE estimation.
arXiv Detail & Related papers (2024-10-28T06:19:14Z)
- RCT Rejection Sampling for Causal Estimation Evaluation [25.845034753006367]
Confounding is a significant obstacle to unbiased estimation of causal effects from observational data.
We build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data.
We show our algorithm indeed results in low bias when oracle estimators are evaluated on confounded samples.
arXiv Detail & Related papers (2023-07-27T20:11:07Z)
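As a rough illustration of what rejection sampling from an RCT can look like, the sketch below subsamples a synthetic trial so that treatment assignment becomes covariate-dependent in the retained data, which biases the naive estimator while the ground-truth effect stays known. The logistic acceptance rule and all variable names are illustrative assumptions, not the paper's exact algorithm.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy RCT: covariate x, randomized treatment t, outcome y with true effect tau = 2
n = 20_000
x = rng.normal(size=n)
t = rng.binomial(1, 0.5, size=n)                 # randomized assignment
y = x + 2.0 * t + rng.normal(size=n)

# Target (confounded) propensity the subsample should exhibit -- an assumption
e = 1.0 / (1.0 + np.exp(-1.5 * x))

# Rejection step: accept treated units w.p. e(x), controls w.p. 1 - e(x)
accept = rng.uniform(size=n) < np.where(t == 1, e, 1.0 - e)
xs, ts, ys = x[accept], t[accept], y[accept]

# The naive difference in means is now biased, while the RCT ground truth is still known
naive = ys[ts == 1].mean() - ys[ts == 0].mean()
print(f"kept {accept.mean():.0%} of units, naive estimate = {naive:.2f} (truth = 2.00)")
```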
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
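A minimal sketch of the confidence-minimization idea: a standard cross-entropy objective on labeled data plus a penalty that pushes predictions on a separate "uncertainty" dataset toward the uniform distribution. The negative-entropy penalty, the weight `lam`, and the toy model are illustrative assumptions, not the paper's exact DCM objective.
```python
import torch
import torch.nn.functional as F
from torch import nn

def dcm_loss(model: nn.Module, x_train, y_train, x_uncertain, lam: float = 1.0):
    """Cross-entropy on labeled data + confidence penalty on the uncertainty set."""
    ce = F.cross_entropy(model(x_train), y_train)
    log_p = F.log_softmax(model(x_uncertain), dim=-1)
    # Mean negative entropy = sum_k p_k log p_k; minimizing it minimizes confidence
    # (equivalent, up to a constant, to KL(p || uniform))
    confidence = (log_p.exp() * log_p).sum(dim=-1).mean()
    return ce + lam * confidence

# Toy usage
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
x_tr, y_tr = torch.randn(64, 10), torch.randint(0, 3, (64,))
x_unc = torch.randn(32, 10)                 # e.g. out-of-distribution or held-out inputs
loss = dcm_loss(model, x_tr, y_tr, x_unc)
loss.backward()
print(float(loss))
```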
- Falsification before Extrapolation in Causal Effect Estimation [6.715453431174765]
Causal effects in populations are often estimated using observational datasets.
We propose a meta-algorithm that attempts to reject observational estimates that are biased.
arXiv Detail & Related papers (2022-09-27T21:47:23Z)
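A minimal sketch of the falsification idea: an observational estimator is screened against an RCT estimate of the same quantity and rejected when the two are statistically incompatible. The two-sample z-test and the toy numbers are illustrative assumptions, not the paper's meta-algorithm.
```python
import numpy as np
from scipy import stats

def falsify(obs_estimate, obs_se, rct_estimate, rct_se, alpha=0.05):
    """Reject (falsify) an observational estimator whose estimate of an
    RCT-identifiable quantity is statistically incompatible with the RCT."""
    z = (obs_estimate - rct_estimate) / np.sqrt(obs_se**2 + rct_se**2)
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))
    return p_value < alpha  # True -> falsified, do not extrapolate this estimator

# Toy usage: candidate observational estimators screened against an RCT benchmark
candidates = {"ipw": (1.9, 0.2), "naive": (3.4, 0.2)}   # (estimate, std. error)
rct_est, rct_se = 2.0, 0.3
kept = {name: est for name, (est, se) in candidates.items()
        if not falsify(est, se, rct_est, rct_se)}
print(kept)   # only estimators consistent with the RCT survive, e.g. {'ipw': 1.9}
```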
- Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, where a teacher model generates hard pseudo-labels on unlabeled data as supervisory signals.
We analyze the challenges these methods meet with the empirical experiment results.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
- Individual Treatment Effect Estimation Through Controlled Neural Network Training in Two Stages [0.757024681220677]
We develop a Causal-Deep Neural Network (CDNN) model trained in two stages to infer causal impact estimates at an individual unit level.
We observe that CDNN is highly competitive and often yields the most accurate individual treatment effect estimates.
arXiv Detail & Related papers (2022-01-21T06:34:52Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding [38.09565581056218]
We study the problem of learning conditional average treatment effects (CATE) from high-dimensional, observational data with unobserved confounders.
We present a new parametric interval estimator suited for high-dimensional data.
arXiv Detail & Related papers (2021-03-08T15:58:06Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides (including all content) and is not responsible for any consequences of its use.