Multiply Robust Estimator Circumvents Hyperparameter Tuning of Neural
Network Models in Causal Inference
- URL: http://arxiv.org/abs/2307.10536v1
- Date: Thu, 20 Jul 2023 02:31:12 GMT
- Title: Multiply Robust Estimator Circumvents Hyperparameter Tuning of Neural
Network Models in Causal Inference
- Authors: Mehdi Rostami, Olli Saarela
- Abstract summary: The Multiply Robust (MR) estimator allows us to leverage all the first-step models in a single estimator.
We show that MR is the solution to a broad class of estimating equations, and is asymptotically normal if one of the treatment models is $\sqrt{n}$-consistent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimation of the Average Treatment Effect (ATE) is often carried out in
two steps: in the first step, the treatment and outcome are modeled, and in
the second step the predictions are inserted into the ATE estimator. In the
first step, numerous models can be fit to the treatment and outcome, including
machine learning algorithms. However, it is difficult to choose the
hyperparameter set that will result in the best causal effect estimation and
inference. The Multiply Robust (MR) estimator allows us to leverage all the
first-step models in a single estimator. We show that the MR estimator is
$n^r$-consistent if one of the first-step treatment or outcome models is
$n^r$-consistent. We also show that MR is the solution to a broad class of
estimating equations, and is asymptotically normal if one of the treatment
models is $\sqrt{n}$-consistent. The standard error of MR is also calculated,
and it does not require knowledge of the true models in the first step. Our
simulation study supports the theoretical findings.
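As a point of reference, the two-step pipeline the abstract describes can be sketched as below. This is a minimal illustration of the standard plug-in doubly robust (AIPW) estimator, not the MR estimator proposed in the paper; the 0/1 treatment coding and the choice of logistic regression and gradient boosting as first-step models are placeholder assumptions.

```python
# Minimal sketch of the two-step ATE pipeline described in the abstract.
# Step 1 fits treatment (propensity) and outcome models; step 2 plugs the
# predictions into the AIPW/doubly-robust estimator. This is NOT the MR
# estimator from the paper; model choices here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor

def aipw_ate(X, A, Y):
    """Two-step ATE: fit nuisance models, then plug into the AIPW estimator."""
    # Step 1a: treatment (propensity) model e(x) = P(A = 1 | X = x)
    e_hat = LogisticRegression(max_iter=1000).fit(X, A).predict_proba(X)[:, 1]
    e_hat = np.clip(e_hat, 0.01, 0.99)  # trim to avoid extreme inverse weights

    # Step 1b: outcome models mu_a(x) = E[Y | A = a, X = x], one per arm
    mu1 = GradientBoostingRegressor().fit(X[A == 1], Y[A == 1]).predict(X)
    mu0 = GradientBoostingRegressor().fit(X[A == 0], Y[A == 0]).predict(X)

    # Step 2: plug the first-step predictions into the AIPW estimating function
    psi = (mu1 - mu0
           + A * (Y - mu1) / e_hat
           - (1 - A) * (Y - mu0) / (1 - e_hat))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(Y))  # ATE estimate and SE
```

Every distinct learner or hyperparameter setting in step 1 yields a different estimate in step 2; per the abstract, the MR estimator sidesteps this selection problem by leveraging all candidate first-step models in a single estimator.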
Related papers
- A Statistical Theory of Regularization-Based Continual Learning [10.899175512941053]
We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks.
We first derive the convergence rate for the oracle estimator obtained as if all data were available simultaneously.
A byproduct of our theoretical analysis is the equivalence between early stopping and generalized $\ell_2$-regularization.
arXiv Detail & Related papers (2024-06-10T12:25:13Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Toward Theoretical Guidance for Two Common Questions in Practical
Cross-Validation based Hyperparameter Selection [72.76113104079678]
We provide the first theoretical treatments of two common questions in cross-validation based hyperparameter selection.
We show that these generalizations can, respectively, always perform at least as well as always performing retraining or never performing retraining.
arXiv Detail & Related papers (2023-01-12T16:37:12Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Benign-Overfitting in Conditional Average Treatment Effect Prediction
with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the IPW-learner converges in risk to zero if the propensity score is known.
arXiv Detail & Related papers (2022-02-10T18:51:52Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints, yielding the bias-constrained estimator (BCE).
A second motivation for BCE arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted
$L_1$ regularized Neural Networks Predictions [0.0]
The Doubly Robust (DR) estimation of the ATE can be carried out in two steps: in the first step, the treatment and outcome are modeled, and in the second step the predictions are inserted into the DR estimator.
The risk of model misspecification in the first step has led researchers to utilize machine learning algorithms instead of parametric models.
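For reference, a textbook form of the DR (AIPW) estimator into which the first-step predictions $\hat{e}(x)$ (propensity) and $\hat{\mu}_a(x)$ (outcome) are inserted is (notation assumed here, not taken from this paper):
$$\hat{\tau}_{\mathrm{DR}} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{\mu}_1(X_i) - \hat{\mu}_0(X_i) + \frac{A_i\,(Y_i - \hat{\mu}_1(X_i))}{\hat{e}(X_i)} - \frac{(1-A_i)\,(Y_i - \hat{\mu}_0(X_i))}{1 - \hat{e}(X_i)}\right]$$
which remains consistent if either the propensity model or the outcome model is correctly specified.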
arXiv Detail & Related papers (2021-08-02T15:41:27Z) - Higher-Order Orthogonal Causal Learning for Treatment Effect [15.652550362252205]
We present an algorithm that enables us to obtain the debiased estimator recovered from the score function.
We also conduct comprehensive experiments to test the power of the estimator constructed from the score function, using both simulated and real datasets.
arXiv Detail & Related papers (2021-03-22T14:04:13Z) - Estimation in Tensor Ising Models [5.161531917413708]
We consider the problem of estimating the natural parameter of the $p$-tensor Ising model given a single sample from the distribution on $N$ nodes.
In particular, we show the $\sqrt{N}$-consistency of the MPL estimate in the $p$-spin Sherrington-Kirkpatrick (SK) model.
We derive the precise fluctuations of the MPL estimate in the special case of the $p$-tensor Curie-Weiss model.
arXiv Detail & Related papers (2020-08-29T00:06:58Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators of the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
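A minimal sketch of the cross-fitting recipe this entry evaluates, under the usual K-fold scheme; the learners and fold count here are illustrative assumptions, not the exact estimators benchmarked in the cited study.

```python
# Two-fold cross-fitting sketch: fit nuisance models on one fold, evaluate
# the AIPW scores on the held-out fold, then swap, so the same observations
# are never used both to fit and to score a nuisance model.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor

def crossfit_aipw_ate(X, A, Y, n_splits=2, seed=0):
    """Cross-fit AIPW: nuisances are fit on folds disjoint from where they score."""
    psi = np.empty(len(Y), dtype=float)
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        # Fit nuisance models on the training fold only
        e = LogisticRegression(max_iter=1000).fit(X[train], A[train])
        e_hat = np.clip(e.predict_proba(X[test])[:, 1], 0.01, 0.99)
        m1 = GradientBoostingRegressor().fit(
            X[train][A[train] == 1], Y[train][A[train] == 1]).predict(X[test])
        m0 = GradientBoostingRegressor().fit(
            X[train][A[train] == 0], Y[train][A[train] == 0]).predict(X[test])
        # Evaluate the AIPW score on the held-out fold
        psi[test] = (m1 - m0
                     + A[test] * (Y[test] - m1) / e_hat
                     - (1 - A[test]) * (Y[test] - m0) / (1 - e_hat))
    return psi.mean()
```

Sample splitting of this kind is what lets flexible machine learning fits be plugged into the estimator without the overfitting bias that motivates the cited simulation study.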