Multiply Robust Estimator Circumvents Hyperparameter Tuning of Neural
Network Models in Causal Inference
- URL: http://arxiv.org/abs/2307.10536v1
- Date: Thu, 20 Jul 2023 02:31:12 GMT
- Title: Multiply Robust Estimator Circumvents Hyperparameter Tuning of Neural
Network Models in Causal Inference
- Authors: Mehdi Rostami, Olli Saarela
- Abstract summary: The Multiply Robust (MR) estimator allows us to leverage all the first-step models in a single estimator.
We show that MR is the solution to a broad class of estimating equations, and is also consistent if one of the treatment models is $\sqrt{n}$-consistent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimation of the Average Treatment Effect (ATE) is often carried out in two
steps: in the first step, the treatment and outcome are modeled, and in
the second step, the predictions are inserted into the ATE estimator. In the
first step, numerous models can be fit to the treatment and outcome, including
machine learning algorithms. However, it is difficult to choose,
among the hyperparameter sets, the one that will result in the best causal effect
estimation and inference. The Multiply Robust (MR) estimator allows us to leverage
all the first-step models in a single estimator. We show that the MR estimator is
$n^r$-consistent if one of the first-step treatment or outcome models is
$n^r$-consistent. We also show that MR is the solution to a broad class of estimating
equations, and is asymptotically normal if one of the treatment models is
$\sqrt{n}$-consistent. The standard error of MR is also calculated, and it does
not require knowledge of the true models in the first step. Our simulation
study supports the theoretical findings.
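The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration only, not the paper's MR estimator: it fits a single treatment model (logistic regression) and a single outcome model per arm (linear regression) in step one, then plugs the predictions into the standard doubly robust (AIPW) estimator of the ATE in step two. The simulated data, the true effect of 2.0, the clipping bounds, and all function names are assumptions made for the example.

```python
import numpy as np


def add_intercept(X):
    """Prepend a column of ones to the design matrix."""
    return np.column_stack([np.ones(len(X)), X])


def fit_logistic(X, a, n_iter=25):
    """Step 1a: treatment model P(A=1 | X) via Newton-Raphson logistic regression."""
    Xd = add_intercept(X)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (a - p)
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])
        # Small ridge term keeps the Hessian invertible (an assumption of this sketch).
        beta = beta + np.linalg.solve(hess + 1e-8 * np.eye(len(beta)), grad)
    return beta


def predict_logistic(beta, X):
    return 1.0 / (1.0 + np.exp(-add_intercept(X) @ beta))


def fit_linear(X, y):
    """Step 1b: outcome model E[Y | X] within one treatment arm, by least squares."""
    coef, *_ = np.linalg.lstsq(add_intercept(X), y, rcond=None)
    return coef


def predict_linear(coef, X):
    return add_intercept(X) @ coef


def aipw_ate(y, a, ps, mu1, mu0, clip=(0.01, 0.99)):
    """Step 2: plug first-step predictions into the doubly robust (AIPW) estimator."""
    ps = np.clip(ps, *clip)  # guard against extreme propensity scores
    psi = mu1 - mu0 + a * (y - mu1) / ps - (1 - a) * (y - mu0) / (1 - ps)
    estimate = psi.mean()
    std_error = psi.std(ddof=1) / np.sqrt(len(y))
    return estimate, std_error


def simulate(n=5000, seed=0):
    """Hypothetical data-generating process with a true ATE of 2.0."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    p = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
    a = rng.binomial(1, p)
    y = 2.0 * a + X[:, 0] + X[:, 1] + rng.normal(size=n)
    return X, a, y


def two_step_ate(X, a, y):
    ps = predict_logistic(fit_logistic(X, a), X)
    mu1 = predict_linear(fit_linear(X[a == 1], y[a == 1]), X)
    mu0 = predict_linear(fit_linear(X[a == 0], y[a == 0]), X)
    return aipw_ate(y, a, ps, mu1, mu0)


if __name__ == "__main__":
    X, a, y = simulate()
    est, se = two_step_ate(X, a, y)
    print(f"ATE estimate: {est:.3f} (SE {se:.3f})")
```

In this sketch both first-step models are correctly specified, so the choice of model does not matter. The difficulty the abstract points to arises when many candidate models or hyperparameter sets are available in step one and it is unclear which pair to plug in.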
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- A Statistical Theory of Regularization-Based Continual Learning [10.899175512941053]
We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks.
We first derive the convergence rate for the oracle estimator obtained as if all data were available simultaneously.
A byproduct of our theoretical analysis is the equivalence between early stopping and generalized $\ell$-regularization.
arXiv Detail & Related papers (2024-06-10T12:25:13Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Toward Theoretical Guidance for Two Common Questions in Practical Cross-Validation based Hyperparameter Selection [72.76113104079678]
We show the first theoretical treatments of two common questions in cross-validation based hyperparameter selection.
We show that these generalizations can, respectively, always perform at least as well as always performing retraining or never performing retraining.
arXiv Detail & Related papers (2023-01-12T16:37:12Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the risk of the IPW-learner converges to zero if the propensity score is known.
arXiv Detail & Related papers (2022-02-10T18:51:52Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for bias-constrained estimators (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- The Bias-Variance Tradeoff of Doubly Robust Estimator with Targeted $L_1$ regularized Neural Networks Predictions [0.0]
The Doubly Robust (DR) estimation of ATE can be carried out in 2 steps, where in the first step, the treatment and outcome are modeled, and in the second step the predictions are inserted into the DR estimator.
The model misspecification in the first step has led researchers to utilize Machine Learning algorithms instead of parametric algorithms.
arXiv Detail & Related papers (2021-08-02T15:41:27Z)
- Higher-Order Orthogonal Causal Learning for Treatment Effect [15.652550362252205]
We present an algorithm that enables us to obtain the debiased estimator recovered from the score function.
We also conduct comprehensive experiments to test the power of the estimator we construct from the score function, using both simulated and real datasets.
arXiv Detail & Related papers (2021-03-22T14:04:13Z)
- Estimation in Tensor Ising Models [5.161531917413708]
We consider the problem of estimating the natural parameter of the $p$-tensor Ising model given a single sample from the distribution on $N$ nodes.
In particular, we show the $\sqrt{N}$-consistency of the MPL estimate in the $p$-spin Sherrington-Kirkpatrick (SK) model.
We derive the precise fluctuations of the MPL estimate in the special case of the $p$-tensor Curie-Weiss model.
arXiv Detail & Related papers (2020-08-29T00:06:58Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)