Deep Learning Methods for the Noniterative Conditional Expectation G-Formula for Causal Inference from Complex Observational Data
- URL: http://arxiv.org/abs/2410.21531v1
- Date: Mon, 28 Oct 2024 21:00:46 GMT
- Title: Deep Learning Methods for the Noniterative Conditional Expectation G-Formula for Causal Inference from Complex Observational Data
- Authors: Sophia M Rein, Jing Li, Miguel Hernan, Andrew Beam
- Abstract summary: The g-formula can be used to estimate causal effects of sustained treatment strategies using observational data.
Parametric models are subject to model misspecification, which may result in biased causal estimates.
We propose a unified deep learning framework for the NICE g-formula estimator.
- Score: 3.0958655016140892
- License:
- Abstract: The g-formula can be used to estimate causal effects of sustained treatment strategies using observational data under the identifying assumptions of consistency, positivity, and exchangeability. The non-iterative conditional expectation (NICE) estimator of the g-formula also requires correct estimation of the conditional distribution of the time-varying treatment, confounders, and outcome. Parametric models, which have been traditionally used for this purpose, are subject to model misspecification, which may result in biased causal estimates. Here, we propose a unified deep learning framework for the NICE g-formula estimator that uses multitask recurrent neural networks for estimation of the joint conditional distributions. Using simulated data, we evaluated our model's bias and compared it with that of the parametric g-formula estimator. We found lower bias in the estimates of the causal effect of sustained treatment strategies on a survival outcome when using the deep learning estimator compared with the parametric NICE estimator in settings with simple and complex temporal dependencies between covariates. These findings suggest that our Deep Learning g-formula estimator may be less sensitive to model misspecification than the classical parametric NICE estimator when estimating the causal effect of sustained treatment strategies from complex observational data.
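To make the estimation pipeline concrete, here is a minimal, hypothetical sketch of the two ingredients the abstract describes: a multitask recurrent network that shares a recurrent body across heads for the time-varying confounder, treatment, and outcome, and the Monte Carlo simulation step of the NICE g-formula under a sustained treatment strategy. All names, layer sizes, the single-confounder setup, and the simplified conditioning order are illustrative assumptions, not the authors' implementation.
```python
# Minimal sketch (not the authors' code): a multitask RNN with a shared LSTM body
# and separate heads for the joint conditional distributions of a continuous
# confounder L_t, a binary treatment A_t, and a binary event indicator Y_t.
import torch
import torch.nn as nn

class MultitaskHistoryRNN(nn.Module):
    def __init__(self, input_dim=3, hidden_dim=32):
        super().__init__()
        self.rnn = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.covariate_head = nn.Linear(hidden_dim, 2)   # mean and log-variance of L_{t+1}
        self.treatment_head = nn.Linear(hidden_dim, 1)   # logit of A_{t+1}
        self.outcome_head = nn.Linear(hidden_dim, 1)     # logit of the discrete-time hazard

    def forward(self, history):
        # history: (batch, time, [L_t, A_t, Y_t])
        h, _ = self.rnn(history)
        last = h[:, -1, :]
        mu, log_var = self.covariate_head(last).chunk(2, dim=-1)
        return mu, log_var, self.treatment_head(last), self.outcome_head(last)

def nice_g_formula_risk(model, baseline, horizon, strategy, n_mc=1000):
    """Monte Carlo step of the NICE g-formula: simulate covariates forward under a
    sustained treatment strategy and accumulate the cumulative event risk."""
    with torch.no_grad():
        hist = baseline.repeat(n_mc, 1, 1)                 # n_mc copies of the baseline history
        surv = torch.ones(n_mc)
        risk = torch.zeros(n_mc)
        for t in range(horizon):
            mu, log_var, _, y_logit = model(hist)          # treatment head ignored: A is intervened on
            hazard = torch.sigmoid(y_logit).squeeze(-1)
            risk += surv * hazard                          # discrete-time risk accumulation
            surv *= 1 - hazard
            l_next = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # draw L_{t+1}
            a_next = torch.full_like(l_next, float(strategy(t)))           # set A_{t+1} by strategy
            step = torch.cat([l_next, a_next, torch.zeros_like(l_next)], dim=-1)
            hist = torch.cat([hist, step.unsqueeze(1)], dim=1)
        return risk.mean().item()

# Example: contrast "always treat" with "never treat" from a zero baseline history
# (the model is untrained here; in practice it would first be fit by maximum likelihood).
model = MultitaskHistoryRNN()
baseline = torch.zeros(1, 1, 3)
risk_treat = nice_g_formula_risk(model, baseline, horizon=5, strategy=lambda t: 1)
risk_never = nice_g_formula_risk(model, baseline, horizon=5, strategy=lambda t: 0)
print(risk_treat - risk_never)
```
The multitask structure lies in the shared LSTM body: all conditional distributions are learned from a common representation of the history, while the simulation step replaces the learned treatment distribution with the strategy being evaluated.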
Related papers
- Inference at the data's edge: Gaussian processes for modeling and inference under model-dependency, poor overlap, and extrapolation [0.0]
The Gaussian Process (GP) is a flexible non-linear regression approach.
It provides a principled approach to handling our uncertainty over predicted (counterfactual) values.
This is especially valuable under conditions of extrapolation or weak overlap.
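As a rough illustration of why GP uncertainty is useful under extrapolation (a generic scikit-learn sketch, not the paper's implementation), the posterior predictive standard deviation grows where the covariate support is thin:
```python
# Illustrative sketch: the GP posterior standard deviation widens where covariate
# support is thin, flagging counterfactual predictions that rely on extrapolation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_obs = rng.uniform(0, 1, size=(100, 1))                 # covariate support is [0, 1]
y_obs = np.sin(4 * X_obs[:, 0]) + 0.1 * rng.normal(size=100)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_obs, y_obs)

X_new = np.array([[0.5], [1.5]])                         # in-support vs. extrapolated point
mean, std = gp.predict(X_new, return_std=True)
print(std)                                               # uncertainty is far larger at x = 1.5
```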
arXiv Detail & Related papers (2024-07-15T05:09:50Z)
- Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data [8.619243141968886]
We present an inference framework for estimating regression coefficients in conditional mean models.
We develop an augmented inverse probability weighted (AIPW) method, employing regularized estimators for both propensity score (PS) and outcome regression (OR) models.
Our theoretical findings are verified through extensive simulation studies and a real-world data application.
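A minimal sketch of the general AIPW construction with regularized nuisance models, assuming lasso-type learners and a simple trimming rule; the paper's regularized estimators and inference theory are more refined than this:
```python
# Minimal AIPW sketch with lasso-type nuisance models; estimates E[Y(1)].
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

def aipw_mean_treated(X, A, Y):
    ps_model = LogisticRegressionCV(penalty="l1", solver="saga", max_iter=5000).fit(X, A)
    or_model = LassoCV().fit(X[A == 1], Y[A == 1])
    ps = np.clip(ps_model.predict_proba(X)[:, 1], 0.01, 0.99)  # crude positivity trimming
    mu1 = or_model.predict(X)
    # AIPW: outcome-regression prediction plus inverse-probability-weighted residual
    return np.mean(mu1 + A * (Y - mu1) / ps)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + 2 * A + rng.normal(size=500)
print(aipw_mean_treated(X, A, Y))                        # estimates E[Y(1)]
```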
arXiv Detail & Related papers (2024-06-20T00:34:54Z)
- Efficient adjustment for complex covariates: Gaining efficiency with DOPE [56.537164957672715]
We propose a framework that accommodates adjustment for any subset of information expressed by the covariates.
Based on our theoretical results, we propose the Debiased Outcome-adapted Propensity Estimator (DOPE) for efficient estimation of the average treatment effect (ATE).
Our results show that the DOPE provides an efficient and robust methodology for ATE estimation in various observational settings.
arXiv Detail & Related papers (2024-02-20T13:02:51Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Bayesian Cramér-Rao Bound Estimation with Score-Based Models [3.4480437706804503]
The Bayesian Cramér-Rao bound (CRB) provides a lower bound on the mean square error of any Bayesian estimator under mild regularity conditions.
This work introduces a new data-driven estimator for the CRB using score matching.
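For reference, the standard form of the Bayesian CRB (Van Trees inequality) is given below; the bound is governed by the Bayesian information matrix, an expected outer product of the score of the joint density, which is why score matching is a natural route to estimating it. The notation is generic and not taken from the paper.
```latex
% Standard form of the Bayesian Cramér-Rao (Van Trees) bound; notation is generic.
\mathbb{E}\!\left[(\hat{\theta}(x)-\theta)(\hat{\theta}(x)-\theta)^{\top}\right] \succeq J_B^{-1},
\qquad
J_B = \mathbb{E}_{p(x,\theta)}\!\left[\nabla_{\theta}\log p(x,\theta)\,\nabla_{\theta}\log p(x,\theta)^{\top}\right]
```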
arXiv Detail & Related papers (2023-09-28T00:22:21Z)
- Modeling High-Dimensional Data with Unknown Cut Points: A Fusion Penalized Logistic Threshold Regression [2.520538806201793]
In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors.
We consider a threshold model in which all continuous features are discretized into ordinal levels, which further determine the binary responses.
We find that the lasso model is well suited to the problem of early detection and prediction of chronic diseases such as diabetes.
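A rough sketch of the general idea, using quantile bins as stand-ins for unknown cut points and a plain L1 penalty rather than the paper's fusion penalty:
```python
# Not the paper's fusion-penalized method: discretize continuous predictors into
# quantile bins and fit an L1-penalized logistic regression on the indicators.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 5))
y = (X[:, 0] > 0.5).astype(int)                          # response driven by a hidden cut point

bins = KBinsDiscretizer(n_bins=8, encode="onehot-dense", strategy="quantile")
X_bins = bins.fit_transform(X)

lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
lasso_logit.fit(X_bins, y)
print((np.abs(lasso_logit.coef_) > 1e-8).sum(), "active bin indicators")
```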
arXiv Detail & Related papers (2022-02-17T04:16:40Z)
- Dimension-Free Average Treatment Effect Inference with Deep Neural Networks [6.704751710867747]
This paper investigates the estimation and inference of the average treatment effect (ATE) using deep neural networks (DNNs) in the potential outcomes framework.
We show that both DNN estimates of ATE are consistent with dimension-free consistency rates under some assumptions on the underlying true mean regression model.
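As a simplified illustration (a plug-in sketch only; the paper's estimators and their dimension-free theory go further), a DNN outcome regression can be turned into an ATE estimate by averaging the difference of predicted potential outcomes:
```python
# Illustrative plug-in sketch: fit a DNN outcome regression m(a, x) and average
# the difference of its predicted potential outcomes.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
A = rng.binomial(1, 0.5, size=1000)
Y = X[:, 0] + 1.5 * A + rng.normal(size=1000)            # true ATE is 1.5

dnn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
dnn.fit(np.column_stack([A, X]), Y)

ate_hat = np.mean(
    dnn.predict(np.column_stack([np.ones(len(X)), X]))
    - dnn.predict(np.column_stack([np.zeros(len(X)), X]))
)
print(ate_hat)
```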
arXiv Detail & Related papers (2021-12-02T19:28:37Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
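One hedged way to picture "deep learning with bias constraints" is a loss that penalizes the squared empirical bias alongside the usual MSE; the paper's exact constraint and training scheme may differ, and lambda_bias below is an illustrative knob:
```python
# Hypothetical bias-constrained loss: ordinary MSE plus a penalty on the squared
# empirical bias over a batch (a simplification of the paper's constraint).
import torch

def bias_constrained_loss(pred, target, lambda_bias=10.0):
    mse = torch.mean((pred - target) ** 2)
    bias_sq = torch.mean(pred - target) ** 2             # squared mean error over the batch
    return mse + lambda_bias * bias_sq

pred = torch.randn(32, requires_grad=True)               # stand-in for network outputs
target = torch.randn(32)
bias_constrained_loss(pred, target).backward()
```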
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
- Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)
- Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
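A schematic two-fold cross-fit AIPW estimator of the ACE, with gradient boosting standing in for the machine learning nuisance models used in the study; learners, trimming, and fold counts here are placeholder choices:
```python
# Schematic two-fold cross-fit AIPW: nuisance models are fit on one half,
# evaluated on the other, and the fold-specific estimates are averaged.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import KFold

def crossfit_aipw_ace(X, A, Y, n_splits=2, seed=0):
    estimates = []
    for train, test in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        ps = GradientBoostingClassifier().fit(X[train], A[train]).predict_proba(X[test])[:, 1]
        ps = np.clip(ps, 0.01, 0.99)
        m1 = GradientBoostingRegressor().fit(X[train][A[train] == 1], Y[train][A[train] == 1]).predict(X[test])
        m0 = GradientBoostingRegressor().fit(X[train][A[train] == 0], Y[train][A[train] == 0]).predict(X[test])
        a, y = A[test], Y[test]
        psi = (m1 - m0) + a * (y - m1) / ps - (1 - a) * (y - m0) / (1 - ps)
        estimates.append(psi.mean())
    return float(np.mean(estimates))

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 4))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = X[:, 0] + A + rng.normal(size=1000)
print(crossfit_aipw_ace(X, A, Y))                        # true ACE is 1
```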
arXiv Detail & Related papers (2020-04-21T23:09:55Z)