Federated Causal Inference in Heterogeneous Observational Data
- URL: http://arxiv.org/abs/2107.11732v5
- Date: Sun, 2 Apr 2023 22:13:24 GMT
- Title: Federated Causal Inference in Heterogeneous Observational Data
- Authors: Ruoxuan Xiong, Allison Koenecke, Michael Powell, Zhu Shen, Joshua T. Vogelstein, Susan Athey
- Abstract summary: We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site.
Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms.
Motivated by these considerations, we develop federated methods to draw inference on the average treatment effects of combined data across sites.
- Score: 13.460660554484512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We are interested in estimating the effect of a treatment applied to
individuals at multiple sites, where data is stored locally for each site. Due
to privacy constraints, individual-level data cannot be shared across sites;
the sites may also have heterogeneous populations and treatment assignment
mechanisms. Motivated by these considerations, we develop federated methods to
draw inference on the average treatment effects of combined data across sites.
Our methods first compute summary statistics locally using propensity scores
and then aggregate these statistics across sites to obtain point and variance
estimators of average treatment effects. We show that these estimators are
consistent and asymptotically normal. To achieve these asymptotic properties,
we find that the aggregation schemes need to account for the heterogeneity in
treatment assignments and in outcomes across sites. We demonstrate the validity
of our federated methods through a comparative study of two large medical
claims databases.
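The two-step structure described in the abstract (local summary statistics from propensity scores, then aggregation across sites) can be illustrated with a minimal sketch. This is not the authors' estimator. It assumes a plain inverse propensity weighted score at each site, propensity scores that are already known locally, a treatment effect that is the same at every site, and inverse-variance weighting at the coordinator. The abstract notes that valid aggregation must account for heterogeneity across sites, which this simplification ignores. All function names and the simulated data are hypothetical.

```python
import math
import random


def local_summary(y, w, e):
    """Site-level step: return (ATE estimate, variance estimate, n).

    Only these three numbers leave the site; individual rows stay local.
    y: outcomes, w: 0/1 treatment indicators, e: propensity scores P(w=1 | x).
    """
    n = len(y)
    # Inverse propensity weighted score for each individual.
    scores = [wi * yi / ei - (1 - wi) * yi / (1 - ei)
              for yi, wi, ei in zip(y, w, e)]
    tau = sum(scores) / n
    # Sample variance of the scores, divided by n, estimates Var(tau).
    var = sum((s - tau) ** 2 for s in scores) / (n - 1) / n
    return tau, var, n


def aggregate(summaries):
    """Coordinator step: inverse-variance weighted combination.

    Assumption: every site shares the same true effect. This is a
    textbook pooling rule, not the aggregation scheme of the paper.
    """
    weights = [1.0 / var for _, var, _ in summaries]
    total = sum(weights)
    tau = sum(wt * t for wt, (t, _, _) in zip(weights, summaries)) / total
    return tau, 1.0 / total


def simulate_site(n, p_treat, effect, rng):
    """Hypothetical data generator with a constant propensity score."""
    w = [1 if rng.random() < p_treat else 0 for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) + effect * wi for wi in w]
    e = [p_treat] * n
    return y, w, e


rng = random.Random(0)
# Two sites with different treatment assignment rates, same true effect of 1.0.
sites = [simulate_site(2000, 0.3, 1.0, rng), simulate_site(3000, 0.6, 1.0, rng)]
summaries = [local_summary(*site) for site in sites]
tau_hat, var_hat = aggregate(summaries)
print(f"federated ATE estimate: {tau_hat:.3f} (se {math.sqrt(var_hat):.3f})")
```

If effects differ across sites, weighting site estimates by sample size rather than by inverse variance would target the average effect in the combined population, at some cost in precision.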
Related papers
- Estimating Individual Dose-Response Curves under Unobserved Confounders from Observational Data [6.166869525631879]
We present ContiVAE, a novel framework for estimating causal effects of continuous treatments, measured by individual dose-response curves.
We show that ContiVAE outperforms existing methods by up to 62%, demonstrating its robustness and flexibility.
arXiv Detail & Related papers (2024-10-21T07:24:26Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Counterfactual Data Augmentation with Contrastive Learning [27.28511396131235]
We introduce a model-agnostic data augmentation method that imputes the counterfactual outcomes for a selected subset of individuals.
We use contrastive learning to learn a representation space and a similarity measure such that, in the learned representation space, individuals identified as close by the learned similarity measure have similar potential outcomes.
This property ensures reliable imputation of counterfactual outcomes for the individuals with close neighbors from the alternative treatment group.
arXiv Detail & Related papers (2023-11-07T00:36:51Z)
- Multiply Robust Federated Estimation of Targeted Average Treatment Effects [0.0]
We propose a novel approach to derive valid causal inferences for a target population using multi-site data.
Our methodology incorporates transfer learning to estimate ensemble weights to combine information from source sites.
arXiv Detail & Related papers (2023-09-22T03:15:08Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
- A Tree-based Federated Learning Approach for Personalized Treatment Effect Estimation from Heterogeneous Data Sources [5.049057348282933]
Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks.
We develop an efficient and interpretable tree-based ensemble of personalized treatment effect estimators to join results across hospital sites.
arXiv Detail & Related papers (2021-03-10T18:51:30Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work focuses on both identifying risk factors and predicting healthcare-associated infections in intensive-care units.
The aim is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
- Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.