Federated Causal Inference: Multi-Centric ATE Estimation beyond Meta-Analysis
- URL: http://arxiv.org/abs/2410.16870v1
- Date: Tue, 22 Oct 2024 10:19:17 GMT
- Title: Federated Causal Inference: Multi-Centric ATE Estimation beyond Meta-Analysis
- Authors: Rémi Khellaf, Aurélien Bellet, Julie Josse
- Abstract summary: We study Federated Causal Inference, an approach to estimate treatment effects from decentralized data across centers.
We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula.
- Score: 12.896319628045967
- Abstract: We study Federated Causal Inference, an approach to estimate treatment effects from decentralized data across centers. We compare three classes of Average Treatment Effect (ATE) estimators derived from the Plug-in G-Formula, ranging from simple meta-analysis to one-shot and multi-shot federated learning, the latter leveraging the full data to learn the outcome model (albeit requiring more communication). Focusing on Randomized Controlled Trials (RCTs), we derive the asymptotic variance of these estimators for linear models. Our results provide practical guidance on selecting the appropriate estimator for various scenarios, including heterogeneity in sample sizes, covariate distributions, treatment assignment schemes, and center effects. We validate these findings with a simulation study.
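As a rough illustration of the simplest estimator class mentioned in the abstract, the sketch below computes a plug-in G-formula ATE locally in each center with linear outcome models, then combines the local estimates by meta-analysis. It is not the authors' implementation: the function names (`local_gformula_ate`, `meta_ate`, `simulate_center`), the sample-size weighting, and the simulated RCT data are all assumptions made for this example. Inverse-variance weighting is another common aggregation choice.

```python
import numpy as np


def local_gformula_ate(X, W, Y):
    """Plug-in G-formula ATE in one center (hypothetical helper).

    Fits one OLS outcome model per treatment arm, then averages the
    difference of the two predictions over all local units.
    """
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta1, *_ = np.linalg.lstsq(Xd[W == 1], Y[W == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(Xd[W == 0], Y[W == 0], rcond=None)
    return float(np.mean(Xd @ beta1 - Xd @ beta0))


def meta_ate(centers):
    """Meta-analysis: sample-size-weighted mean of local ATEs (assumed weights)."""
    sizes = np.array([len(Y) for _, _, Y in centers], dtype=float)
    ates = np.array([local_gformula_ate(X, W, Y) for X, W, Y in centers])
    return float(np.sum(sizes * ates) / np.sum(sizes))


def simulate_center(rng, n, tau=2.0, d=3):
    """Simulated RCT with a linear outcome and true ATE `tau` (illustrative only)."""
    X = rng.normal(size=(n, d))
    W = rng.binomial(1, 0.5, size=n)  # randomized treatment assignment
    Y = X @ np.ones(d) + tau * W + rng.normal(scale=0.1, size=n)
    return X, W, Y


rng = np.random.default_rng(0)
centers = [simulate_center(rng, n) for n in (200, 500, 1000)]
estimate = meta_ate(centers)
print(f"meta-analysis ATE estimate: {estimate:.3f} (true value 2.0)")
```

Only the local ATE estimates and sample sizes leave each center here, which is what makes this a meta-analysis rather than a federated fit of a shared outcome model.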
Related papers
- Robust CATE Estimation Using Novel Ensemble Methods [0.8246494848934447]
Estimation of Conditional Average Treatment Effects (CATE) is crucial for understanding the heterogeneity of treatment effects in clinical trials.
We evaluate the performance of common methods, including causal forests and various meta-learners, across a diverse set of scenarios.
We propose two new ensemble methods that integrate multiple estimators to enhance prediction stability and performance.
arXiv Detail & Related papers (2024-07-04T07:23:02Z) - Bayesian Federated Inference for regression models based on non-shared multicenter data sets from heterogeneous populations [0.0]
In a regression model, the sample size must be large enough relative to the number of possible predictors.
Pooling data from different data sets collected in different (medical) centers would alleviate this problem, but is often not feasible due to privacy regulation or logistic problems.
An alternative route would be to analyze the local data in the centers separately and combine the statistical inference results with the Bayesian Federated Inference (BFI) methodology.
The aim of this approach is to compute from the inference results in separate centers what would have been found if the statistical analysis was performed on the combined data.
arXiv Detail & Related papers (2024-02-05T11:10:27Z) - Individualized Multi-Treatment Response Curves Estimation using RBF-net with Shared Neurons [1.1119247609126184]
Our non-parametric modeling of the response curves relies on radial basis function (RBF)-nets with shared hidden neurons.
Applying our proposed method to MIMIC data, we obtain several interesting findings related to the impact of different treatment strategies on the length of ICU stay and 12-hour SOFA score for sepsis patients who are home-discharged.
arXiv Detail & Related papers (2024-01-29T21:13:01Z) - Continuous Treatment Effect Estimation Using Gradient Interpolation and Kernel Smoothing [43.259723628010896]
We advocate the direct approach of augmenting training individuals with independently sampled treatments and inferred counterfactual outcomes.
We evaluate our method on five benchmarks and show that our method outperforms six state-of-the-art methods on the counterfactual estimation error.
arXiv Detail & Related papers (2024-01-27T15:52:58Z) - Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z) - Using representation balancing to learn conditional-average dose responses from clustered data [5.633848204699653]
Estimating a unit's responses to interventions with an associated dose is relevant in a variety of domains.
We show the impacts of clustered data on model performance and propose an estimator, CBRNet.
arXiv Detail & Related papers (2023-09-07T14:17:44Z) - Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z) - Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z) - Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and the generality on different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z) - Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z) - Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects [61.03579766573421]
We study estimation of individual-level causal effects, such as a single patient's response to alternative medication.
We devise representation learning algorithms that minimize our bound, by regularizing the representation's induced treatment group distance.
We extend these algorithms to simultaneously learn a weighted representation to further reduce treatment group distances.
arXiv Detail & Related papers (2020-01-21T10:16:33Z)