A Tree-based Federated Learning Approach for Personalized Treatment
Effect Estimation from Heterogeneous Data Sources
- URL: http://arxiv.org/abs/2103.06261v1
- Date: Wed, 10 Mar 2021 18:51:30 GMT
- Title: A Tree-based Federated Learning Approach for Personalized Treatment
Effect Estimation from Heterogeneous Data Sources
- Authors: Xiaoqing Tan, Chung-Chou H. Chang, Lu Tang
- Abstract summary: Federated learning is an appealing framework for analyzing sensitive data from distributed health data networks.
We develop an efficient and interpretable tree-based ensemble of personalized treatment effect estimators to join results across hospital sites.
- Score: 5.049057348282933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning is an appealing framework for analyzing sensitive data
from distributed health data networks due to its protection of data privacy.
Under this framework, data partners at local sites collaboratively build an
analytical model under the orchestration of a coordinating site, while keeping
the data decentralized. However, existing federated learning methods mainly
assume data across sites are homogeneous samples of the global population,
hence failing to properly account for the extra variability across sites in
estimation and inference. Drawing on a multi-hospital electronic health records
network, we develop an efficient and interpretable tree-based ensemble of
personalized treatment effect estimators to join results across hospital sites,
while actively modeling for the heterogeneity in data sources through site
partitioning. The efficiency of our method is demonstrated by a study of causal
effects of oxygen saturation on hospital mortality and backed up by
comprehensive numerical results.
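The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "site partitioning", not the authors' implementation. It assumes each site exposes a local treatment effect estimator (the toy functions in `local_estimators` are hypothetical), that the coordinating site evaluates them on its own covariate grid, and that a small regression tree is then fit on the pair (covariate, site) so the tree can split on the site label when sites differ. All names and numbers are illustrative.

```python
from statistics import mean

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def goes_left(row, kind, value):
    """Row layout is (x, site, ...). Numeric split on x, one-vs-rest split on site."""
    return row[0] <= value if kind == "x" else row[1] == value

def best_split(rows):
    """Return (loss, kind, value) for the best split, or None if none is valid."""
    best = None
    xs = sorted({r[0] for r in rows})
    candidates = [("x", (a + b) / 2) for a, b in zip(xs, xs[1:])]
    candidates += [("site", s) for s in sorted({r[1] for r in rows})]
    for kind, value in candidates:
        left = [r[2] for r in rows if goes_left(r, kind, value)]
        right = [r[2] for r in rows if not goes_left(r, kind, value)]
        if not left or not right:
            continue
        loss = sse(left) + sse(right)
        if best is None or loss < best[0]:
            best = (loss, kind, value)
    return best

def fit_tree(rows, depth):
    """Fit a regression tree; leaves are floats, internal nodes are tuples."""
    split = best_split(rows) if depth > 0 else None
    if split is None or split[0] >= sse([r[2] for r in rows]):
        return mean(r[2] for r in rows)
    _, kind, value = split
    left = [r for r in rows if goes_left(r, kind, value)]
    right = [r for r in rows if not goes_left(r, kind, value)]
    return (kind, value, fit_tree(left, depth - 1), fit_tree(right, depth - 1))

def predict(tree, x, site):
    """Walk the tree for covariate x at the given site."""
    while isinstance(tree, tuple):
        kind, value, left, right = tree
        tree = left if goes_left((x, site), kind, value) else right
    return tree

# Hypothetical local estimators: sites A and B agree, site C differs.
local_estimators = {
    "A": lambda x: 1.0 if x > 0.5 else 0.0,
    "B": lambda x: 1.0 if x > 0.5 else 0.0,
    "C": lambda x: 3.0,
}

# Only model predictions on the coordinating site's grid are pooled here,
# never patient-level records from the other sites.
grid = [i / 10 for i in range(1, 10)]
augmented = [(x, site, f(x)) for x in grid for site, f in local_estimators.items()]

tree = fit_tree(augmented, depth=2)
print(tree)
print(predict(tree, 0.8, "A"), predict(tree, 0.2, "C"))
```

In this toy setup the first split isolates site C, so predictions for sites A and B borrow strength from each other while site C keeps its own estimate.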
Related papers
- Federated Impression for Learning with Distributed Heterogeneous Data [19.50235109938016]
Federated learning (FL) provides a paradigm that can learn from distributed datasets across clients without requiring them to share data.
In FL, sub-optimal convergence is common among data from different health centers due to the variety in data collection protocols and patient demographics across centers.
We propose FedImpres, which alleviates catastrophic forgetting by restoring synthetic data that represents the global information as a federated impression.
arXiv Detail & Related papers (2024-09-11T15:37:52Z)
- Addressing Data Heterogeneity in Federated Learning of Cox Proportional Hazards Models [8.798959872821962]
This paper outlines an approach in the domain of federated survival analysis, specifically the Cox Proportional Hazards (CoxPH) model.
We present an FL approach that employs feature-based clustering to enhance model accuracy across synthetic datasets and real-world applications.
arXiv Detail & Related papers (2024-07-20T18:34:20Z)
- On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks [3.9058850780464884]
Federated Learning (FL) allows privacy-sensitive applications to leverage their datasets to construct a global model without disclosing any information.
One of those domains is healthcare, where groups of silos collaborate in order to generate a global predictor with improved accuracy and generalization.
This paper presents a comprehensive exploration of the mathematical formalization and taxonomy of heterogeneity within FL environments, focusing on the intricacies of medical data.
arXiv Detail & Related papers (2024-04-29T09:05:01Z)
- Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study [49.5374512525016]
Medical image datasets are essential for training models used in computer-aided diagnosis, treatment planning, and medical research.
Some challenges are associated with these datasets, including variability in data distribution, data scarcity, and transfer learning issues when using models pre-trained from generic images.
We propose a methodology based on Siamese neural networks in which a series of techniques are integrated to mitigate the effects of data scarcity and distribution imbalance.
arXiv Detail & Related papers (2024-01-18T16:59:27Z)
- Federated Offline Reinforcement Learning [55.326673977320574]
We propose a multi-site Markov decision process model that allows for both homogeneous and heterogeneous effects across sites.
We design the first federated policy optimization algorithm for offline RL with sample complexity.
We give a theoretical guarantee for the proposed algorithm: the suboptimality of the learned policies is comparable to the rate achieved when the data are not distributed.
arXiv Detail & Related papers (2022-06-11T18:03:26Z)
- Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
- Federated Causal Inference in Heterogeneous Observational Data [13.460660554484512]
We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data is stored locally for each site.
Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms.
Motivated by these considerations, we develop federated methods to draw inference on the average treatment effects of combined data across sites.
arXiv Detail & Related papers (2021-07-25T05:55:00Z)
- An Experimental Study of Data Heterogeneity in Federated Learning Methods for Medical Imaging [8.984706828657814]
Federated learning enables multiple institutions to collaboratively train machine learning models on their local data in a privacy-preserving way.
We investigate the deleterious impact of a taxonomy of data heterogeneity regimes on federated learning methods, including quantity skew, label distribution skew, and imaging acquisition skew.
We present several mitigation strategies to overcome performance drops from data heterogeneity, including weighted averaging for data quantity skew, and weighted loss and batch normalization averaging for label distribution skew.
arXiv Detail & Related papers (2021-07-18T05:47:48Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
The COVID-19 disease caused by the novel coronavirus has led to a shortage of medical resources.
Different data-driven deep learning models have been developed to aid the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making aimed at reducing the incidence rate of infections.
arXiv Detail & Related papers (2020-05-07T16:13:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.