Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding
- URL: http://arxiv.org/abs/2603.05149v1
- Date: Thu, 05 Mar 2026 13:17:31 GMT
- Title: Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding
- Authors: Maximilian Hahn, Alina Zajak, Dominik Heider, Adèle Helena Ribeiro
- Abstract summary: fedCI is a conditional independence test that handles heterogeneous datasets. fedCI-IOD enables causal discovery under latent confounding across distributed and heterogeneous datasets. Our tools are publicly available as the fedCI Python package, a privacy-preserving R implementation of IOD, and a web application for the fedCI-IOD pipeline.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal discovery across multiple datasets is often constrained by data privacy regulations and cross-site heterogeneity, limiting the use of conventional methods that require a single, centralized dataset. To address these challenges, we introduce fedCI, a federated conditional independence test that rigorously handles heterogeneous datasets with non-identical sets of variables, site-specific effects, and mixed variable types, including continuous, ordinal, binary, and categorical variables. At its core, fedCI uses a federated Iteratively Reweighted Least Squares (IRLS) procedure to estimate the parameters of generalized linear models underlying likelihood-ratio tests for conditional independence. Building on this, we develop fedCI-IOD, a federated extension of the Integration of Overlapping Datasets (IOD) algorithm that replaces its meta-analysis strategy and enables, for the first time, federated causal discovery under latent confounding across distributed and heterogeneous datasets. By aggregating evidence federatively, fedCI-IOD not only preserves privacy but also substantially enhances statistical power, achieving performance comparable to fully pooled analyses and mitigating artifacts from low local sample sizes. Our tools are publicly available as the fedCI Python package, a privacy-preserving R implementation of IOD, and a web application for the fedCI-IOD pipeline, providing versatile, user-friendly solutions for federated conditional independence testing and causal discovery.
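To make the federated IRLS idea in the abstract concrete, here is a minimal sketch, not the fedCI package's API. It assumes a binary outcome with a logistic link, identical variable sets at every site, no site-specific effects, and a single tested variable (one degree of freedom). All function names (`local_stats`, `federated_irls`, `lr_test_pvalue`) are hypothetical. Each site shares only the aggregate quantities X^T W X, X^T W z, and its log-likelihood; the server sums them and solves the weighted least-squares step.

```python
import math
import numpy as np


def local_stats(X, y, beta):
    """Site-side step: compute aggregate statistics; raw rows never leave the site."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = np.clip(mu * (1.0 - mu), 1e-10, None)  # IRLS weights for the logit link
    z = eta + (y - mu) / w                     # working response
    xtwx = X.T @ (w[:, None] * X)
    xtwz = X.T @ (w * z)
    loglik = float(np.sum(y * eta - np.logaddexp(0.0, eta)))
    return xtwx, xtwz, loglik


def federated_irls(sites, n_iter=25, tol=1e-8):
    """Server-side loop: sum the per-site statistics and solve for beta."""
    p = sites[0][0].shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        stats = [local_stats(X, y, beta) for X, y in sites]
        xtwx = sum(s[0] for s in stats)
        xtwz = sum(s[1] for s in stats)
        new_beta = np.linalg.solve(xtwx, xtwz)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    # Log-likelihood at the final estimate, again aggregated across sites.
    loglik = sum(local_stats(X, y, beta)[2] for X, y in sites)
    return beta, loglik


def lr_test_pvalue(sites_full, sites_reduced):
    """Likelihood-ratio test with 1 degree of freedom (assumption of this sketch).

    sites_reduced holds the same data with the tested column dropped.
    """
    _, ll_full = federated_irls(sites_full)
    _, ll_reduced = federated_irls(sites_reduced)
    stat = max(0.0, 2.0 * (ll_full - ll_reduced))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))
```

Because the summed statistics equal those of the pooled data, this sketch reproduces the pooled IRLS fit up to floating-point error, which is the intuition behind the abstract's claim of performance comparable to fully pooled analyses. Handling non-identical variable sets, site effects, and other variable types would need additional machinery not shown here.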
Related papers
- Personalized Federated Dictionary Learning for Modeling Heterogeneity in Multi-site fMRI Data [14.18552770292156]
PFedDL performs independent dictionary learning at each site, decomposing each site-specific dictionary into a shared global component and a personalized local component. Experiments on the ABIDE dataset demonstrate that PFedDL outperforms existing methods in accuracy and robustness across non-IID datasets.
arXiv Detail & Related papers (2025-09-25T00:01:02Z) - A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables through binarizing observed data. Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z) - STSA: Federated Class-Incremental Learning via Spatial-Temporal Statistics Aggregation [64.48462746540156]
Federated Class-Incremental Learning (FCIL) enables Class-Incremental Learning from distributed data. We propose a novel approach to aggregate feature statistics both spatially (across clients) and temporally (across stages). We show that our method outperforms state-of-the-art FCIL methods in terms of performance, flexibility, and both communication and efficiency.
arXiv Detail & Related papers (2025-06-02T05:14:57Z) - AFCL: Analytic Federated Continual Learning for Spatio-Temporal Invariance of Non-IID Data [45.66391633579935]
Federated Continual Learning (FCL) enables distributed clients to collaboratively train a global model from online task streams. FCL methods face challenges of both spatial data heterogeneity among distributed clients and temporal data heterogeneity across online tasks. We propose a gradient-free method, named Analytic Federated Continual Learning (AFCL), by deriving analytical (i.e., closed-form) solutions from frozen extracted features.
arXiv Detail & Related papers (2025-05-18T05:55:09Z) - FedMAC: Tackling Partial-Modality Missing in Federated Learning with Cross-Modal Aggregation and Contrastive Regularization [18.276480518649404]
Federated Learning (FL) is a method for training machine learning models using distributed data sources. This study proposes a novel framework named FedMAC, designed to address multi-modality missing under conditions of partial-modality missing in FL.
arXiv Detail & Related papers (2024-10-04T01:24:02Z) - Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z) - Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z) - Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z) - Robustness and Personalization in Federated Learning: A Unified Approach via Regularization [4.7234844467506605]
We present a class of methods for robust, personalized federated learning, called Fed+.
The principal advantage of Fed+ is to better accommodate the real-world characteristics found in federated training.
We demonstrate the benefits of Fed+ through extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-09-14T10:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.