Federated Causal Discovery from Heterogeneous Data
- URL: http://arxiv.org/abs/2402.13241v2
- Date: Tue, 27 Feb 2024 04:45:47 GMT
- Title: Federated Causal Discovery from Heterogeneous Data
- Authors: Loka Li, Ignavier Ng, Gongxu Luo, Biwei Huang, Guangyi Chen, Tongliang Liu, Bin Gu, Kun Zhang
- Abstract summary: We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
- Score: 70.31070224690399
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Conventional causal discovery methods rely on centralized data, which is
inconsistent with the decentralized nature of data in many real-world
situations. This discrepancy has motivated the development of federated causal
discovery (FCD) approaches. However, existing FCD methods may be limited by
their potentially restrictive assumptions of identifiable functional causal
models or homogeneous data distributions, narrowing their applicability in
diverse scenarios. In this paper, we propose a novel FCD method attempting to
accommodate arbitrary causal models and heterogeneous data. We first utilize a
surrogate variable corresponding to the client index to account for the data
heterogeneity across different clients. We then develop a federated conditional
independence test (FCIT) for causal skeleton discovery and establish a
federated independent change principle (FICP) to determine causal directions.
These approaches involve constructing summary statistics as a proxy of the raw
data to protect data privacy. Owing to the nonparametric properties, FCIT and
FICP make no assumption about particular functional forms, thereby facilitating
the handling of arbitrary causal models. We conduct extensive experiments on
synthetic and real datasets to show the efficacy of our method. The code is
available at https://github.com/lokali/FedCDH.git.
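The idea of sharing summary statistics instead of raw data can be illustrated with a deliberately simplified sketch. This is not the authors' FCIT, which is nonparametric; it swaps in a linear-Gaussian partial-correlation test, ignores the client-index surrogate variable, and assumes clients share one mechanism. All function names below (`local_summary`, `aggregate`, `partial_corr`, `fisher_z`, `make_client`) are hypothetical and chosen for illustration only.

```python
import math
import random

def local_summary(rows):
    """Client side: return (n, per-variable sums, sums of pairwise products)."""
    d = len(rows[0])
    sums = [0.0] * d
    prods = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            sums[i] += r[i]
            for j in range(d):
                prods[i][j] += r[i] * r[j]
    return len(rows), sums, prods

def aggregate(summaries):
    """Server side: pool client summaries into a correlation matrix."""
    d = len(summaries[0][1])
    n = sum(s[0] for s in summaries)
    sums = [sum(s[1][i] for s in summaries) for i in range(d)]
    prods = [[sum(s[2][i][j] for s in summaries) for j in range(d)]
             for i in range(d)]
    cov = [[prods[i][j] / n - (sums[i] / n) * (sums[j] / n)
            for j in range(d)] for i in range(d)]
    corr = [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])
             for j in range(d)] for i in range(d)]
    return n, corr

def partial_corr(corr, x, y, z):
    """Partial correlation of variables x and y given a single variable z."""
    num = corr[x][y] - corr[x][z] * corr[y][z]
    den = math.sqrt((1 - corr[x][z] ** 2) * (1 - corr[y][z] ** 2))
    return num / den

def fisher_z(r, n, k=1):
    """Fisher z-statistic; roughly N(0, 1) under independence for Gaussian data.
    k is the size of the conditioning set."""
    return math.sqrt(n - k - 3) * 0.5 * math.log((1 + r) / (1 - r))

def make_client(n):
    """Hypothetical client data from the chain X -> Z -> Y, as rows [x, z, y]."""
    rows = []
    for _ in range(n):
        x = random.gauss(0, 1)
        z = x + random.gauss(0, 1)
        y = z + random.gauss(0, 1)
        rows.append([x, z, y])
    return rows

if __name__ == "__main__":
    random.seed(0)
    clients = [make_client(1000) for _ in range(3)]
    # Only the summaries leave each client; raw rows stay local.
    n, corr = aggregate([local_summary(c) for c in clients])
    r = partial_corr(corr, 0, 2, 1)
    print("corr(X, Y)     =", round(corr[0][2], 3))
    print("pcorr(X, Y | Z) =", round(r, 3), " z =", round(fisher_z(r, n), 3))
```

Because the sums and pairwise products are additive across clients, the pooled correlation matrix matches what a centralized computation would give, which is the property that lets a server test X and Y for independence given Z without seeing any raw samples. Handling heterogeneous clients or nonlinear mechanisms, as the abstract describes, would need richer statistics than these second moments.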
Related papers
- DAGnosis: Localized Identification of Data Inconsistencies using Structures [73.39285449012255]
Identification and appropriate handling of inconsistencies in data at deployment time is crucial to reliably use machine learning models.
We use directed acyclic graphs (DAGs) to encode the training set's features probability distribution and independencies as a structure.
Our method, called DAGnosis, leverages these structural interactions to bring valuable and insightful data-centric conclusions.
arXiv Detail & Related papers (2024-02-26T11:29:16Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- Discovering Mixtures of Structural Causal Models from Time Series Data [23.18511951330646]
We propose a general variational inference-based framework called MCD to infer the underlying causal models.
Our approach employs an end-to-end training process that maximizes an evidence-lower bound for the data likelihood.
We demonstrate that our method surpasses state-of-the-art benchmarks in causal discovery tasks.
arXiv Detail & Related papers (2023-10-10T05:13:10Z)
- Differentially Private Federated Clustering over Non-IID Data [59.611244450530315]
The federated clustering (FedC) problem aims to accurately partition unlabeled data samples distributed over massive clients into finite clusters under the orchestration of a server.
We propose a novel FedC algorithm using a differential privacy technique, referred to as DP-Fed, in which partial participation and multiple clients are also considered.
Various attributes of the proposed DP-Fed are obtained through theoretical analyses of privacy protection, especially for the case of non-identically and independently distributed (non-i.i.d.) data.
arXiv Detail & Related papers (2023-01-03T05:38:43Z)
- Federated Causal Discovery From Interventions [35.53403074610876]
We propose FedCDI, a framework for inferring causal structures from distributed data containing interventional samples.
In line with the federated learning framework, FedCDI improves privacy by exchanging belief updates rather than raw samples.
arXiv Detail & Related papers (2022-11-07T20:25:48Z)
- Differentiable Invariant Causal Discovery [106.87950048845308]
Learning causal structure from observational data is a fundamental challenge in machine learning.
This paper proposes Differentiable Invariant Causal Discovery (DICD) to avoid learning spurious edges and wrong causal directions.
Extensive experiments on synthetic and real-world datasets verify that DICD outperforms state-of-the-art causal discovery methods by up to 36% in SHD.
arXiv Detail & Related papers (2022-05-31T09:29:07Z)
- Federated Causal Discovery [74.37739054932733]
This paper develops a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD).
It can learn the causal graph without directly touching local data and naturally handle the data heterogeneity.
Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
arXiv Detail & Related papers (2021-12-07T08:04:12Z)
- A Subsampling-Based Method for Causal Discovery on Discrete Data [18.35147325731821]
In this work, we propose a subsampling-based method to test the independence between the generating schemes of the cause and that of the mechanism.
Our methodology works for both discrete and categorical data and does not imply any functional model on the data, making it a more flexible approach.
arXiv Detail & Related papers (2021-08-31T17:11:58Z)
- Federated Estimation of Causal Effects from Observational Data [19.657789891394504]
We present a novel framework for causal inference with federated data sources.
We assess and integrate local causal effects from different private data sources without centralizing them.
arXiv Detail & Related papers (2021-05-31T08:06:00Z)