Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data
- URL: http://arxiv.org/abs/2203.15756v3
- Date: Fri, 24 May 2024 12:12:57 GMT
- Title: Causal de Finetti: On the Identification of Invariant Causal Structure in Exchangeable Data
- Authors: Siyuan Guo, Viktor Tóth, Bernhard Schölkopf, Ferenc Huszár
- Abstract summary: Constraint-based causal discovery methods leverage conditional independence tests to infer causal relationships in a wide variety of applications.
We show that exchangeable data contains richer conditional independence structure than i.i.d. data, and show how the richer structure can be leveraged for causal discovery.
- Score: 45.389985793060674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constraint-based causal discovery methods leverage conditional independence tests to infer causal relationships in a wide variety of applications. Like the majority of machine learning methods, existing work focuses on studying $\textit{independent and identically distributed}$ data. However, it is known that even with infinite i.i.d. data, constraint-based methods can only identify causal structures up to broad Markov equivalence classes, posing a fundamental limitation for causal discovery. In this work, we observe that exchangeable data contains richer conditional independence structure than i.i.d. data, and show how the richer structure can be leveraged for causal discovery. We first present causal de Finetti theorems, which state that exchangeable distributions with certain non-trivial conditional independences can always be represented as $\textit{independent causal mechanism (ICM)}$ generative processes. We then present our main identifiability theorem, which shows that given data from an ICM generative process, its unique causal structure can be identified by performing conditional independence tests. We finally develop a causal discovery algorithm and demonstrate its applicability to inferring causal relationships from multi-environment data. Our code and models are publicly available at: https://github.com/syguo96/Causal-de-Finetti
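The identifiability claim in the abstract can be illustrated on a toy bivariate case. The sketch below is not the authors' algorithm and is not taken from their repository. It assumes binary variables, a ground truth X → Y, mechanism parameters drawn independently per environment from Beta priors, and two samples per environment. The function names `sample_icm_pairs` and `conditional_mutual_information`, the priors, and the sample size are all hypothetical choices made for illustration. Under these assumptions, Y1 should be independent of X2 given X1, while X1 and Y2 should remain dependent given Y1. With i.i.d. data both independences would hold trivially, so the asymmetry would not be visible.

```python
import numpy as np


def sample_icm_pairs(n_envs, rng):
    """Draw two (X, Y) pairs per environment from a toy ICM process X -> Y.

    Assumption: theta (mechanism of X) and psi (mechanism of Y given X)
    are sampled independently for each environment.
    """
    theta = rng.uniform(size=n_envs)            # P(X = 1) in each environment
    psi = np.column_stack([
        rng.beta(1.0, 3.0, size=n_envs),        # P(Y = 1 | X = 0)
        rng.beta(3.0, 1.0, size=n_envs),        # P(Y = 1 | X = 1)
    ])
    x = (rng.uniform(size=(n_envs, 2)) < theta[:, None]).astype(int)
    # p_y[e, i] = psi[e, x[e, i]]: pick the Y-mechanism entry matching each X.
    p_y = np.take_along_axis(psi, x, axis=1)
    y = (rng.uniform(size=(n_envs, 2)) < p_y).astype(int)
    return x, y


def conditional_mutual_information(a, b, c):
    """Plug-in estimate of I(A; B | C) in nats for binary integer arrays."""
    counts = np.zeros((2, 2, 2))
    np.add.at(counts, (a, b, c), 1)             # joint counts indexed by (a, b, c)
    p_abc = counts / len(a)
    p_c = p_abc.sum(axis=(0, 1), keepdims=True)
    p_ac = p_abc.sum(axis=1, keepdims=True)
    p_bc = p_abc.sum(axis=0, keepdims=True)
    mask = p_abc > 0                            # skip empty cells (0 * log 0 = 0)
    num = (p_abc * p_c)[mask]
    den = (p_ac * p_bc)[mask]
    return float(np.sum(p_abc[mask] * np.log(num / den)))


rng = np.random.default_rng(0)
x, y = sample_icm_pairs(200_000, rng)

# Independence expected under X -> Y: Y1 independent of X2 given X1.
cmi_causal = conditional_mutual_information(y[:, 0], x[:, 1], x[:, 0])
# Independence expected under Y -> X (should fail here): X1 vs Y2 given Y1.
cmi_anticausal = conditional_mutual_information(x[:, 0], y[:, 1], y[:, 0])

print(f"I(Y1; X2 | X1) = {cmi_causal:.6f}")
print(f"I(X1; Y2 | Y1) = {cmi_anticausal:.6f}")
```

The plug-in conditional mutual information serves only as a simple stand-in for a proper conditional independence test. In practice a calibrated test with a significance threshold would replace the raw comparison of the two estimates.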
Related papers
- Identifying General Mechanism Shifts in Linear Causal Representations [58.6238439611389]
We consider the linear causal representation learning setting where we observe a linear mixing of $d$ unknown latent factors.
Recent work has shown that it is possible to recover the latent factors as well as the underlying structural causal model over them.
We provide a surprising identifiability result that it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes.
arXiv Detail & Related papers (2024-10-31T15:56:50Z) - Large Language Models for Constrained-Based Causal Discovery [4.858756226945995]
Causality is essential for understanding complex systems, such as the economy, the brain, and the climate.
This work explores the capabilities of Large Language Models (LLMs) as an alternative to domain experts for causal graph generation.
arXiv Detail & Related papers (2024-06-11T15:45:24Z) - Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis [24.826219353338132]
One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator.
A single causal pair is Markov equivalent, i.e., $X \rightarrow Y$ and $Y \rightarrow X$ are distributionally equivalent.
We propose a Poisson Branching Structure Causal Model (PB-SCM) and perform a path analysis on PB-SCM using high-order cumulants.
arXiv Detail & Related papers (2024-03-25T08:06:08Z) - Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z) - Representation Disentaglement via Regularization by Causal Identification [3.9160947065896803]
We propose the use of a causal collider structured model to describe the underlying data generative process assumptions in disentangled representation learning.
For this, we propose regularization by identification (ReI), a modular regularization engine designed to align the behavior of large-scale generative models with the disentanglement constraints imposed by causal identification.
arXiv Detail & Related papers (2023-02-28T23:18:54Z) - Differentiable Invariant Causal Discovery [106.87950048845308]
Learning causal structure from observational data is a fundamental challenge in machine learning.
This paper proposes Differentiable Invariant Causal Discovery (DICD) to avoid learning spurious edges and wrong causal directions.
Extensive experiments on synthetic and real-world datasets verify that DICD outperforms state-of-the-art causal discovery methods by up to 36% in SHD.
arXiv Detail & Related papers (2022-05-31T09:29:07Z) - Causal Domain Adaptation with Copula Entropy based Conditional Independence Test [2.3980064191633232]
Domain Adaptation (DA) is a typical problem in machine learning that aims to transfer a model trained on a source domain to a target domain with a different distribution.
We first present a mathematical model for the causal DA problem and then propose a method for causal DA that finds the invariant representation across domains.
arXiv Detail & Related papers (2022-02-27T23:32:44Z) - Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z) - Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)