Generalized Independent Noise Condition for Estimating Latent Variable
Causal Graphs
- URL: http://arxiv.org/abs/2010.04917v2
- Date: Wed, 18 Nov 2020 15:21:06 GMT
- Title: Generalized Independent Noise Condition for Estimating Latent Variable
Causal Graphs
- Authors: Feng Xie, Ruichu Cai, Biwei Huang, Clark Glymour, Zhifeng Hao, Kun
Zhang
- Abstract summary: We propose a Generalized Independent Noise (GIN) condition to estimate latent variable graphs.
We show that GIN helps locate latent variables and identify their causal structure, including causal directions.
- Score: 39.24319581164022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal discovery aims to recover causal structures or models underlying the
observed data. Despite its success in certain domains, most existing methods
focus on causal relations between observed variables, while in many scenarios
the observed ones may not be the underlying causal variables (e.g., image
pixels), but are generated by latent causal variables or confounders that are
causally related. To this end, in this paper, we consider Linear, Non-Gaussian
Latent variable Models (LiNGLaMs), in which latent confounders are also
causally related, and propose a Generalized Independent Noise (GIN) condition
to estimate such latent variable graphs. Specifically, for two observed random
vectors $\mathbf{Y}$ and $\mathbf{Z}$, GIN holds if and only if
$\omega^{\intercal}\mathbf{Y}$ and $\mathbf{Z}$ are statistically independent,
where $\omega$ is a parameter vector determined by the cross-covariance
between $\mathbf{Y}$ and $\mathbf{Z}$. From the graphical view, roughly
speaking, GIN implies that causally earlier latent common causes of variables
in $\mathbf{Y}$ d-separate $\mathbf{Y}$ from $\mathbf{Z}$. Interestingly, we
find that the independent noise condition (i.e., when there is no confounder,
the causes are independent of the residual obtained by regressing the effect on
the causes) can be seen as a special case of GIN. Moreover, we show that GIN helps locate
latent variables and identify their causal structure, including causal
directions. We further develop a recursive learning algorithm to achieve these
goals. Experimental results on synthetic and real-world data demonstrate the
effectiveness of our method.
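To make the GIN condition concrete, the following is a minimal sketch (not the authors' implementation) of how one might test it for a given pair of observed vectors $\mathbf{Y}$ and $\mathbf{Z}$: estimate $\omega$ from the empirical cross-covariance between $\mathbf{Z}$ and $\mathbf{Y}$ (here taken as an approximate null-space direction, an assumption consistent with the abstract's description), then check whether $\omega^{\intercal}\mathbf{Y}$ is statistically independent of $\mathbf{Z}$, here with a simple HSIC statistic as a stand-in independence measure.

```python
import numpy as np

def estimate_omega(Y, Z):
    """Estimate omega from the cross-covariance between Y and Z.

    Sketch only: omega is taken as the right-singular vector of the
    empirical cross-covariance Cov(Z, Y) with the smallest singular
    value, i.e. Cov(Z, Y) @ omega ~= 0. The paper characterizes omega
    from this cross-covariance; this particular construction is an
    illustrative assumption, not the authors' exact procedure.
    """
    Yc = Y - Y.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    cov_zy = Zc.T @ Yc / (len(Y) - 1)   # shape (dim_Z, dim_Y)
    _, _, vt = np.linalg.svd(cov_zy)
    return vt[-1]                       # shape (dim_Y,)

def hsic(a, b, sigma=1.0):
    """Biased HSIC statistic with Gaussian kernels (smaller ~ more independent)."""
    def gram(x):
        x = x.reshape(len(x), -1)
        sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2 * sigma ** 2))
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    K, L = gram(a), gram(b)
    return np.trace(K @ H @ L @ H) / n ** 2

def gin_statistic(Y, Z):
    """GIN check (sketch): dependence between omega^T Y and Z."""
    omega = estimate_omega(Y, Z)
    residual = Y @ omega
    return hsic(residual, Z)
```

In the paper's recursive learning algorithm, such tests would be applied to many candidate subsets of observed variables to locate latent variables and orient their causal relations; the HSIC surrogate and the null-space construction above are illustrative assumptions rather than the paper's exact test.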
Related papers
- Causal Representation Learning from Multiple Distributions: A General Setting [21.73088044465267]
This paper is concerned with a general, completely nonparametric setting of causal representation learning from multiple distributions.
We show that under the sparsity constraint on the recovered graph over the latent variables and suitable sufficient change conditions on the causal influences, one can recover the moralized graph of the underlying directed acyclic graph.
In some cases, most latent variables can even be recovered up to component-wise transformations.
arXiv Detail & Related papers (2024-02-07T17:51:38Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Generalized Independent Noise Condition for Estimating Causal Structure with Latent Variables [28.44175079713669]
We propose a Generalized Independent Noise (GIN) condition for linear non-Gaussian acyclic causal models.
We show that the causal structure of a LiNGLaH is identifiable in light of GIN conditions.
arXiv Detail & Related papers (2023-08-13T08:13:34Z) - Reinterpreting causal discovery as the task of predicting unobserved
joint statistics [15.088547731564782]
We argue that causal discovery can help infer properties of the unobserved joint distributions.
We define a learning scenario where the input is a subset of variables and the label is some statistical property of that subset.
arXiv Detail & Related papers (2023-05-11T15:30:54Z) - Statistical Learning under Heterogeneous Distribution Shift [71.8393170225794]
The ground-truth predictor is additive: $\mathbb{E}[\mathbf{z} \mid \mathbf{x}, \mathbf{y}] = f_\star(\mathbf{x}) + g_\star(\mathbf{y})$.
arXiv Detail & Related papers (2023-02-27T16:34:21Z) - On the Identifiability and Estimation of Causal Location-Scale Noise
Models [122.65417012597754]
We study the class of location-scale or heteroscedastic noise models (LSNMs).
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
arXiv Detail & Related papers (2022-10-13T17:18:59Z) - Statistical limits of correlation detection in trees [0.7826806223782055]
This paper addresses the problem of testing whether two observed trees $(t,t')$ are sampled independently or from a joint distribution under which they are correlated.
Motivated by graph alignment, we investigate the conditions under which one-sided tests exist.
We find that no such test exists for $s \leq \sqrt{\alpha}$, and that such a test exists whenever $s > \sqrt{\alpha}$, for $\lambda$ large enough.
arXiv Detail & Related papers (2022-09-27T22:26:53Z) - Causal Inference Despite Limited Global Confounding via Mixture Models [4.721845865189578]
A finite $k$-mixture of such models is graphically represented by a larger graph.
We give the first algorithm to learn mixtures of non-empty DAGs.
arXiv Detail & Related papers (2021-12-22T01:04:50Z) - Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)