Reinterpreting causal discovery as the task of predicting unobserved joint statistics
- URL: http://arxiv.org/abs/2305.06894v1
- Date: Thu, 11 May 2023 15:30:54 GMT
- Title: Reinterpreting causal discovery as the task of predicting unobserved joint statistics
- Authors: Dominik Janzing, Philipp M. Faller, Leena Chennuru Vankadara
- Abstract summary: We argue that causal discovery can help infer properties of the `unobserved joint distributions'.
We define a learning scenario where the input is a subset of variables and the label is some statistical property of that subset.
- Score: 15.088547731564782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: If $X,Y,Z$ denote sets of random variables, two different data sources may
contain samples from $P_{X,Y}$ and $P_{Y,Z}$, respectively. We argue that
causal discovery can help infer properties of the `unobserved joint
distributions' $P_{X,Y,Z}$ or $P_{X,Z}$. The properties may be conditional
independences (as in `integrative causal inference') or also quantitative
statements about dependences.
More generally, we define a learning scenario where the input is a subset of
variables and the label is some statistical property of that subset. Sets of
jointly observed variables define the training points, while unobserved sets
are possible test points. To solve this learning task, we infer, as an
intermediate step, a causal model from the observations that then entails
properties of unobserved sets. Accordingly, we can define the VC dimension of a
class of causal models and derive generalization bounds for the predictions.
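For orientation, generalization bounds of this type follow the classical VC pattern; a standard form (stated here as background knowledge, not necessarily the paper's exact statement) is

$$
R(h) \;\le\; \widehat{R}_n(h) \,+\, \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\delta}}{n}},
$$

holding with probability at least $1-\delta$, where $d$ is the VC dimension of the class of causal models, $n$ the number of jointly observed variable sets (the training points), $\widehat{R}_n$ the empirical error on those sets, and $R$ the error on unobserved sets.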
Here, causal discovery becomes more modest and more readily accessible to
empirical tests than usual: rather than trying to find a causal hypothesis that
is `true', a causal hypothesis is {\it useful} whenever it correctly predicts statistical
properties of unobserved joint distributions. This way, a sparse causal graph
that omits weak influences may be more useful than a dense one (despite being
less accurate) because it is able to reconstruct the full joint distribution
from marginal distributions of smaller subsets.
Within such a `pragmatic' application of causal discovery, some popular
heuristic approaches become justified in retrospect. It is, for instance,
legitimate to infer DAGs from partial correlations instead of conditional
independences, provided the DAGs are only used to predict partial correlations.
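To make this pragmatic reading concrete, here is a minimal numerical sketch (the chain $X \to Y \to Z$, the coefficients, and all variable names are illustrative assumptions, not taken from the paper). Two sources observe only $(X,Y)$ and $(Y,Z)$; the causal hypothesis $X \to Y \to Z$ entails $X \perp Z \mid Y$, so for jointly Gaussian variables the unobserved correlation is predicted as $\rho_{XZ} = \rho_{XY}\,\rho_{YZ}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Ground-truth linear Gaussian chain X -> Y -> Z (illustrative choice).
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.5 * y + rng.normal(size=n)

# Two separate data sources: one observes (X, Y), the other (Y, Z).
# No sample is ever drawn from the joint P_{X,Y,Z}.
rho_xy = np.corrcoef(x[: n // 2], y[: n // 2])[0, 1]  # source 1
rho_yz = np.corrcoef(y[n // 2 :], z[n // 2 :])[0, 1]  # source 2

# The causal hypothesis X -> Y -> Z entails X _|_ Z | Y, hence for
# jointly Gaussian variables: rho_XZ = rho_XY * rho_YZ.
rho_xz_predicted = rho_xy * rho_yz

# Held-out check against the (normally unobserved) joint statistic.
rho_xz_actual = np.corrcoef(x, z)[0, 1]
print(f"predicted rho_XZ = {rho_xz_predicted:.3f}")
print(f"actual    rho_XZ = {rho_xz_actual:.3f}")
```

The hypothesis is judged useful exactly when the predicted statistic matches the held-out joint statistic, independently of whether the graph is `true'.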
Related papers
- Causal Representation Learning from Multiple Distributions: A General Setting [21.73088044465267]
This paper is concerned with a general, completely nonparametric setting of causal representation learning from multiple distributions.
We show that under the sparsity constraint on the recovered graph over the latent variables and suitable sufficient change conditions on the causal influences, one can recover the moralized graph of the underlying directed acyclic graph.
In some cases, most latent variables can even be recovered up to component-wise transformations.
arXiv Detail & Related papers (2024-02-07T17:51:38Z)
- Invariant Causal Prediction with Local Models [52.161513027831646]
We consider the task of identifying the causal parents of a target variable among a set of candidates from observational data.
We introduce a practical method called L-ICP (Localized Invariant Causal Prediction), which is based on a hypothesis test for parent identification using a ratio of minimum and maximum statistics.
arXiv Detail & Related papers (2024-01-10T15:34:42Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Distinguishing Cause from Effect on Categorical Data: The Uniform Channel Model [0.0]
Distinguishing cause from effect using observations of a pair of random variables is a core problem in causal discovery.
We propose a criterion to address the cause-effect problem with categorical variables.
We select as the most likely causal direction the one in which the conditional probability mass function is closer to a uniform channel (UC).
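A minimal sketch of one plausible reading of this criterion follows; the uniform-channel parametrization (mass $1-\epsilon$ on one output, $\epsilon$ spread over the rest, a single $\epsilon$ shared across rows), the L2 distance, and the helper names are illustrative assumptions, not necessarily the paper's exact definitions:

```python
import numpy as np

def distance_to_uniform_channel(counts: np.ndarray) -> float:
    """L2 distance of P(col | row) to a fitted uniform channel.

    Assumed UC form (illustrative heuristic): each row puts mass
    1 - eps on one column and eps/(K-1) on the others, with one eps
    shared by all rows, estimated from the average off-peak mass.
    """
    cond = counts / counts.sum(axis=1, keepdims=True)  # P(effect | cause)
    k = cond.shape[1]
    eps = np.mean(1.0 - cond.max(axis=1))
    target = np.full_like(cond, eps / (k - 1))
    target[np.arange(cond.shape[0]), cond.argmax(axis=1)] = 1.0 - eps
    return float(np.linalg.norm(cond - target))

def uc_direction(counts: np.ndarray) -> str:
    """Pick the direction whose conditional PMF is closer to a UC."""
    d_xy = distance_to_uniform_channel(counts)    # tests X -> Y
    d_yx = distance_to_uniform_channel(counts.T)  # tests Y -> X
    return "X -> Y" if d_xy < d_yx else "Y -> X"

# Toy contingency table of counts for a categorical pair (X, Y).
counts = np.array([[90, 5, 5],
                   [4, 92, 4],
                   [6, 6, 88]], dtype=float)
print(uc_direction(counts))
```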
arXiv Detail & Related papers (2023-03-14T13:54:11Z)
- Identifiability of Sparse Causal Effects using Instrumental Variables [11.97552507834888]
In this paper, we consider linear models in which the causal effect from covariates $X$ on a response $Y$ is sparse.
We provide conditions under which the causal coefficient becomes identifiable from the observed distribution.
As an estimator, we propose spaceIV and prove that it consistently estimates the causal effect if the model is identifiable.
arXiv Detail & Related papers (2022-03-17T15:15:52Z)
- Decoding Causality by Fictitious VAR Modeling [0.0]
We first set up an equilibrium for the cause-effect relations using a fictitious vector autoregressive model.
In the equilibrium, long-run relations are identified from noise, and spurious ones are negligibly close to zero.
We also apply the approach to estimating the causal factors' contribution to climate change.
arXiv Detail & Related papers (2021-11-14T22:43:02Z)
- Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests [87.60900567941428]
A `spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z)
- Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlations during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Information-Theoretic Approximation to Causal Models [0.0]
We show that it is possible to solve the problem of inferring the causal direction and causal effect between two random variables from a finite sample.
We embed distributions that originate from samples of $X$ and $Y$ into a higher dimensional probability space.
We show that this information-theoretic approximation to causal models (IACM) can be done by solving a linear optimization problem.
arXiv Detail & Related papers (2020-07-29T18:34:58Z)