On the Complexity of Identification in Linear Structural Causal Models
- URL: http://arxiv.org/abs/2407.12528v1
- Date: Wed, 17 Jul 2024 13:11:26 GMT
- Title: On the Complexity of Identification in Linear Structural Causal Models
- Authors: Julian Dörfler, Benito van der Zander, Markus Bläser, Maciej Liskiewicz,
- Abstract summary: We give a new sound and complete algorithm for generic identification which runs in space.
The paper also presents evidence that identification is computationally hard in general.
- Score: 3.44747819522562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning the unknown causal parameters of a linear structural causal model is a fundamental task in causal analysis. The task, known as the problem of identification, asks to estimate the parameters of the model from a combination of assumptions on the graphical structure of the model and observational data, represented as a non-causal covariance matrix. In this paper, we give a new sound and complete algorithm for generic identification which runs in polynomial space. By standard simulation results, this algorithm has exponential running time which vastly improves the state-of-the-art double exponential time method using a Gr\"obner basis approach. The paper also presents evidence that parameter identification is computationally hard in general. In particular, we prove, that the task asking whether, for a given feasible correlation matrix, there are exactly one or two or more parameter sets explaining the observed matrix, is hard for $\forall R$, the co-class of the existential theory of the reals. In particular, this problem is $coNP$-hard. To our best knowledge, this is the first hardness result for some notion of identifiability.
Related papers
- Parameter identification in linear non-Gaussian causal models under general confounding [8.273471398838533]
We study identification of the linear coefficients when such models contain latent variables.
Our main result is a graphical criterion that is necessary and sufficient for deciding generic identifiability of direct causal effects.
We report on estimations based on the identification result, explore a generalization to models with feedback loops, and provide new results on the identifiability of the causal graph.
arXiv Detail & Related papers (2024-05-31T14:39:14Z) - Hybrid Top-Down Global Causal Discovery with Local Search for Linear and Nonlinear Additive Noise Models [2.0738462952016232]
Methods based on functional causal models can identify a unique graph, but suffer from the curse of dimensionality or impose strong parametric assumptions.
We propose a novel hybrid approach for global causal discovery in observational data that leverages local causal substructures.
We provide theoretical guarantees for correctness and worst-case time complexities, with empirical validation on synthetic data.
arXiv Detail & Related papers (2024-05-23T12:28:16Z) - Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Score-based Causal Representation Learning: Linear and General Transformations [31.786444957887472]
The paper addresses both the identifiability and achievability aspects.
It designs a score-based class of algorithms that ensures both identifiability and achievability.
Results are empirically validated via experiments on structured synthetic data and image data.
arXiv Detail & Related papers (2024-02-01T18:40:03Z) - Identification for Tree-shaped Structural Causal Models in Polynomial
Time [1.5151556900495786]
Identifying causal parameters from correlations between nodes is an open problem in artificial intelligence.
In this paper, we study SCMs whose directed component forms a tree.
We present a randomized-time algorithm, which solves the identification problem for tree-shaped SCMs.
arXiv Detail & Related papers (2023-11-23T15:26:29Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Probabilistic Simplex Component Analysis [66.30587591100566]
PRISM is a probabilistic simplex component analysis approach to identifying the vertices of a data-circumscribing simplex from data.
The problem has a rich variety of applications, the most notable being hyperspectral unmixing in remote sensing and non-negative matrix factorization in machine learning.
arXiv Detail & Related papers (2021-03-18T05:39:00Z) - Disentangling Observed Causal Effects from Latent Confounders using
Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z) - Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z) - Semiparametric Nonlinear Bipartite Graph Representation Learning with
Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.