Testing Causal Models with Hidden Variables in Polynomial Delay via Conditional Independencies
- URL: http://arxiv.org/abs/2409.14593v1
- Date: Sun, 22 Sep 2024 21:05:56 GMT
- Title: Testing Causal Models with Hidden Variables in Polynomial Delay via Conditional Independencies
- Authors: Hyunchai Jeong, Adiba Ejaz, Jin Tian, Elias Bareinboim
- Abstract summary: Testing a hypothesized causal model against observational data is a key prerequisite for many causal inference tasks.
While a model can assume exponentially many conditional independence relations (CIs), testing all of them is both impractical and unnecessary.
We introduce the c-component local Markov property (C-LMP) for causal graphs with hidden variables and develop a polynomial delay algorithm that lists these CIs in poly-time intervals.
- Score: 49.99600569996907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Testing a hypothesized causal model against observational data is a key prerequisite for many causal inference tasks. A natural approach is to test whether the conditional independence relations (CIs) assumed in the model hold in the data. While a model can assume exponentially many CIs (with respect to the number of variables), testing all of them is both impractical and unnecessary. Causal graphs, which encode these CIs in polynomial space, give rise to local Markov properties that enable model testing with a significantly smaller subset of CIs. Model testing based on local properties requires an algorithm to list the relevant CIs. However, existing algorithms for realistic settings with hidden variables and non-parametric distributions can take exponential time to produce even a single CI constraint. In this paper, we introduce the c-component local Markov property (C-LMP) for causal graphs with hidden variables. Since C-LMP can still invoke an exponential number of CIs, we develop a polynomial delay algorithm to list these CIs in poly-time intervals. To our knowledge, this is the first algorithm that enables poly-delay testing of CIs in causal graphs with hidden variables against arbitrary data distributions. Experiments on real-world and synthetic data demonstrate the practicality of our algorithm.
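For intuition only, consider the simpler case of a DAG with no hidden variables. There, the classical ordered local Markov property states that each variable is independent of its non-parent predecessors (in a topological order) given its parents, so one CI per variable suffices for testing. The sketch below lists those CIs lazily with a generator, so each constraint is produced without first building the whole list. It is not the paper's C-LMP or its polynomial delay algorithm, neither of which is specified in the abstract; the function name `list_local_cis` and the `parents` dictionary format are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def list_local_cis(parents):
    """Lazily yield CIs implied by the ordered local Markov property of a DAG.

    `parents` maps each node to an iterable of its parents (hypothetical format).
    Each yielded triple (x, others, given) reads: x is independent of `others`
    given `given`. Assumes a DAG with NO hidden variables.
    """
    # static_order() places every parent before its children.
    order = list(TopologicalSorter(parents).static_order())
    seen = []
    for x in order:
        pa = set(parents.get(x, ()))
        others = set(seen) - pa  # predecessors that are not parents of x
        if others:  # skip trivial constraints with nothing to separate
            yield x, frozenset(others), frozenset(pa)
        seen.append(x)


# Chain A -> B -> C: the only non-trivial constraint is C independent of A given B.
for x, others, given in list_local_cis({"A": [], "B": ["A"], "C": ["B"]}):
    print(f"{x} _||_ {sorted(others)} | {sorted(given)}")
```

Each yielded triple would then be handed to a statistical CI test of the user's choice; the hidden-variable setting addressed by the paper requires a different property and a more careful enumeration.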
Related papers
- Detection of Unobserved Common Causes based on NML Code in Discrete, Mixed, and Continuous Variables [1.5039745292757667]
We categorize all possible causal relationships between two random variables into four categories.
Through extensive experiments on both synthetic and real-world data, we show that CLOUD is more effective than existing methods at inferring causal relationships.
arXiv Detail & Related papers (2024-03-11T08:11:52Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes [7.103713918313219]
We develop conditional independence (CI) constraints on coordinate processes over selected intervals.
We provide a sound and complete causal discovery algorithm, capable of handling both fully and partially observed data.
We also propose a flexible, consistent signature kernel-based CI test to infer these constraints from data.
arXiv Detail & Related papers (2024-02-28T16:58:31Z)
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel federated causal discovery (FCD) method that attempts to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data [76.85310770921876]
We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data.
The goal of this library is to provide a fast and flexible solution for a variety of problems in the domain of causality.
arXiv Detail & Related papers (2023-01-25T22:42:48Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- A Simple Unified Approach to Testing High-Dimensional Conditional Independences for Categorical and Ordinal Data [0.26651200086513094]
Conditional independence (CI) tests underlie many approaches to model testing and structure learning in causal inference.
Most existing CI tests for categorical and ordinal data stratify the sample by the conditioning variables, perform simple independence tests in each stratum, and combine the results.
Here we propose a simple unified CI test for ordinal and categorical data that maintains reasonable calibration and power in high dimensions.
arXiv Detail & Related papers (2022-06-09T08:56:12Z)
- Partial Counterfactual Identification from Observational and Experimental Data [83.798237968683]
We develop effective Monte Carlo algorithms to approximate the optimal bounds from an arbitrary combination of observational and experimental data.
Our algorithms are validated extensively on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-10-12T02:21:30Z)
- Granger Causality Based Hierarchical Time Series Clustering for State Estimation [8.384689499720515]
Clustering is useful when working with a large volume of unlabeled data.
We propose a hierarchical time series clustering technique based on symbolic dynamic filtering and Granger causality.
A new distance metric based on Granger causality is proposed, used for the time series clustering, and validated on empirical data sets.
arXiv Detail & Related papers (2021-04-09T06:14:54Z)
- Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables [11.985433487639403]
Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way.
We introduce query training (QT), a mechanism to learn a PGM that is optimized for the approximate inference algorithm that will be paired with it.
We demonstrate experimentally that QT can be used to learn a challenging 8-connected grid Markov random field with hidden variables.
arXiv Detail & Related papers (2020-06-11T20:34:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.