Sample, estimate, aggregate: A recipe for causal discovery foundation models
- URL: http://arxiv.org/abs/2402.01929v3
- Date: Fri, 28 Mar 2025 19:27:51 GMT
- Title: Sample, estimate, aggregate: A recipe for causal discovery foundation models
- Authors: Menghua Wu, Yujia Bao, Regina Barzilay, Tommi Jaakkola,
- Abstract summary: Causal discovery has the potential to uncover mechanistic insights from biological experiments.<n>We propose a supervised model trained on large-scale, synthetic data to predict causal graphs.<n>Our approach is enabled by the observation that typical errors in the outputs of a discovery algorithm remain comparable across datasets.
- Score: 28.116832159265964
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Causal discovery, the task of inferring causal structure from data, has the potential to uncover mechanistic insights from biological experiments, especially those involving perturbations. However, causal discovery algorithms over larger sets of variables tend to be brittle against misspecification or when data are limited. For example, single-cell transcriptomics measures thousands of genes, but the nature of their relationships is not known, and there may be as few as tens of cells per intervention setting. To mitigate these challenges, we propose a foundation model-inspired approach: a supervised model trained on large-scale, synthetic data to predict causal graphs from summary statistics -- like the outputs of classical causal discovery algorithms run over subsets of variables and other statistical hints like inverse covariance. Our approach is enabled by the observation that typical errors in the outputs of a discovery algorithm remain comparable across datasets. Theoretically, we show that the model architecture is well-specified, in the sense that it can recover a causal graph consistent with graphs over subsets. Empirically, we train the model to be robust to misspecification and distribution shift using diverse datasets. Experiments on biological and synthetic data confirm that this model generalizes well beyond its training set, runs on graphs with hundreds of variables in seconds, and can be easily adapted to different underlying data assumptions.
Related papers
- Causal Discovery on Dependent Binary Data [6.464898093190062]
We propose a decorrelation-based approach for causal graph learning on dependent binary data.
We develop an EM-like iterative algorithm to generate and decorrelate samples of the latent utility variables.
We demonstrate that the proposed decorrelation approach significantly improves the accuracy in causal graph learning.
arXiv Detail & Related papers (2024-12-28T21:55:42Z) - Identifying perturbation targets through causal differential networks [23.568795598997376]
We propose a causality-inspired approach to identify variables responsible for changes to a biological system.
First, we infer noisy causal graphs from the observational and interventional data.
We then learn to map the differences between these graphs, along with additional statistical features, to sets of variables that were intervened upon.
arXiv Detail & Related papers (2024-10-04T12:48:21Z) - Large-Scale Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations.
By employing a local-inference strategy, our approach scales with linear complexity in the number of variables, efficiently scaling up to thousands of variables.
Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z) - Estimating Causal Effects from Learned Causal Networks [56.14597641617531]
We propose an alternative paradigm for answering causal-effect queries over discrete observable variables.
We learn the causal Bayesian network and its confounding latent variables directly from the observational data.
We show that this emphmodel completion learning approach can be more effective than estimand approaches.
arXiv Detail & Related papers (2024-08-26T08:39:09Z) - Demystifying amortized causal discovery with transformers [21.058343547918053]
Supervised learning approaches for causal discovery from observational data often achieve competitive performance.
In this work, we investigate CSIvA, a transformer-based model promising to train on synthetic data and transfer to real data.
We bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations.
arXiv Detail & Related papers (2024-05-27T08:17:49Z) - Conditional Generative Models are Sufficient to Sample from Any Causal Effect Estimand [9.460857822923842]
Causal inference from observational data plays critical role in many applications in trustworthy machine learning.
We show how to sample from any identifiable interventional distribution given an arbitrary causal graph.
We also generate high-dimensional interventional samples from the MIMIC-CXR dataset involving text and image variables.
arXiv Detail & Related papers (2024-02-12T05:48:31Z) - Discovering Mixtures of Structural Causal Models from Time Series Data [23.18511951330646]
We propose a general variational inference-based framework called MCD to infer the underlying causal models.
Our approach employs an end-to-end training process that maximizes an evidence-lower bound for the data likelihood.
We demonstrate that our method surpasses state-of-the-art benchmarks in causal discovery tasks.
arXiv Detail & Related papers (2023-10-10T05:13:10Z) - Approximating Counterfactual Bounds while Fusing Observational, Biased
and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Equivariance Allows Handling Multiple Nuisance Variables When Analyzing
Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Improving Efficiency and Accuracy of Causal Discovery Using a
Hierarchical Wrapper [7.570246812206772]
Causal discovery from observational data is an important tool in many branches of science.
In the large sample limit, sound and complete causal discovery algorithms have been previously introduced.
However, only finite training data is available, which limits the power of statistical tests used by these algorithms.
arXiv Detail & Related papers (2021-07-11T09:24:49Z) - Identification of Latent Variables From Graphical Model Residuals [0.0]
We present a novel method to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model.
We show that any improvement of prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit as compared to the confounded model.
arXiv Detail & Related papers (2021-01-07T02:28:49Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Amortized Causal Discovery: Learning to Infer Causal Graphs from
Time-Series Data [63.15776078733762]
We propose Amortized Causal Discovery, a novel framework to learn to infer causal relations from time-series data.
We demonstrate experimentally that this approach, implemented as a variational model, leads to significant improvements in causal discovery performance.
arXiv Detail & Related papers (2020-06-18T19:59:12Z) - Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions.
One solution for achieving strong generalization is to incorporate causal structures in the models.
We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.