Stable Differentiable Causal Discovery
- URL: http://arxiv.org/abs/2311.10263v2
- Date: Thu, 27 Jun 2024 15:11:45 GMT
- Title: Stable Differentiable Causal Discovery
- Authors: Achille Nazaret, Justin Hong, Elham Azizi, David Blei
- Abstract summary: We propose Stable Differentiable Causal Discovery (SDCD) for inferring causal relationships as directed acyclic graphs (DAGs).
We first derive SDCD and prove its stability and correctness. We then evaluate it with both observational and interventional data and on both small-scale and large-scale settings.
We find that SDCD outperforms existing methods in both convergence speed and accuracy and can scale to thousands of variables.
- Score: 2.0249250133493195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inferring causal relationships as directed acyclic graphs (DAGs) is an important but challenging problem. Differentiable Causal Discovery (DCD) is a promising approach to this problem, framing the search as a continuous optimization. But existing DCD methods are numerically unstable, with poor performance beyond tens of variables. In this paper, we propose Stable Differentiable Causal Discovery (SDCD), a new method that improves previous DCD methods in two ways: (1) It employs an alternative constraint for acyclicity; this constraint is more stable, both theoretically and empirically, and fast to compute. (2) It uses a training procedure tailored for sparse causal graphs, which are common in real-world scenarios. We first derive SDCD and prove its stability and correctness. We then evaluate it with both observational and interventional data and on both small-scale and large-scale settings. We find that SDCD outperforms existing methods in both convergence speed and accuracy and can scale to thousands of variables. We provide code at https://github.com/azizilab/sdcd.
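To make point (1) concrete, here is a minimal NumPy sketch contrasting the classic matrix-exponential acyclicity constraint of NOTEARS with a power-iteration estimate of the spectral radius of $W \circ W$, one candidate for a constraint that is bounded and cheap to compute. This is an illustration of the general idea only; SDCD's exact constraint and training procedure are in the linked repository.

```python
import numpy as np
from scipy.linalg import expm

def h_expm(W):
    """Classic NOTEARS acyclicity constraint: tr(exp(W * W)) - d.
    Zero iff the weighted graph is acyclic, but O(d^3) to evaluate and
    prone to numerical overflow as the entries of W grow."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

def h_spectral(W, iters=50, rng=None):
    """Spectral-radius surrogate: rho(W * W) is also zero iff the graph
    is acyclic (a nonnegative matrix is nilpotent iff its graph has no
    cycles). Power iteration estimates it in O(d^2) per step and stays
    bounded -- a sketch of the kind of stable, fast constraint the
    abstract describes, not the paper's exact formula."""
    rng = rng or np.random.default_rng(0)
    A = W * W
    u = rng.random(A.shape[0]) + 0.1
    for _ in range(iters):
        u = A @ u
        u /= np.linalg.norm(u) + 1e-12
    return u @ A @ u  # Rayleigh-quotient estimate of the dominant eigenvalue
```

Either constraint can be driven to zero with a penalty or augmented-Lagrangian schedule; the difference is how stably and cheaply it behaves at thousands of variables.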
Related papers
- Scalable Dual Coordinate Descent for Kernel Methods [0.0]
We develop scalable Dual Coordinate Descent (DCD) and Block Dual Coordinate Descent (BDCD) methods for kernel support vector machines (K-SVM) and kernel ridge regression (K-RR) problems.
We derive $s$-step variants of DCD and BDCD for solving the K-SVM and K-RR problems, respectively.
The new $s$-step variants achieved strong scaling speedups of up to $9.8\times$ over existing methods using up to $512$ cores.
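For reference, the base method being scaled: dual coordinate descent for kernel ridge regression solves $(K + \lambda I)\alpha = y$ with exact single-coordinate updates. The sketch below is the plain sequential method; the paper's $s$-step and block variants reorganize these updates to reduce communication, which this sketch does not attempt.

```python
import numpy as np

def dcd_krr(K, y, lam, epochs=50, rng=None):
    """Dual coordinate descent for kernel ridge regression: minimizes
    0.5 * a^T (K + lam*I) a - y^T a, whose optimum solves
    (K + lam*I) a = y. Each coordinate update is an exact 1-D minimization."""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    a = np.zeros(n)
    Ka = np.zeros(n)  # running value of K @ a
    for _ in range(epochs):
        for i in rng.permutation(n):
            delta = (y[i] - Ka[i] - lam * a[i]) / (K[i, i] + lam)
            a[i] += delta
            Ka += delta * K[:, i]  # K is symmetric, so this keeps Ka = K @ a
    return a  # predict on new points via k(x_new, X) @ a
```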
arXiv Detail & Related papers (2024-06-26T01:10:07Z)
- AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs [57.12929098407975]
We show that by efficiently parallelizing existing causal discovery methods, we can scale them to thousands of dimensions.
Specifically, we focus on the causal ordering subprocedure in DirectLiNGAM and implement GPU kernels to accelerate it.
This allows us to apply DirectLiNGAM to causal inference on large-scale gene expression data with genetic interventions yielding competitive results.
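For context, the subprocedure in question looks roughly like the loop below: repeatedly pick the most exogenous remaining variable and regress it out of the rest. The independence score here is a toy nonlinear-correlation proxy, not DirectLiNGAM's actual likelihood-ratio statistic; the paper's GPU kernels parallelize the pairwise score computations inside this loop.

```python
import numpy as np

def causal_order(X):
    """Simplified DirectLiNGAM-style ordering sketch. X has shape (n, d)."""
    X = X - X.mean(axis=0)  # work on a centered copy
    remaining = list(range(X.shape[1]))
    order = []
    while len(remaining) > 1:
        scores = []
        for j in remaining:
            s = 0.0
            for k in remaining:
                if k == j:
                    continue
                xj, xk = X[:, j], X[:, k]
                r = xk - (xj @ xk / (xj @ xj)) * xj  # residual of xk on xj
                # toy proxy: an exogenous xj should be independent of residuals
                s += np.corrcoef(np.tanh(xj), r)[0, 1] ** 2
            scores.append(s)
        j = remaining[int(np.argmin(scores))]  # most exogenous candidate
        order.append(j)
        remaining.remove(j)
        xj = X[:, j]
        for k in remaining:  # regress the chosen variable out of the rest
            X[:, k] = X[:, k] - (xj @ X[:, k] / (xj @ xj)) * xj
    return order + remaining
```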
arXiv Detail & Related papers (2024-03-06T15:06:11Z)
- Direct Diffusion Bridge using Data Consistency for Inverse Problems [65.04689839117692]
Diffusion model-based inverse problem solvers have shown impressive performance, but are limited in speed.
Several recent works have tried to alleviate this problem by building a diffusion process, directly bridging the clean and the corrupted.
We propose a modified inference procedure that imposes data consistency without the need for fine-tuning.
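As a sketch of what imposing data consistency can mean for a linear measurement model $y = Ax$: interleave each sampling step with a gradient step toward the measurements. The function below is a generic illustration under that linear assumption, not the paper's exact procedure.

```python
import numpy as np

def data_consistency_step(x, y, A, step=1.0):
    """One gradient step on 0.5 * ||A x - y||^2, pulling the current
    iterate x toward agreement with the measurements y."""
    return x - step * A.T @ (A @ x - y)
```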
arXiv Detail & Related papers (2023-05-31T12:51:10Z)
- Scaling up Stochastic Gradient Descent for Non-convex Optimisation [5.908471365011942]
We propose a novel approach to scaling up SGD on shared- and distributed-memory systems.
By combining asynchronous distribution and lock-free parallelism into a unified framework, DPSGD achieves a better trade-off between local computation and communication.
The potential gains of DPSGD are demonstrated on a machine learning problem (latent Dirichlet allocation inference) and on a deep reinforcement learning (DRL) problem (advantage actor-critic, A2C).
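A minimal sketch of the lock-free, shared-memory half of such a scheme (Hogwild-style updates, with a user-supplied grad_fn as a stand-in): workers update a shared weight vector without synchronization. DPSGD additionally distributes work asynchronously across nodes, which this sketch omits; note also that CPython's GIL makes this illustrative rather than a source of real speedup in pure Python.

```python
import numpy as np
from threading import Thread

def lockfree_sgd(grad_fn, w, shards, lr=0.01, epochs=5):
    """Lock-free parallel SGD sketch: each worker applies updates to the
    shared parameter vector w in place, races and all."""
    def worker(shard):
        for _ in range(epochs):
            for x, y in shard:
                w[:] -= lr * grad_fn(w, x, y)  # unsynchronized by design
    threads = [Thread(target=worker, args=(s,)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w
```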
arXiv Detail & Related papers (2022-10-06T13:06:08Z)
- Differentiable Invariant Causal Discovery [106.87950048845308]
Learning causal structure from observational data is a fundamental challenge in machine learning.
This paper proposes Differentiable Invariant Causal Discovery (DICD) to avoid learning spurious edges and wrong causal directions.
Extensive experiments on synthetic and real-world datasets verify that DICD outperforms state-of-the-art causal discovery methods up to 36% in SHD.
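Since the gains are reported in SHD, here is the metric itself, under one common convention (a reversed edge counts as a single error):

```python
import numpy as np

def shd(A_true, A_pred):
    """Structural Hamming Distance between two 0/1 adjacency matrices:
    missing edges + extra edges, counting a reversed edge once."""
    diff = np.abs(A_true - A_pred)
    reversals = np.logical_and(diff, diff.T)  # each reversal flags two cells
    return int(diff.sum() - np.triu(reversals).sum())
```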
arXiv Detail & Related papers (2022-05-31T09:29:07Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
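For reference, the generative model being estimated: in a linear-Gaussian SEM with weighted adjacency $W$ (where $W_{ij} \neq 0$ encodes an edge $i \to j$), acyclicity makes $I - W^\top$ invertible, so samples come in closed form.

```python
import numpy as np

def sample_linear_gaussian_sem(W, n, noise_std=1.0, rng=None):
    """Draw n samples of x = W^T x + eps, i.e. x = (I - W^T)^{-1} eps,
    which is well-defined because W describes a DAG."""
    rng = rng or np.random.default_rng(0)
    d = W.shape[0]
    eps = rng.normal(0.0, noise_std, size=(n, d))
    return eps @ np.linalg.inv(np.eye(d) - W)  # row-vector form of the solve
```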
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion [90.26652899910019]
Chamfer Distance (CD) and Earth Mover's Distance (EMD) are two broadly adopted metrics for measuring the similarity between two point sets.
We propose a new similarity measure named Density-aware Chamfer Distance (DCD).
We show that DCD pays attention to both the overall structure and local details and provides a more reliable evaluation even when CD and EMD contradict each other.
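For reference, the baseline CD that DCD refines; DCD replaces these raw nearest-neighbor distances with bounded, density-aware terms.

```python
import numpy as np

def chamfer_distance(P, Q):
    """Standard Chamfer Distance between point sets P (n, 3) and Q (m, 3):
    mean squared nearest-neighbor distance, averaged in both directions."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=-1)  # (n, m) pairwise
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```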
arXiv Detail & Related papers (2021-11-24T18:56:27Z)
- Contrastive Divergence Learning is a Time Reversal Adversarial Game [32.46369991490501]
Contrastive divergence (CD) learning is a classical method for fitting unnormalized statistical models to data samples.
We show that CD is an adversarial learning procedure, where a discriminator attempts to classify whether a Markov chain generated from the model has been time-reversed.
Our derivation settles well with previous observations, which have concluded that CD's update steps cannot be expressed as the gradients of any fixed objective function.
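For reference, one CD-1 update for a binary RBM (biases omitted for brevity), showing the short Markov chain whose time reversal the paper analyzes:

```python
import numpy as np

def cd1_update(W, v0, lr=0.1, rng=None):
    """One contrastive-divergence (CD-1) step for a binary RBM with
    weights W of shape (visible, hidden). The positive phase uses the
    data vector v0; the negative phase uses a single Gibbs step."""
    rng = rng or np.random.default_rng(0)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    ph0 = sigmoid(v0 @ W)                      # p(h = 1 | v0)
    h0 = (rng.random(ph0.shape) < ph0) * 1.0   # sample hidden units
    pv1 = sigmoid(h0 @ W.T)                    # one-step reconstruction
    ph1 = sigmoid(pv1 @ W)
    grad = np.outer(v0, ph0) - np.outer(pv1, ph1)
    return W + lr * grad
```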
arXiv Detail & Related papers (2020-12-06T15:54:05Z)
- An Analysis of the Adaptation Speed of Causal Models [80.77896315374747]
Recently, Bengio et al. conjectured that among all candidate models, the causal model $G$ is the fastest to adapt from one dataset to another.
We investigate the adaptation speed of cause-effect SCMs using convergence rates from optimization.
Surprisingly, we find situations where the anticausal model is advantaged, falsifying the initial hypothesis.
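A tiny closed-form illustration of the asymmetry behind the conjecture, with hypothetical numbers rather than the paper's setup: for a linear-Gaussian pair $x \to y$, shifting the cause's mean moves one parameter of the causal factorization $p(x)p(y \mid x)$ but two parameters of the anticausal factorization $p(y)p(x \mid y)$.

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0  # true causal effect in y = a * x + noise

def fit_both(mu, n=100_000):
    x = rng.normal(mu, 1.0, n)
    y = a * x + rng.normal(0.0, 1.0, n)
    causal = np.array([x.mean(), (x @ y) / (x @ x)])        # p(x), p(y|x)
    b = np.cov(y, x)[0, 1] / y.var()                        # slope of x on y
    anticausal = np.array([y.mean(), b, x.mean() - b * y.mean()])  # p(y), p(x|y)
    return causal, anticausal

c0, a0 = fit_both(mu=0.0)
c1, a1 = fit_both(mu=3.0)  # intervene on the cause's mean
print("causal params moved:    ", np.round(np.abs(c1 - c0), 2))   # ~[3, 0]
print("anticausal params moved:", np.round(np.abs(a1 - a0), 2))   # ~[6, 0, 0.6]
```

The paper makes such comparisons precise with convergence rates, and exhibits regimes where the advantage flips to the anticausal model.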
arXiv Detail & Related papers (2020-05-18T23:48:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.