Causal Discovery with Score Matching on Additive Models with Arbitrary Noise
- URL: http://arxiv.org/abs/2304.03265v1
- Date: Thu, 6 Apr 2023 17:50:46 GMT
- Title: Causal Discovery with Score Matching on Additive Models with Arbitrary Noise
- Authors: Francesco Montagna, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, Francesco Locatello
- Abstract summary: Causal discovery methods are intrinsically constrained by the set of assumptions needed to ensure structure identifiability.
In this paper we show the shortcomings of inference under the common Gaussian noise assumption, analyzing the risk of edge inversion when the noise terms are not Gaussian.
We propose a novel method for inferring the topological ordering of the variables in the causal graph, from data generated according to an additive non-linear model with a generic noise distribution.
This leads to NoGAM, a causal discovery algorithm with a minimal set of assumptions and state-of-the-art performance, experimentally benchmarked on synthetic data.
- Score: 37.13308785728276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal discovery methods are intrinsically constrained by the set of
assumptions needed to ensure structure identifiability. Moreover, additional
restrictions are often imposed in order to simplify the inference task: this is
the case for the Gaussian noise assumption on additive non-linear models, which
is common to many causal discovery approaches. In this paper we show the
shortcomings of inference under this hypothesis, analyzing the risk of edge
inversion under violation of Gaussianity of the noise terms. Then, we propose a
novel method for inferring the topological ordering of the variables in the
causal graph, from data generated according to an additive non-linear model
with a generic noise distribution. This leads to NoGAM (Not only Gaussian
Additive noise Models), a causal discovery algorithm with a minimal set of
assumptions and state-of-the-art performance, experimentally benchmarked on
synthetic data.
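As a concrete illustration of the ordering step, below is a minimal sketch of a NoGAM-style leaf-finding loop: for a leaf node, the corresponding entry of the score (the gradient of the log density) is a function of that node's own noise, which can in turn be estimated by the residual of regressing the node on all other variables; the leaf is therefore the variable whose score entry is best predicted from its own residual. The Gaussian-KDE score estimator, the kernel ridge regressors, and the toy data below are illustrative assumptions, not the paper's exact estimators.

```python
# Hedged sketch of a NoGAM-style topological-ordering loop.
# Assumptions (not from the paper): a Gaussian-KDE score estimator stands in
# for the paper's score-matching estimator, and kernel ridge regression stands
# in for its nonparametric regressors.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_predict

def kde_score(X, h=0.5):
    """Gradient of the log-density of a Gaussian KDE fit to X, at each row of X."""
    diffs = X[None, :, :] - X[:, None, :]          # diffs[i, j] = X[j] - X[i]
    logw = -(diffs ** 2).sum(-1) / (2.0 * h ** 2)  # log kernel weights
    w = np.exp(logw - logw.max(1, keepdims=True))
    w /= w.sum(1, keepdims=True)                   # softmax over kernel centers
    return (w[..., None] * diffs).sum(1) / h ** 2  # (n, d) score estimate

def topological_order(X):
    """Return a causal order (roots first) by repeatedly detaching a leaf."""
    remaining = list(range(X.shape[1]))
    reversed_order = []
    while len(remaining) > 1:
        Xr = X[:, remaining]
        S = kde_score(Xr)
        mse = []
        for k in range(Xr.shape[1]):
            others = np.delete(Xr, k, axis=1)
            # residual of regressing x_k on all remaining variables
            pred = cross_val_predict(KernelRidge(kernel="rbf"), others, Xr[:, k], cv=3)
            resid = (Xr[:, k] - pred).reshape(-1, 1)
            # for a leaf, the score entry is (approximately) a function of the
            # residual alone, so this regression should leave little error
            s_hat = cross_val_predict(KernelRidge(kernel="rbf"), resid, S[:, k], cv=3)
            mse.append(np.mean((S[:, k] - s_hat) ** 2))
        leaf = remaining[int(np.argmin(mse))]
        reversed_order.append(leaf)
        remaining.remove(leaf)
    reversed_order.append(remaining[0])
    return reversed_order[::-1]  # leaves were found first; reverse to roots-first

# toy chain x0 -> x1 -> x2 with uniform (non-Gaussian) noise
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, 500)
x1 = np.sin(2.0 * x0) + rng.uniform(-0.3, 0.3, 500)
x2 = x1 ** 2 + rng.uniform(-0.3, 0.3, 500)
print(topological_order(np.column_stack([x0, x1, x2])))  # expected: [0, 1, 2]
```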
Related papers
- A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery [47.36895591886043]
We investigate heteroscedastic symmetric noise models (HSNMs).
We introduce a novel criterion for identifying HSNMs based on the skewness of the score (i.e., the gradient of the log density) of the data distribution.
We propose SkewScore, an algorithm that handles heteroscedastic noise without requiring the extraction of external noise (a sketch of the skewness criterion appears after this list).
arXiv Detail & Related papers (2024-10-08T22:28:30Z)
- Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise [19.496063739638924]
We consider a problem of Bayesian inference for a structured spiked matrix model.
We show how to predict the statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer equations.
arXiv Detail & Related papers (2024-05-31T16:38:35Z)
- Robust Estimation of Causal Heteroscedastic Noise Models [7.568978862189266]
Student's $t$-distribution is known for its robustness in accounting for sampling variability with smaller sample sizes and extreme values without significantly altering the overall distribution shape.
Our empirical evaluations demonstrate that our estimators are more robust and achieve better overall performance across synthetic and real benchmarks.
arXiv Detail & Related papers (2023-12-15T02:26:35Z)
- MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models [78.72682320019737]
We develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.
MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization framework.
We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.
arXiv Detail & Related papers (2022-05-27T09:59:46Z)
- Score matching enables causal discovery of nonlinear additive noise models [63.93669924730725]
We show how to design a new generation of scalable causal discovery methods.
We propose a new efficient method for approximating the score's Jacobian, enabling recovery of the causal graph.
arXiv Detail & Related papers (2022-03-08T21:34:46Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the usual assumption that the noise should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution differs from the data distribution and may even belong to a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Sequential Learning of the Topological Ordering for the Linear Non-Gaussian Acyclic Model with Parametric Noise [6.866717993664787]
We develop a novel sequential approach to estimate the causal ordering of a DAG.
We provide extensive numerical evidence to demonstrate that our procedure is scalable to cases with possibly thousands of nodes.
arXiv Detail & Related papers (2022-02-03T18:15:48Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- On the Role of Entropy-based Loss for Learning Causal Structures with Continuous Optimization [27.613220411996025]
A method with a non-combinatorial directed acyclicity constraint, called NOTEARS, formulates the causal structure learning problem as a continuous optimization problem using a least-squares loss.
We show that the violation of the Gaussian noise assumption will hinder the causal direction identification.
We propose a more general entropy-based loss that is theoretically consistent with the likelihood score under any noise distribution.
arXiv Detail & Related papers (2021-06-05T08:29:51Z)
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
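For the skewness-based criterion in the SkewScore entry above, here is a hedged bivariate sketch: under a heteroscedastic symmetric noise model, the score entry of the effect variable is an odd function of the (symmetric) noise given the cause, so its distribution is symmetric and its skewness is close to zero, while the cause's score entry is generally skewed. The Gaussian-KDE score estimator and the argmin-|skewness| decision rule below are illustrative assumptions, not the paper's exact algorithm.

```python
# Hedged sketch of a skewness-of-the-score direction test for a bivariate
# heteroscedastic symmetric noise model (HSNM); the estimator and decision
# rule are illustrative assumptions.
import numpy as np
from scipy.stats import skew

def kde_score(X, h=0.5):
    """Gradient of the log-density of a Gaussian KDE fit to X, at each row of X."""
    diffs = X[None, :, :] - X[:, None, :]          # diffs[i, j] = X[j] - X[i]
    logw = -(diffs ** 2).sum(-1) / (2.0 * h ** 2)
    w = np.exp(logw - logw.max(1, keepdims=True))
    w /= w.sum(1, keepdims=True)                   # softmax over kernel centers
    return (w[..., None] * diffs).sum(1) / h ** 2

def direction(x, y):
    """Guess 'x->y' or 'y->x': the effect's score entry should have ~zero skewness."""
    S = kde_score(np.column_stack([x, y]))
    return "x->y" if abs(skew(S[:, 1])) < abs(skew(S[:, 0])) else "y->x"

# toy HSNM: y = x + (1 + x**2) * eps with symmetric (Laplace) noise
rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = x + (1.0 + x ** 2) * rng.laplace(0.0, 0.3, 1000)
print(direction(x, y))  # expected: 'x->y'
```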
This list is automatically generated from the titles and abstracts of the papers listed on this site.