Cause-Effect Inference in Location-Scale Noise Models: Maximum
Likelihood vs. Independence Testing
- URL: http://arxiv.org/abs/2301.12930v3
- Date: Thu, 26 Oct 2023 00:36:22 GMT
- Title: Cause-Effect Inference in Location-Scale Noise Models: Maximum
Likelihood vs. Independence Testing
- Authors: Xiangyu Sun, Oliver Schulte
- Abstract summary: A fundamental problem of causal discovery is cause-effect inference, learning the correct causal direction between two random variables.
Recently introduced heteroscedastic location-scale noise functional models (LSNMs) combine expressive power with identifiability guarantees.
- We show that LSNM model selection based on maximizing likelihood achieves state-of-the-art accuracy when the noise distributions are correctly specified.
- Score: 19.23479356810746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A fundamental problem of causal discovery is cause-effect inference, learning
the correct causal direction between two random variables. Significant progress
has been made through modelling the effect as a function of its cause and a
noise term, which allows us to leverage assumptions about the generating
function class. The recently introduced heteroscedastic location-scale noise
functional models (LSNMs) combine expressive power with identifiability
guarantees. LSNM model selection based on maximizing likelihood achieves
state-of-the-art accuracy, when the noise distributions are correctly
specified. However, through an extensive empirical evaluation, we demonstrate
that the accuracy deteriorates sharply when the form of the noise distribution
is misspecified by the user. Our analysis shows that the failure occurs mainly
when the conditional variance in the anti-causal direction is smaller than that
in the causal direction. As an alternative, we find that causal model selection
through residual independence testing is much more robust to noise
misspecification and misleading conditional variance.
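The two model-selection strategies compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the polynomial fits for the location and log-scale functions, the degree, and the correlation-based stand-in for an HSIC residual-independence test are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=n)
# LSNM ground truth: Y = f(X) + g(X) * N with Gaussian noise N
Y = np.tanh(2 * X) + (0.2 + 0.5 / (1 + np.exp(-X))) * rng.normal(size=n)

def fit_lsnm(cause, effect, deg=5):
    """Fit effect = f(cause) + g(cause)*N with a polynomial f and a
    log-polynomial g; return the Gaussian conditional log-likelihood
    and the standardized residuals."""
    P = np.vander(cause, deg + 1)
    coef, *_ = np.linalg.lstsq(P, effect, rcond=None)
    resid = effect - P @ coef
    # fit the log-scale; +0.6352 corrects E[log|N|] for standard normal N
    lcoef, *_ = np.linalg.lstsq(P, np.log(np.abs(resid) + 1e-8), rcond=None)
    scale = np.exp(P @ lcoef + 0.6352)
    z = resid / scale
    ll = np.sum(-0.5 * z ** 2 - np.log(scale)) - 0.5 * n * np.log(2 * np.pi)
    return ll, z

def marg_ll(v):
    # Gaussian log-likelihood of the putative cause's marginal
    s = v.std()
    return np.sum(-0.5 * ((v - v.mean()) / s) ** 2) - n * (np.log(s) + 0.5 * np.log(2 * np.pi))

def direction_by_likelihood(X, Y):
    # compare log p(X) + log p(Y|X) against log p(Y) + log p(X|Y)
    ll_xy = marg_ll(X) + fit_lsnm(X, Y)[0]
    ll_yx = marg_ll(Y) + fit_lsnm(Y, X)[0]
    return "X->Y" if ll_xy > ll_yx else "Y->X"

def direction_by_independence(X, Y):
    # crude proxy for an HSIC residual-independence test: correlation of
    # the standardized residual (and its square) with the putative cause
    def dep(cause, effect):
        _, z = fit_lsnm(cause, effect)
        return abs(np.corrcoef(cause, z)[0, 1]) + abs(np.corrcoef(cause, z ** 2)[0, 1])
    return "X->Y" if dep(X, Y) < dep(Y, X) else "Y->X"

print(direction_by_likelihood(X, Y), direction_by_independence(X, Y))
```

The likelihood route depends on the Gaussian noise assumption built into `fit_lsnm`; the independence route only asks which direction yields residuals less dependent on the input, which is the robustness property the abstract argues for.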
Related papers
- Robust Estimation of Causal Heteroscedastic Noise Models [7.568978862189266]
Student's $t$-distribution is known for its robustness in accounting for sampling variability with smaller sample sizes and extreme values without significantly altering the overall distribution shape.
Our empirical evaluations demonstrate that our estimators are more robust and achieve better overall performance across synthetic and real benchmarks.
arXiv Detail & Related papers (2023-12-15T02:26:35Z)
- Distinguishing Cause from Effect on Categorical Data: The Uniform Channel Model [0.0]
Distinguishing cause from effect using observations of a pair of random variables is a core problem in causal discovery.
We propose a criterion to address the cause-effect problem with categorical variables.
We select as the most likely causal direction the one in which the conditional probability mass function is closer to a uniform channel (UC).
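The idea behind the UC criterion can be sketched as follows. In a uniform channel every row of P(effect|cause) is a permutation of one fixed vector, so after sorting each row the rows should coincide; the sorted-row spread used below as a distance is an illustrative assumption, not the paper's exact measure.

```python
import numpy as np

def uc_distance(cause, effect, n_cats):
    # empirical conditional PMF P(effect | cause), one row per cause value
    M = np.zeros((n_cats, n_cats))
    for c, e in zip(cause, effect):
        M[c, e] += 1
    M /= np.maximum(M.sum(axis=1, keepdims=True), 1)
    # a uniform channel has rows that are permutations of one vector,
    # so the sorted rows should all coincide; measure their spread
    S = np.sort(M, axis=1)
    return np.abs(S - S.mean(axis=0)).sum()

def uc_direction(X, Y, n_cats):
    # prefer the direction whose conditional PMF is closer to a UC
    return "X->Y" if uc_distance(X, Y, n_cats) < uc_distance(Y, X, n_cats) else "Y->X"

# toy data: Y = (X + E) mod 4 is an exact uniform channel from X to Y,
# while the backward conditional P(X|Y) is not (X is non-uniform)
rng = np.random.default_rng(1)
n = 20000
X = rng.choice(4, size=n, p=[0.5, 0.3, 0.15, 0.05])
E = rng.choice(4, size=n, p=[0.8, 0.1, 0.05, 0.05])
Y = (X + E) % 4
print(uc_direction(X, Y, 4))
```

Because the additive noise E has a fixed distribution regardless of X, each row of P(Y|X) is a cyclic shift of the same vector, which is exactly the UC structure the criterion looks for.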
arXiv Detail & Related papers (2023-03-14T13:54:11Z)
- On the Identifiability and Estimation of Causal Location-Scale Noise Models [122.65417012597754]
We study the class of location-scale or heteroscedastic noise models (LSNMs)
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
arXiv Detail & Related papers (2022-10-13T17:18:59Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation [50.85788484752612]
Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models.
It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance.
In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used.
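The mechanics that make the noise distribution matter for NCE can be seen in a toy 1-D example (an illustrative setup, not this paper's experiments): an unnormalized Gaussian model is learned by logistic discrimination of data samples from noise samples, with the noise density q known; here q = N(0, 4) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(1.5, 1.0, size=5000)    # samples from the unknown data density
noise = rng.normal(0.0, 2.0, size=5000)   # samples from the known noise density q = N(0, 4)

def log_q(x):
    # log-density of the noise distribution N(0, 4)
    return -0.5 * (x / 2.0) ** 2 - np.log(2.0) - 0.5 * np.log(2 * np.pi)

def G(x, mu, c):
    # log-ratio of the unnormalized model exp(-(x - mu)^2 / 2 + c) to the noise
    return -0.5 * (x - mu) ** 2 + c - log_q(x)

def nce_fit(data, noise, steps=2000, lr=0.05):
    # maximize the logistic objective E_data[log sigma(G)] + E_noise[log sigma(-G)];
    # the offset c self-normalizes the model, so no partition function is needed
    mu, c = 0.0, 0.0
    for _ in range(steps):
        sd = 1.0 / (1.0 + np.exp(G(data, mu, c)))     # 1 - sigma(G) on data
        sn = 1.0 / (1.0 + np.exp(-G(noise, mu, c)))   # sigma(G) on noise
        mu -= lr * (-(sd * (data - mu)).mean() + (sn * (noise - mu)).mean())
        c -= lr * (-sd.mean() + sn.mean())
    return mu, c

mu_hat, c_hat = nce_fit(data, noise)
print(mu_hat, c_hat)  # mu_hat near 1.5, c_hat near -log(sqrt(2*pi))
```

If q is poorly matched to the data (e.g. far narrower or shifted), the discriminator sees almost no overlap between the two sample sets, gradients vanish, and the estimate degrades, which is the failure mode the paper analyzes.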
arXiv Detail & Related papers (2021-10-21T16:57:45Z)
- Causal Identification with Additive Noise Models: Quantifying the Effect of Noise [5.037636944933989]
This work investigates the impact of different noise levels on the ability of Additive Noise Models to identify the direction of the causal relationship.
We use an exhaustive range of models where the level of additive noise gradually changes from 1% to 10000% of the cause's noise level.
The results of the experiments show that ANMs methods can fail to capture the true causal direction for some levels of noise.
arXiv Detail & Related papers (2021-10-15T13:28:33Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- The Effect of Noise Level on Causal Identification with Additive Noise Models [0.0]
We consider the impact of different noise levels on the ability of Additive Noise Models to identify the direction of the causal relationship.
Two specific methods have been selected: Regression with Subsequent Independence Test and Identification using Conditional Variances.
The results of the experiments show that these methods can fail to capture the true causal direction for some levels of noise.
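The first of the two methods, regression with a subsequent independence test, can be sketched for an additive noise model. The polynomial regression and the correlation-based dependence score below are illustrative stand-ins (a real implementation would use a nonparametric regressor and a proper independence test such as HSIC).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3000
X = rng.uniform(-2, 2, size=n)
Y = X ** 3 + rng.normal(size=n)   # additive-noise model, true direction X -> Y

def resit_score(cause, effect, deg=5):
    # regress effect on cause, then score residual/cause dependence;
    # correlations of the residual and |residual| with the cause and its
    # square stand in for a full independence test
    P = np.vander(cause, deg + 1)
    coef, *_ = np.linalg.lstsq(P, effect, rcond=None)
    r = effect - P @ coef
    a = np.abs(r)
    pairs = [(cause, r), (cause ** 2, r), (cause, a), (cause ** 2, a)]
    return sum(abs(np.corrcoef(u, v)[0, 1]) for u, v in pairs)

forward, backward = resit_score(X, Y), resit_score(Y, X)
print("X->Y" if forward < backward else "Y->X")
```

In the causal direction the residual is just the independent noise, so the score stays near zero; in the anti-causal direction the residual's magnitude varies systematically with the input, inflating the score.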
arXiv Detail & Related papers (2021-08-24T11:18:41Z)
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.