On the Identifiability and Estimation of Causal Location-Scale Noise
Models
- URL: http://arxiv.org/abs/2210.09054v2
- Date: Thu, 1 Jun 2023 08:50:54 GMT
- Title: On the Identifiability and Estimation of Causal Location-Scale Noise
Models
- Authors: Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard
Schölkopf, Peter Bühlmann, Alexander Marx
- Abstract summary: We study the class of location-scale or heteroscedastic noise models (LSNMs).
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
- Score: 122.65417012597754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the class of location-scale or heteroscedastic noise models (LSNMs),
in which the effect $Y$ can be written as a function of the cause $X$ and a
noise source $N$ independent of $X$, which may be scaled by a positive function
$g$ over the cause, i.e., $Y = f(X) + g(X)N$. Despite the generality of the
model class, we show the causal direction is identifiable up to some
pathological cases. To empirically validate these theoretical findings, we
propose two estimators for LSNMs: an estimator based on (non-linear) feature
maps, and one based on neural networks. Both model the conditional distribution
of $Y$ given $X$ as a Gaussian parameterized by its natural parameters. When
the feature maps are correctly specified, we prove that our estimator is
jointly concave, and a consistent estimator for the cause-effect identification
task. Although the neural network does not inherit those guarantees, it can
fit functions of arbitrary complexity, and reaches state-of-the-art performance
across benchmarks.
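To make the model and the parameterization in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' estimator: nothing is fitted. It only shows how to sample from $Y = f(X) + g(X)N$ and how to evaluate a Gaussian conditional through its natural parameters $\eta_1 = \mu/\sigma^2$ and $\eta_2 = -1/(2\sigma^2)$. The functions `f` and `g`, the uniform distribution of $X$, and the standard Gaussian noise are assumptions chosen for illustration.

```python
import math
import random


def to_natural(mu, sigma2):
    """Map mean and variance to Gaussian natural parameters (eta1, eta2)."""
    return mu / sigma2, -1.0 / (2.0 * sigma2)


def from_natural(eta1, eta2):
    """Invert to_natural; eta2 must be negative for a valid variance."""
    sigma2 = -1.0 / (2.0 * eta2)
    return eta1 * sigma2, sigma2


def log_density_natural(y, eta1, eta2):
    """Gaussian log-density written in exponential-family form."""
    log_partition = -eta1 ** 2 / (4.0 * eta2) - 0.5 * math.log(-2.0 * eta2)
    return eta1 * y + eta2 * y ** 2 - log_partition - 0.5 * math.log(2.0 * math.pi)


def log_density_standard(y, mu, sigma2):
    """Gaussian log-density in the usual mean/variance form, for comparison."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (y - mu) ** 2 / (2.0 * sigma2)


# Hypothetical location and scale functions (assumptions, not from the paper).
def f(x):
    return math.sin(x)


def g(x):
    return 0.5 + x * x  # strictly positive, as the model requires


def sample_lsnm(n, seed=0):
    """Draw n pairs (x, y) with y = f(x) + g(x) * noise, noise ~ N(0, 1)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        noise = rng.gauss(0.0, 1.0)  # drawn independently of x
        pairs.append((x, f(x) + g(x) * noise))
    return pairs


def mean_loglik(pairs):
    """Average conditional log-likelihood of y given x under the true f and g."""
    total = 0.0
    for x, y in pairs:
        eta1, eta2 = to_natural(f(x), g(x) ** 2)
        total += log_density_natural(y, eta1, eta2)
    return total / len(pairs)


if __name__ == "__main__":
    data = sample_lsnm(1000, seed=0)
    print("mean conditional log-likelihood:", mean_loglik(data))
```

In this form the log-density is linear in $(\eta_1, \eta_2)$ apart from the log-partition term, which is convex, so the log-likelihood is concave in the natural parameters. This is the kind of structure the joint concavity claim above refers to, although the sketch does not reproduce the paper's feature-map construction.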
Related papers
- Causal Discovery from Poisson Branching Structural Causal Model Using High-Order Cumulant with Path Analysis [24.826219353338132]
One of the most common characteristics of count data is the inherent branching structure described by a binomial thinning operator.
A single causal pair is Markov equivalent, i.e., $X \rightarrow Y$ and $Y \rightarrow X$ are distributed equivalent.
We propose a Poisson Branching Structure Causal Model (PB-SCM) and perform a path analysis on PB-SCM using high-order cumulants.
arXiv Detail & Related papers (2024-03-25T08:06:08Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Random features and polynomial rules [0.0]
We present a generalization of the performance of random features models for generic supervised learning problems with Gaussian data.
We find good agreement far from the limits where $D \to \infty$ and at least one between $P/D^K$, $N/D^L$ remains finite.
arXiv Detail & Related papers (2024-02-15T18:09:41Z)
- Cause-Effect Inference in Location-Scale Noise Models: Maximum Likelihood vs. Independence Testing [19.23479356810746]
A fundamental problem of causal discovery is cause-effect inference, learning the correct causal direction between two random variables.
Recently introduced heteroscedastic location-scale noise functional models (LSNMs) combine expressive power with identifiability guarantees.
We show that LSNM model selection based on maximizing likelihood achieves state-of-the-art accuracy, when the noise distributions are correctly specified.
arXiv Detail & Related papers (2023-01-26T20:48:32Z)
- Asymptotic Statistical Analysis of $f$-divergence GAN [13.587087960403199]
Generative Adversarial Networks (GANs) have achieved great success in data generation.
We consider the statistical behavior of the general $f$-divergence formulation of GAN.
The resulting estimation method is referred to as Adversarial Gradient Estimation (AGE).
arXiv Detail & Related papers (2022-09-14T18:08:37Z)
- CARD: Classification and Regression Diffusion Models [51.0421331214229]
We introduce classification and regression diffusion (CARD) models, which combine a conditional generative model and a pre-trained conditional mean estimator.
We demonstrate the outstanding ability of CARD in conditional distribution prediction with both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-06-15T03:30:38Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- Estimation in Tensor Ising Models [5.161531917413708]
We consider the problem of estimating the natural parameter of the $p$-tensor Ising model given a single sample from the distribution on $N$ nodes.
In particular, we show the $\sqrt{N}$-consistency of the MPL estimate in the $p$-spin Sherrington-Kirkpatrick (SK) model.
We derive the precise fluctuations of the MPL estimate in the special case of the $p$-tensor Curie-Weiss model.
arXiv Detail & Related papers (2020-08-29T00:06:58Z)
- The Generalized Lasso with Nonlinear Observations and Generative Priors [63.541900026673055]
We make the assumption of sub-Gaussian measurements, which is satisfied by a wide range of measurement models.
We show that our result can be extended to the uniform recovery guarantee under the assumption of a so-called local embedding property.
arXiv Detail & Related papers (2020-06-22T16:43:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.