Learning Linear Causal Representations from Interventions under General
Nonlinear Mixing
- URL: http://arxiv.org/abs/2306.02235v2
- Date: Mon, 18 Dec 2023 15:19:43 GMT
- Title: Learning Linear Causal Representations from Interventions under General
Nonlinear Mixing
- Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam,
Bernhard Sch\"olkopf, Pradeep Ravikumar
- Abstract summary: We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
- Score: 52.66151568785088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning causal representations from unknown, latent
interventions in a general setting, where the latent distribution is Gaussian
but the mixing function is completely general. We prove strong identifiability
results given unknown single-node interventions, i.e., without having access to
the intervention targets. This generalizes prior works which have focused on
weaker classes, such as linear maps or paired counterfactual data. This is also
the first instance of causal identifiability from non-paired interventions for
deep neural network embeddings. Our proof relies on carefully uncovering the
high-dimensional geometric structure present in the data distribution after a
non-linear density transformation, which we capture by analyzing quadratic
forms of precision matrices of the latent distributions. Finally, we propose a
contrastive algorithm to identify the latent variables in practice and evaluate
its performance on various tasks.
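The role played by the precision matrices can be made concrete with a standard change-of-variables step: the observational and interventional environments share the same mixing function, so the unknown Jacobian term cancels when log-densities are compared. In our own notation (a sketch of the idea, not the paper's exact statement), for an invertible mixing $x = f(z)$ with Gaussian latents of precision $\Theta_0$ (observational) and $\Theta_i$ (after intervening on node $i$),
$$
\log p_X^{(i)}(x) - \log p_X^{(0)}(x)
  = \log p_Z^{(i)}(z) - \log p_Z^{(0)}(z)
  = \tfrac{1}{2}\, z^\top (\Theta_0 - \Theta_i)\, z + b_i^\top z + c_i,
\qquad z = f^{-1}(x),
$$
so log-density ratios of the observed data are quadratic forms in the latents. This is the high-dimensional geometric structure the proof analyzes.

The data-generating process is also easy to simulate. Below is a minimal numpy sketch, assuming a linear Gaussian SCM over the latents, perfect single-node interventions that sever a node from its parents and shift its noise, and a random leaky-ReLU MLP as the general nonlinear mixing; all names and modeling choices are ours, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 10_000                      # latent dimension, samples per environment

# Strictly lower-triangular SCM weights: z = B z + eps, so z = (I - B)^{-1} eps
B = np.tril(rng.normal(size=(d, d)), k=-1)

def sample_latents(n, target=None):
    # target=None is the observational environment; otherwise a perfect
    # single-node intervention: cut the node's parents and shift its noise.
    Bt, eps = B.copy(), rng.normal(size=(n, d))
    if target is not None:
        Bt[target, :] = 0.0
        eps[:, target] += 1.0
    return eps @ np.linalg.inv(np.eye(d) - Bt).T

# General nonlinear mixing: a random leaky-ReLU MLP into observation space
D = 10
W1, W2 = rng.normal(size=(d, D)), rng.normal(size=(D, D))
def mix(z):
    h = z @ W1
    return np.where(h > 0, h, 0.1 * h) @ W2

x_obs = mix(sample_latents(n))                                # observational data
x_int = [mix(sample_latents(n, target=i)) for i in range(d)]  # unknown-target interventions

The contrastive algorithm (as we read the abstract) trains a discriminator between such environments; by the identity above, its optimal logits are quadratic in the true latents, which is what makes the latents recoverable in practice.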
Related papers
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- Learning nonparametric latent causal graphs with unknown interventions [18.6470340274888]
We establish conditions under which latent causal graphs are nonparametrically identifiable.
We do not assume the number of hidden variables is known, and we show that at most one unknown intervention per hidden variable is needed.
arXiv Detail & Related papers (2023-06-05T14:06:35Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension [25.711297863946193]
We develop a theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features.
We provide a complete description of the joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit.
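As a toy instance of this setting, one can train an ensemble of ridge regressions on overlapping (hence correlated) random feature subsets of the same data and inspect the fluctuations of their predictions; the paper's asymptotic theory characterizes exactly this kind of joint behavior. A rough numpy sketch (our illustration, not the paper's code):

import numpy as np

rng = np.random.default_rng(1)
n, p, K, lam = 400, 200, 20, 0.1      # samples, features, ensemble size, ridge
X = rng.normal(size=(n, p)) / np.sqrt(p)
w_star = rng.normal(size=p)
y = X @ w_star + 0.5 * rng.normal(size=n)

def ridge(Xs, y):
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(Xs.shape[1]), Xs.T @ y)

# Each learner sees a random feature subset, so learners are correlated
masks = [rng.random(p) < 0.7 for _ in range(K)]
X_test = rng.normal(size=(1000, p)) / np.sqrt(p)
preds = np.stack([X_test[:, m] @ ridge(X[:, m], y) for m in masks])

print("fluctuation across the ensemble:", preds.var(axis=0).mean())
print("test MSE of the ensemble mean:  ",
      np.mean((preds.mean(axis=0) - X_test @ w_star) ** 2))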
arXiv Detail & Related papers (2022-01-31T17:44:58Z)
- Nonlinear Invariant Risk Minimization: A Causal Approach [5.63479133344366]
We propose a learning paradigm that enables out-of-distribution generalization in the nonlinear setting.
We show identifiability of the data representation up to very simple transformations.
Extensive experiments on both synthetic and real-world datasets show that our approach significantly outperforms a variety of baseline methods.
arXiv Detail & Related papers (2021-02-24T15:38:41Z)
- The Hidden Uncertainty in a Neural Network's Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
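A standard concrete instance of this idea, and a reasonable mental model here (though not necessarily this paper's exact method), is to fit a Gaussian to in-distribution penultimate-layer features and score new inputs by squared Mahalanobis distance:

import numpy as np

def fit_feature_gaussian(feats):
    # Fit a Gaussian to in-distribution features, e.g. penultimate activations.
    mu = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    return mu, np.linalg.inv(cov)

def ood_score(f, mu, prec):
    # Squared Mahalanobis distance: larger = more out-of-distribution.
    diff = f - mu
    return float(diff @ prec @ diff)

# feats = penultimate(x_train)   # `penultimate` is a hypothetical feature extractor
# mu, prec = fit_feature_gaussian(feats)
# score = ood_score(penultimate(x_new)[0], mu, prec)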
arXiv Detail & Related papers (2020-12-05T17:30:35Z)
- General stochastic separation theorems with optimal bounds [68.8204255655161]
The phenomenon of separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and to analyze AI instabilities.
Errors or clusters of errors can be separated from the rest of the data.
The ability to correct an AI system also opens up the possibility of an attack on it, and the high dimensionality induces vulnerabilities caused by the same separability.
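The separability phenomenon is easy to observe numerically: in high dimension, a single random point in the unit ball is, with probability close to one, cut off from a large random sample by one linear inequality. A toy check of the classical threshold test (not the paper's optimal bounds):

import numpy as np

rng = np.random.default_rng(2)

def sample_ball(n, d):
    g = rng.normal(size=(n, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    return g * (rng.random(n) ** (1.0 / d))[:, None]   # uniform in the unit ball

eps = 0.1
for d in (2, 20, 200):
    Y, x = sample_ball(10_000, d), sample_ball(1, d)[0]
    # x is separated from y by the hyperplane <x, z> = (1 - eps) * ||x||^2
    frac = np.mean(Y @ x < (1 - eps) * (x @ x))
    print(f"d={d:3d}: fraction separated by one functional: {frac:.4f}")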
arXiv Detail & Related papers (2020-10-11T13:12:41Z)
- Generalization Error for Linear Regression under Distributed Learning [0.0]
We consider the setting where the unknowns are distributed over a network of nodes.
We present an analytical characterization of the dependence of the generalization error on the partitioning of the unknowns over nodes.
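Concretely, each node here estimates its own block of the unknown parameter vector from the shared observations, so the choice of partitioning directly shapes the error. A toy numpy simulation of that dependence (our illustration of the setup, not the paper's analytical result):

import numpy as np

rng = np.random.default_rng(3)
n, p = 300, 60
X = rng.normal(size=(n, p))
w_star = rng.normal(size=p)
y = X @ w_star + 0.1 * rng.normal(size=n)

def distributed_ls(X, y, n_nodes):
    # Each node solves least squares for its own block of coefficients only.
    w_hat = np.zeros(X.shape[1])
    for block in np.array_split(np.arange(X.shape[1]), n_nodes):
        w_hat[block] = np.linalg.lstsq(X[:, block], y, rcond=None)[0]
    return w_hat

X_test = rng.normal(size=(2000, p))
for k in (1, 2, 6, 20):
    err = np.mean((X_test @ (distributed_ls(X, y, k) - w_star)) ** 2)
    print(f"{k:2d} nodes -> generalization error {err:.3f}")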
arXiv Detail & Related papers (2020-04-30T08:49:46Z)
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
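A minimal instance of this problem class is a Bernoulli bipartite model whose natural parameters are the inner products u_i^T v_j, fitted by gradient descent on the logistic loss; a toy numpy sketch (the paper's objective and guarantees are more general):

import numpy as np

rng = np.random.default_rng(4)
n1, n2, r = 50, 40, 3                     # bipartite sides, latent rank

# Ground truth: edge (i, j) is Bernoulli with natural parameter <u_i, v_j>
U0 = rng.normal(size=(n1, r)) / np.sqrt(r)
V0 = rng.normal(size=(n2, r)) / np.sqrt(r)
A = (rng.random((n1, n2)) < 1.0 / (1.0 + np.exp(-U0 @ V0.T))).astype(float)

def loss(U, V):
    # Mean logistic negative log-likelihood over all pairs
    M = U @ V.T
    return np.mean(np.logaddexp(0.0, M) - A * M)

U, V = rng.normal(size=(n1, r)) * 0.1, rng.normal(size=(n2, r)) * 0.1
lr = 0.2
for _ in range(2000):
    G = 1.0 / (1.0 + np.exp(-(U @ V.T))) - A   # (unnormalized) gradient w.r.t. the logits
    U, V = U - lr * (G @ V) / n2, V - lr * (G.T @ U) / n1

print("fitted loss:", loss(U, V), " ground-truth loss:", loss(U0, V0))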
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.