A Distance Covariance-based Kernel for Nonlinear Causal Clustering in
Heterogeneous Populations
- URL: http://arxiv.org/abs/2106.03480v1
- Date: Mon, 7 Jun 2021 10:16:34 GMT
- Authors: Alex Markham and Moritz Grosse-Wentrup
- Abstract summary: We introduce a distance covariance-based kernel designed specifically to measure the similarity between the underlying nonlinear causal structures of different samples.
This kernel enables us to perform clustering to identify the homogeneous subpopulations.
We demonstrate using our kernel for causal clustering with an application in genetics, allowing us to reason about the latent transcription factor networks regulating measured gene expression levels.
- Score: 1.2763567932588586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of causal structure learning in the setting of
heterogeneous populations, i.e., populations in which a single causal structure
does not adequately represent all population members, as is common in
biological and social sciences. To this end, we introduce a distance
covariance-based kernel designed specifically to measure the similarity between
the underlying nonlinear causal structures of different samples. This kernel
enables us to perform clustering to identify the homogeneous subpopulations.
Indeed, we prove the corresponding feature map is a statistically consistent
estimator of nonlinear independence structure, rendering the kernel itself a
statistical test for the hypothesis that sets of samples come from different
generating causal structures. We can then use existing methods to learn a
causal structure for each of these subpopulations. We demonstrate using our
kernel for causal clustering with an application in genetics, allowing us to
reason about the latent transcription factor networks regulating measured gene
expression levels.
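As a rough illustration of the quantity the abstract builds on, the sketch below computes the standard biased sample estimate of distance covariance and distance correlation with NumPy. In the population, this statistic is zero exactly when two variables with finite first moments are independent, which is why it can detect nonlinear dependence. The helpers `dependence_signature` and `signature_similarity` are hypothetical names introduced here for illustration only: they compare groups of samples by their pairwise dependence patterns and are not the kernel or feature map defined in the paper.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences of a 1-D sample."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    d = np.abs(x - x.T)
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def distance_covariance_sq(x, y):
    """Biased (V-statistic) estimate of the squared distance covariance."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    return float((a * b).mean())


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; 0.0 if either sample is constant."""
    dcov_sq = distance_covariance_sq(x, y)
    dvar_sq = distance_covariance_sq(x, x) * distance_covariance_sq(y, y)
    if dvar_sq <= 0.0:
        return 0.0
    return float(np.sqrt(max(dcov_sq, 0.0) / np.sqrt(dvar_sq)))


def dependence_signature(data):
    """Hypothetical helper: matrix of pairwise distance correlations
    between the columns of an (n_samples, n_variables) array."""
    data = np.asarray(data, dtype=float)
    p = data.shape[1]
    sig = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            sig[i, j] = sig[j, i] = distance_correlation(data[:, i], data[:, j])
    return sig


def signature_similarity(data_a, data_b):
    """Hypothetical helper: Frobenius inner product of two dependence signatures."""
    return float(np.sum(dependence_signature(data_a) * dependence_signature(data_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=400)
    print("dCor(x, x^2):  ", distance_correlation(x, x ** 2))
    print("dCor(x, noise):", distance_correlation(x, rng.normal(size=400)))
```

A similarity of this kind could be passed to any kernel-based clustering routine, but that choice is an assumption of this sketch rather than something the abstract specifies.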
Related papers
- Causal Discovery of Linear Non-Gaussian Causal Models with Unobserved Confounding [1.6932009464531739]
We consider linear non-Gaussian structural equation models that involve latent confounding.
In this setting, the causal structure is identifiable, but, in general, it is not possible to identify the specific causal effects.
(2024-08-09T07:24:12Z)
- Causal K-Means Clustering [5.087519744951637]
Causal k-Means Clustering harnesses the widely-used k-means clustering algorithm to uncover the unknown subgroup structure.
We present a plug-in estimator which is simple and readily implementable using off-the-shelf algorithms.
Our proposed methods are especially useful for modern outcome-wide studies with multiple treatment levels.
(2024-05-05T23:59:51Z)
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
(2023-06-04T02:32:12Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
(2023-06-01T10:51:58Z)
- Causal Discovery in Linear Latent Variable Models Subject to Measurement Error [29.78435955758185]
We focus on causal discovery in the presence of measurement error in linear systems.
We demonstrate a surprising connection between this problem and causal discovery in the presence of unobserved parentless causes.
(2022-11-08T03:43:14Z)
- BaCaDI: Bayesian Causal Discovery with Unknown Interventions [118.93754590721173]
BaCaDI operates in the continuous space of latent probabilistic representations of both causal structures and interventions.
In experiments on synthetic causal discovery tasks and simulated gene-expression data, BaCaDI outperforms related methods in identifying causal structures and intervention targets.
(2022-06-03T16:25:48Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
(2022-05-25T17:37:08Z)
- Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
(2022-05-17T01:41:06Z)
- Effect Identification in Cluster Causal Diagrams [51.42809552422494]
We introduce a new type of graphical model called cluster causal diagrams (C-DAGs for short).
C-DAGs allow for the partial specification of relationships among variables based on limited prior knowledge.
We develop the foundations and machinery for valid causal inferences over C-DAGs.
(2022-02-22T21:27:31Z)
- CCSL: A Causal Structure Learning Method from Multiple Unknown Environments [32.61349047509467]
We propose a unified Causal Cluster Structures Learning (named CCSL) method for causal discovery from non-i.i.d. data.
This method simultaneously integrates the following two tasks: 1) clustering subjects with the same causal mechanism; 2) learning causal structures from the samples of subjects.
(2021-11-18T12:50:53Z)
- Blocked Clusterwise Regression [0.0]
We generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple latent variables.
We contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting.
(2020-01-29T23:29:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.