Related papers: Learning sources of variability from high-dimensional observational studies

URL: http://arxiv.org/abs/2307.13868v2
Date: Tue, 28 Nov 2023 21:59:49 GMT
Title: Learning sources of variability from high-dimensional observational studies
Authors: Eric W. Bridgeford, Jaewon Chung, Brian Gilbert, Sambit Panda, Adam Li, Cencheng Shen, Alexandra Badea, Brian Caffo, Joshua T. Vogelstein
Abstract summary: Causal inference studies whether the presence of a variable influences an observed outcome. Our work generalizes causal estimands to outcomes with any number of dimensions or any measurable space. We propose a simple technique for adjusting universally consistent conditional independence tests.
Score: 41.06757602546625
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Causal inference studies whether the presence of a variable influences an observed outcome. As measured by quantities such as the "average treatment effect," this paradigm is employed across numerous biological fields, from vaccine and drug development to policy interventions. Unfortunately, the majority of these methods are often limited to univariate outcomes. Our work generalizes causal estimands to outcomes with any number of dimensions or any measurable space, and formulates traditional causal estimands for nominal variables as causal discrepancy tests. We propose a simple technique for adjusting universally consistent conditional independence tests and prove that these tests are universally consistent causal discrepancy tests. Numerical experiments illustrate that our method, Causal CDcorr, leads to improvements in both finite sample validity and power when compared to existing strategies. Our methods are all open source and available at github.com/ebridge2/cdcorr.

Related papers

A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying conditional independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables through binarizing observed data. Motivated by this, this paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z)
What Makes Treatment Effects Identifiable? Characterizations and Estimators Beyond Unconfoundedness [14.699342649039052]
We study general conditions that enable the identification of the average treatment effect. We provide an interpretable condition that is sufficient and necessary for the identification of ATE. We prove that ATE can be identified in regimes that prior works could not capture.
arXiv Detail & Related papers (2025-06-04T17:40:55Z)
Data Fusion for Partial Identification of Causal Effects [62.56890808004615]
We propose a novel partial identification framework that enables researchers to answer key questions: Is the causal effect positive or negative? How severe must assumption violations be to overturn this conclusion? We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance.
arXiv Detail & Related papers (2025-05-30T07:13:01Z)
Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science. We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation. We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z)
Identification of Single-Treatment Effects in Factorial Experiments [0.0]
I show that when multiple interventions are randomized in experiments, the effect any single intervention would have outside the experimental setting is not identified absent heroic assumptions. Observational studies and factorial experiments provide information about potential-outcome distributions with zero and multiple interventions. I show that researchers who rely on this type of design have to justify either linearity of functional forms or specify with Directed Acyclic Graphs how variables are related in the real world.
arXiv Detail & Related papers (2024-05-16T04:01:53Z)
Detecting critical treatment effect bias in small subgroups [11.437076464287822]
We propose a novel strategy to benchmark observational studies beyond the average treatment effect. First, we design a statistical test for the null hypothesis that the treatment effects estimated from the two studies, conditioned on a set of relevant features, differ up to some tolerance. We then estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup in the observational study.
arXiv Detail & Related papers (2024-04-29T17:44:28Z)
The Blessings of Multiple Treatments and Outcomes in Treatment Effect Estimation [53.81860494566915]
Existing studies leveraged proxy variables or multiple treatments to adjust for confounding bias. In many real-world scenarios, there is greater interest in studying the effects on multiple outcomes. We show that parallel studies of multiple outcomes involved in this setting can assist each other in causal identification.
arXiv Detail & Related papers (2023-09-29T14:33:48Z)
Simultaneous inference for generalized linear models with unmeasured confounders [0.0]
We propose a unified statistical estimation and inference framework that harnesses structures and integrates linear projections into three key stages. We show effective Type-I error control of $z$-tests as sample and response sizes approach infinity.
arXiv Detail & Related papers (2023-09-13T18:53:11Z)
Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference. We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z)
BaCaDI: Bayesian Causal Discovery with Unknown Interventions [118.93754590721173]
BaCaDI operates in the continuous space of latent probabilistic representations of both causal structures and interventions. In experiments on synthetic causal discovery tasks and simulated gene-expression data, BaCaDI outperforms related methods in identifying causal structures and intervention targets.
arXiv Detail & Related papers (2022-06-03T16:25:48Z)
Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors. Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory. We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak. Standard methods struggle to accommodate the partial observability and sparse data common at finer scales. We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)
Active Invariant Causal Prediction: Experiment Selection through Stability [4.56877715768796]
In this work we propose a new active learning (i.e. experiment selection) framework (A-ICP) based on Invariant Causal Prediction (ICP). For general structural causal models, we characterize the effect of interventions on so-called stable sets. We propose several intervention selection policies for A-ICP which quickly reveal the direct causes of a response variable in the causal graph. Empirically, we analyze the performance of the proposed policies in both population and finite-regime experiments.
arXiv Detail & Related papers (2020-06-10T07:07:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.