Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes
- URL: http://arxiv.org/abs/2404.09119v3
- Date: Fri, 25 Oct 2024 18:55:21 GMT
- Title: Causal Inference for Genomic Data with Multiple Heterogeneous Outcomes
- Authors: Jin-Hong Du, Zhenghao Zeng, Edward H. Kennedy, Larry Wasserman, Kathryn Roeder,
- Abstract summary: We propose a generic semiparametric framework for doubly robust estimation with multiple derived outcomes.
We specialize the analysis to standardized average treatment effects and quantile treatment effects.
- Score: 1.5845117761091052
- License:
- Abstract: With the evolution of single-cell RNA sequencing techniques into a standard approach in genomics, it has become possible to conduct cohort-level causal inferences based on single-cell-level measurements. However, the individual gene expression levels of interest are not directly observable; instead, only repeated proxy measurements from each individual's cells are available, providing a derived outcome to estimate the underlying outcome for each of many genes. In this paper, we propose a generic semiparametric inference framework for doubly robust estimation with multiple derived outcomes, which also encompasses the usual setting of multiple outcomes when the response of each unit is available. To reliably quantify the causal effects of heterogeneous outcomes, we specialize the analysis to standardized average treatment effects and quantile treatment effects. Through this, we demonstrate the use of the semiparametric inferential results for doubly robust estimators derived from both Von Mises expansions and estimating equations. A multiple testing procedure based on Gaussian multiplier bootstrap is tailored for doubly robust estimators to control the false discovery exceedance rate. Applications in single-cell CRISPR perturbation analysis and individual-level differential expression analysis demonstrate the utility of the proposed methods and offer insights into the usage of different estimands for causal inference in genomics.
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - A convex formulation of covariate-adjusted Gaussian graphical models via natural parametrization [6.353176264090468]
In co-expression quantitative locus (eQTL) studies, both the mean expression level of genes as well as their conditional independence structure may be adjusted by variants local to those genes.
Existing methods to estimate co-adjusted GGMs either allow only the mean to depend on covariates or suffer from poor scaling assumptions due to the inherent non-wiseity of simultaneously estimating the mean and precision matrix.
arXiv Detail & Related papers (2024-10-08T20:02:10Z) - Assumption-Lean Post-Integrated Inference with Negative Control Outcomes [0.0]
We introduce a robust post-integrated inference (PII) method that adjusts for latent heterogeneity using negative control outcomes.
Our assumption-lean semi inference method extends robustness and generality to projected direct effect estimands that account for mediators, confounders, and moderators.
The proposed doubly robust estimators are consistent and efficient under minimal assumptions, facilitating data-adaptive estimation with machine learning algorithms.
arXiv Detail & Related papers (2024-10-07T12:52:38Z) - Precise analysis of ridge interpolators under heavy correlations -- a Random Duality Theory view [0.0]
We show that emphRandom Duality Theory (RDT) can be utilized to obtain precise closed form characterizations of all estimators related optimizing quantities of interest.
arXiv Detail & Related papers (2024-06-13T14:56:52Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one allows to account not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Simultaneous inference for generalized linear models with unmeasured confounders [0.0]
We propose a unified statistical estimation and inference framework that harnesses structures and integrates linear projections into three key stages.
We show effective Type-I error control of $z$-tests as sample and response sizes approach infinity.
arXiv Detail & Related papers (2023-09-13T18:53:11Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Mitigating multiple descents: A model-agnostic framework for risk
monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z) - Variance Minimization in the Wasserstein Space for Invariant Causal
Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z) - Inference for High Dimensional Censored Quantile Regression [8.993036560782137]
This paper proposes a novel procedure to draw inference on all predictors within the framework of global censored quantile regression.
We show that our procedure can properly quantify the uncertainty of the estimates in high dimensional settings.
We apply our method to analyze the heterogeneous effects of SNPs residing in lung cancer pathways on patients' survival.
arXiv Detail & Related papers (2021-07-22T23:57:06Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE)
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.