The leave-one-covariate-out conditional randomization test
- URL: http://arxiv.org/abs/2006.08482v2
- Date: Mon, 13 Jul 2020 14:34:28 GMT
- Title: The leave-one-covariate-out conditional randomization test
- Authors: Eugene Katsevich and Aaditya Ramdas
- Abstract summary: Conditional independence testing is an important problem, yet provably hard without assumptions.
Knockoffs is a popular methodology associated with this framework, but it suffers from two main drawbacks.
The conditional randomization test (CRT) is thought to be the "right" solution under model-X, but usually viewed as computationally inefficient.
- Score: 36.9351790405311
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional independence testing is an important problem, yet provably hard
without assumptions. One of the assumptions that has become popular of late is
called "model-X", where we assume we know the joint distribution of the
covariates, but assume nothing about the conditional distribution of the
outcome given the covariates. Knockoffs is a popular methodology associated
with this framework, but it suffers from two main drawbacks: only one-bit
$p$-values are available for inference on each variable, and the method is
randomized with significant variability across runs in practice. The
conditional randomization test (CRT) is thought to be the "right" solution
under model-X, but usually viewed as computationally inefficient. This paper
proposes a computationally efficient leave-one-covariate-out (LOCO) CRT that
addresses both drawbacks of knockoffs. LOCO CRT produces valid $p$-values that
can be used to control the familywise error rate, and has nearly zero
algorithmic variability. For L1 regularized M-estimators, we develop an even
faster variant called L1ME CRT, which reuses computation by leveraging a novel
observation about the stability of the cross-validated lasso to removing
inactive variables. Last, for multivariate Gaussian covariates, we present a
closed form expression for the LOCO CRT $p$-value, thus completely eliminating
resampling in this important special case.
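The resampling-based CRT that the paper takes as its starting point can be sketched in a few lines. This is a generic illustration, not the paper's LOCO or L1ME variants: the toy conditional law X | Z ~ N(Z·beta, 1) with known beta, and the correlation test statistic, are assumptions made for this sketch (under model-X, any known conditional distribution and any statistic may be substituted).

```python
import numpy as np

def crt_pvalue(x, y, z, sample_x_given_z, stat, n_resamples=500, rng=None):
    """Model-X conditional randomization test.

    Tests H0: X independent of Y given Z, assuming we can sample from
    the known conditional distribution of X given Z.
    """
    rng = np.random.default_rng(rng)
    t_obs = stat(x, y, z)
    # Draw fresh copies of X from its conditional law and recompute the
    # statistic; under H0 the observed value is exchangeable with these.
    exceed = sum(
        stat(sample_x_given_z(z, rng), y, z) >= t_obs
        for _ in range(n_resamples)
    )
    return (1 + exceed) / (1 + n_resamples)

# Toy data: X | Z ~ N(Z @ beta, 1) with beta known (an assumption of this
# sketch), and Y depending on Z only, so H0 holds.
rng = np.random.default_rng(0)
n, p = 200, 3
beta = np.array([1.0, -0.5, 0.25])
Z = rng.standard_normal((n, p))
X = Z @ beta + rng.standard_normal(n)
Y = Z[:, 0] + rng.standard_normal(n)

sampler = lambda z, g: z @ beta + g.standard_normal(len(z))
statistic = lambda x, y, z: abs(np.corrcoef(x, y)[0, 1])

pval = crt_pvalue(X, Y, Z, sampler, statistic, rng=1)
```

The loop over resamples is exactly the computational burden the paper targets: LOCO CRT reduces the per-variable cost, and the closed-form Gaussian case removes the resampling entirely.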
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds [0.9514940899499753]
Fréchet regression on the Bures-Wasserstein manifold is developed.
A test for the null hypothesis of no association is proposed.
Results show that the proposed test asymptotically attains the desired significance level.
arXiv Detail & Related papers (2024-04-05T04:01:51Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Nearest-Neighbor Sampling Based Conditional Independence Testing [15.478671471695794]
The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z.
The aim of this paper is to develop a novel alternative of CRT by using nearest-neighbor sampling without assuming the exact form of the distribution of X given Z.
arXiv Detail & Related papers (2023-04-09T07:54:36Z) - The Projected Covariance Measure for assumption-lean variable significance testing [3.8936058127056357]
A simple but common approach is to specify a linear model, and then test whether the regression coefficient for $X$ is non-zero.
We study the problem of testing the model-free null of conditional mean independence, i.e. that the conditional mean of $Y$ given $X$ and $Z$ does not depend on $X$.
We propose a simple and general framework that can leverage flexible nonparametric or machine learning methods, such as additive models or random forests.
arXiv Detail & Related papers (2022-11-03T17:55:50Z) - DIET: Conditional independence testing with marginal dependence measures of residual information [30.99595500331328]
Conditional randomization tests (CRTs) assess whether a variable $x$ is predictive of another variable $y$.
Existing solutions to reduce the cost of CRTs typically split the dataset into a train and a test portion.
We propose the decoupled independence test (DIET), an algorithm that avoids both of these issues.
arXiv Detail & Related papers (2022-08-18T00:48:04Z) - A Conditional Randomization Test for Sparse Logistic Regression in High-Dimension [36.00360315353985]
CRT-logit is an algorithm that combines a variable-distillation step and a decorrelation step.
We provide a theoretical analysis of this procedure, and demonstrate its effectiveness on simulations, along with experiments on large-scale brain-imaging and genomics datasets.
arXiv Detail & Related papers (2022-05-29T09:37:16Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z) - FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.