Nearest-Neighbor Sampling Based Conditional Independence Testing
- URL: http://arxiv.org/abs/2304.04183v1
- Date: Sun, 9 Apr 2023 07:54:36 GMT
- Title: Nearest-Neighbor Sampling Based Conditional Independence Testing
- Authors: Shuai Li, Ziqi Chen, Hongtu Zhu, Christina Dan Wang, Wang Wen
- Abstract summary: The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z.
The aim of this paper is to develop a novel alternative to the CRT by using nearest-neighbor sampling without assuming the exact form of the distribution of X given Z.
- Score: 15.478671471695794
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The conditional randomization test (CRT) was recently proposed to test
whether two random variables X and Y are conditionally independent given random
variables Z. The CRT assumes that the conditional distribution of X given Z is
known under the null hypothesis and then it is compared to the distribution of
the observed samples of the original data. The aim of this paper is to develop
a novel alternative to the CRT by using nearest-neighbor sampling without assuming
the exact form of the distribution of X given Z. Specifically, we utilize the
computationally efficient 1-nearest-neighbor to approximate the conditional
distribution that encodes the null hypothesis. Then, theoretically, we show
that the distribution of the generated samples is very close to the true
conditional distribution in terms of total variation distance. Furthermore, we
take the classifier-based conditional mutual information estimator as our test
statistic. As an empirical version of a fundamental information-theoretic
quantity, the test statistic captures conditional dependence well. We show
that our proposed test is computationally very fast, while controlling type I
and II errors quite well. Finally, we demonstrate the efficiency of our
proposed test in both synthetic and real data analyses.
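The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the function names (`nn_sample`, `partial_corr_stat`, `nn_ci_test`), the half/half data split, the random sub-pooling used to obtain several null replicates, and above all the test statistic. The paper uses a classifier-based conditional mutual information estimator; here a much simpler absolute partial correlation stands in for it so that the example needs only NumPy.

```python
import numpy as np


def nn_sample(x_pool, z_pool, z_query):
    """For each query z, return the x paired with the nearest pool z (1-NN)."""
    # Pairwise squared Euclidean distances, shape (n_query, n_pool).
    d2 = ((z_query[:, None, :] - z_pool[None, :, :]) ** 2).sum(axis=2)
    return x_pool[d2.argmin(axis=1)]


def partial_corr_stat(x, y, z):
    """Stand-in statistic (assumption): |correlation| of the residuals of
    x and y after a linear regression on z. Not the paper's CMI estimator."""
    design = np.column_stack([np.ones(len(z)), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return abs(np.corrcoef(rx, ry)[0, 1])


def nn_ci_test(x, y, z, n_rep=200, seed=0):
    """Return a Monte Carlo p-value for H0: X independent of Y given Z.

    x, y: arrays of shape (n,); z: array of shape (n, d).
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    test_idx, pool_idx = perm[: len(x) // 2], perm[len(x) // 2:]
    x_t, y_t, z_t = x[test_idx], y[test_idx], z[test_idx]

    observed = partial_corr_stat(x_t, y_t, z_t)

    null_stats = np.empty(n_rep)
    for b in range(n_rep):
        # Assumption: a random half of the pool is used per replicate so that
        # the 1-NN pseudo-samples differ from one replicate to the next.
        sub = rng.choice(pool_idx, size=len(pool_idx) // 2, replace=False)
        x_tilde = nn_sample(x[sub], z[sub], z_t)  # approx. draw from X | Z
        null_stats[b] = partial_corr_stat(x_tilde, y_t, z_t)

    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null_stats >= observed)) / (1 + n_rep)
```

Calling `nn_ci_test(x, y, z)` on data where Y depends on X beyond what Z explains should give a small p-value, because the pseudo-samples `x_tilde` keep the relationship with Z but lose any direct link to Y. The brute-force distance matrix is fine for a few thousand points; a tree-based neighbor search would be the usual replacement for larger samples.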
Related papers
- Doubly Robust Conditional Independence Testing with Generative Neural Networks [8.323172773256449]
This article addresses the problem of testing the conditional independence of two generic random vectors $X$ and $Y$ given a third random vector $Z$.
We propose a new non-parametric testing procedure that avoids explicitly estimating any conditional distributions.
arXiv Detail & Related papers (2024-07-25T01:28:59Z)
- A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference) [3.622435665395788]
We introduce a kernel-based measure for detecting differences between two conditional distributions.
When the two conditional distributions are the same, the estimate has a Gaussian limit and its variance has a simple form that can be easily estimated from the data.
We also provide a resampling based test using our estimate that applies to the conditional goodness-of-fit problem.
arXiv Detail & Related papers (2024-07-23T15:04:38Z)
- Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate [49.97755400231656]
We establish convergence guarantees for substantially larger classes of distributions under DT diffusion processes.
We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies.
We propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters.
arXiv Detail & Related papers (2024-02-21T16:11:47Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- DensePure: Understanding Diffusion Models towards Adversarial Robustness [110.84015494617528]
We analyze the properties of diffusion models and establish the conditions under which they can enhance certified robustness.
We propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., a classifier).
We show that this robust region is a union of multiple convex sets, and is potentially much larger than the robust regions identified in previous works.
arXiv Detail & Related papers (2022-11-01T08:18:07Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Wasserstein Generative Learning of Conditional Distribution [6.051520664893158]
We propose a Wasserstein generative approach to learning a conditional distribution.
We establish a non-asymptotic error bound for the conditional sampling distribution generated by the proposed method.
arXiv Detail & Related papers (2021-12-19T01:55:01Z)
- Adversarial sampling of unknown and high-dimensional conditional distributions [0.0]
In this paper, the sampling method, as well as the inference of the underlying distribution, is handled with a data-driven method known as generative adversarial networks (GAN).
GAN trains two competing neural networks to produce a network that can effectively generate samples from the training set distribution.
It is shown that all the versions of the proposed algorithm effectively sample the target conditional distribution with minimal impact on the quality of the samples.
arXiv Detail & Related papers (2021-11-08T12:23:38Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions [57.77347280992548]
In this paper, we design two-sample tests for pairwise comparison data and ranking data.
Our test requires essentially no assumptions on the distributions.
By applying our two-sample test on real-world pairwise comparison data, we conclude that ratings and rankings provided by people are indeed distributed differently.
arXiv Detail & Related papers (2020-06-21T20:51:09Z)
- Testing Goodness of Fit of Conditional Density Models with Kernels [16.003516725803774]
We propose two nonparametric statistical tests of goodness of fit for conditional distributions.
We show that our tests are consistent against any fixed alternative conditional model.
We demonstrate the interpretability of our test on a task of modeling the distribution of New York City's taxi drop-off location.
arXiv Detail & Related papers (2020-02-24T14:04:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.