Practical Kernel Tests of Conditional Independence
- URL: http://arxiv.org/abs/2402.13196v2
- Date: Fri, 19 Sep 2025 19:11:44 GMT
- Title: Practical Kernel Tests of Conditional Independence
- Authors: Roman Pogodin, Antonin Schrab, Yazhe Li, Danica J. Sutherland, Arthur Gretton
- Abstract summary: SplitKCI is an automated method for bias control for the Kernel-based Conditional Independence (KCI) test based on data splitting. We show that our approach significantly improves test level control for KCI without sacrificing test power.
- Score: 33.275712245547815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe a data-efficient, kernel-based approach to statistical testing of conditional independence. A major challenge of conditional independence testing is to obtain the correct test level (the specified upper bound on the rate of false positives), while still attaining competitive test power. Excess false positives arise due to bias in the test statistic, which is in our case obtained using nonparametric kernel ridge regression. We propose SplitKCI, an automated method for bias control for the Kernel-based Conditional Independence (KCI) test based on data splitting. We show that our approach significantly improves test level control for KCI without sacrificing test power, both theoretically and for synthetic and real-world data.
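The data-splitting idea in the abstract (fit the kernel ridge regressions on one half of the data, then compute the test statistic on the held-out half) can be sketched as follows. This is a minimal illustration with an RBF kernel and a permutation null, not the authors' SplitKCI implementation; all function names and parameter values are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_predict(Z_train, t_train, Z_test, lam=1e-2, gamma=1.0):
    """Fit kernel ridge regression t ~ Z on one split, predict on the other."""
    K = rbf_kernel(Z_train, Z_train, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), t_train)
    return rbf_kernel(Z_test, Z_train, gamma) @ alpha

def split_ci_test(X, Y, Z, n_perm=500, seed=0):
    """Data-splitting CI test sketch: learn the regressions on one half,
    test residual dependence on the held-out half via permutation."""
    rng = np.random.default_rng(seed)
    n = len(X)
    idx = rng.permutation(n)
    tr, te = idx[: n // 2], idx[n // 2 :]
    # Residuals on the held-out half, regressions fitted on the other half.
    rx = X[te] - kernel_ridge_predict(Z[tr], X[tr], Z[te])
    ry = Y[te] - kernel_ridge_predict(Z[tr], Y[tr], Z[te])
    stat = abs(np.mean(rx * ry))
    # Permutation null: shuffle one residual vector to break any dependence.
    null = np.array([abs(np.mean(rng.permutation(rx) * ry))
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= stat)) / (1 + n_perm)  # add-one p-value
```

Evaluating the residuals on a sample disjoint from the one used to fit the regressions removes the in-sample bias of the kernel ridge fits, which is the source of the excess false positives the abstract describes.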
Related papers
- On the Hardness of Conditional Independence Testing In Practice [33.26934394515333]
Tests of conditional independence (CI) underpin a number of important problems in machine learning and statistics.
Shah and Peters (2020) showed that, contrary to the unconditional case, no universally finite-sample valid test can ever achieve nontrivial power.
We investigate the Kernel-based Conditional Independence (KCI) test and identify the major factors underlying its practical behavior.
arXiv Detail & Related papers (2025-12-16T01:45:23Z)
- A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions.
Recent advancements have sought to infer the correct CI relationship between the latent variables by binarizing the observed data.
Motivated by this, the paper introduces a sample-efficient CI test that does not rely on binarization.
arXiv Detail & Related papers (2025-06-10T12:41:26Z)
- A Fast Kernel-based Conditional Independence test with Application to Causal Discovery [9.416064439922001]
FastKCI is a scalable and parallelizable kernel-based conditional independence test.
Experiments on synthetic datasets and benchmarks on real-world production data validate that FastKCI maintains the statistical power of the original KCI test.
arXiv Detail & Related papers (2025-05-16T10:14:57Z)
- Conditional Independence Test Based on Transport Maps [9.039406432084578]
We propose a novel framework for testing conditional independence using transport maps.
At the population level, we show that two well-defined transport maps can transform the conditional independence test into an unconditional independence test.
A permutation-based procedure is employed to evaluate the significance of the test.
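A permutation-based significance evaluation of the kind mentioned above can be sketched generically: recompute the statistic many times with one sample shuffled to simulate the null. The helper and the correlation statistic below are illustrative assumptions, not the paper's transport-map construction.

```python
import numpy as np

def permutation_pvalue(u, v, stat_fn, n_perm=1000, seed=0):
    """Permutation test for independence of paired samples u, v:
    shuffling v breaks any dependence, giving draws from the null."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(u, v)
    null = np.array([stat_fn(u, rng.permutation(v)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value valid for finite n_perm.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Example statistic: absolute sample correlation.
def corr_stat(u, v):
    return abs(np.corrcoef(u, v)[0, 1])
```

Any dependence measure can be plugged in as `stat_fn`; the add-one correction guarantees the p-value is never exactly zero and remains valid at any finite number of permutations.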
arXiv Detail & Related papers (2025-04-13T13:38:25Z)
- Robust Kernel Hypothesis Testing under Data Corruption [6.430258446597413]
We propose a general method for constructing robust permutation tests under data corruption.
We prove their consistency in power under minimal conditions.
This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks.
arXiv Detail & Related papers (2024-05-30T10:23:16Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Model-Free Sequential Testing for Conditional Independence via Testing by Betting [8.293345261434943]
The proposed test allows researchers to analyze an incoming i.i.d. data stream with any arbitrary dependency structure.
We allow the processing of data points online as soon as they arrive and stop data acquisition once significant results are detected.
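The stop-as-soon-as-significant mechanism behind testing by betting can be illustrated with a toy sequential test: a wealth process that is a nonnegative martingale under the null, with rejection once wealth crosses 1/alpha (valid by Ville's inequality). The example below tests sign-symmetry of a stream, not conditional independence, so it is a sketch of the wealth-process mechanism rather than the paper's test; names and the bet fraction are assumptions.

```python
import numpy as np

def betting_sequential_test(stream, lam=0.2, alpha=0.05):
    """Sequential test of H0: the stream is sign-symmetric (P(x > 0) = 1/2).
    Under H0 the wealth process is a nonnegative martingale, so by Ville's
    inequality, rejecting when wealth >= 1/alpha controls type I error at alpha."""
    wealth = 1.0
    for t, x in enumerate(stream, start=1):
        wealth *= 1.0 + lam * np.sign(x)  # bet a fixed fraction on the sign
        if wealth >= 1.0 / alpha:
            return t, wealth  # stop early: significant evidence against H0
    return None, wealth  # never rejected
```

Because the guarantee holds at every time simultaneously, data points can be processed online as they arrive and acquisition stopped at the first crossing, exactly the usage pattern the summary describes.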
arXiv Detail & Related papers (2022-10-01T20:05:33Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- An $\ell^p$-based Kernel Conditional Independence Test [21.689461247198388]
We propose a new computationally efficient test for conditional independence based on the $L^p$ distance between two kernel-based representatives of well-suited distributions.
We conduct a series of experiments showing that our new test outperforms state-of-the-art methods in terms of both statistical power and type-I error, even in the high-dimensional setting.
arXiv Detail & Related papers (2021-10-28T03:18:27Z)
- Testing for Outliers with Conformal p-values [14.158078752410182]
The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers.
We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points.
We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense.
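A marginally valid conformal p-value of the kind described above can be computed in a few lines: it is the (add-one corrected) rank of the test point's nonconformity score among the calibration scores. This sketch assumes the scores are already computed and omits the paper's dependence and false discovery rate analysis.

```python
import numpy as np

def conformal_pvalue(scores_cal, score_test):
    """Conformal p-value for an outlier test: the fraction of calibration
    scores at least as extreme as the test score, with add-one correction
    so the p-value is marginally valid for exchangeable data."""
    n = len(scores_cal)
    return (1 + np.sum(scores_cal >= score_test)) / (n + 1)
```

Higher nonconformity scores indicate more unusual points, so a large test score yields a small p-value; validity only needs the test point to be exchangeable with the calibration set, which is why the framework is so broadly applicable.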
arXiv Detail & Related papers (2021-04-16T17:59:21Z)
- Cross-validation Confidence Intervals for Test Error [83.67415139421448]
This work develops central limit theorems for cross-validation and consistent estimators of its variance under weak stability conditions on the learning algorithm.
Results are the first of their kind for the popular choice of leave-one-out cross-validation.
arXiv Detail & Related papers (2020-07-24T17:40:06Z)
- Learning Kernel Tests Without Data Splitting [18.603394415852765]
We propose an approach that enables learning the hyperparameters and testing on the full sample without data splitting.
Our approach's test power is empirically larger than that of the data-splitting approach, regardless of its split proportion.
arXiv Detail & Related papers (2020-06-03T14:07:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences.