Kernel conditional tests from learning-theoretic bounds
- URL: http://arxiv.org/abs/2506.03898v2
- Date: Fri, 31 Oct 2025 17:19:02 GMT
- Title: Kernel conditional tests from learning-theoretic bounds
- Authors: Pierre-François Massiani, Christian Fiedler, Lukas Haverbeck, Friedrich Solowjow, Sebastian Trimpe
- Abstract summary: We propose a framework for hypothesis testing on conditional probability distributions. We then use it to construct statistical tests of functionals of conditional distributions. Our results establish a comprehensive foundation for conditional testing on functionals.
- Score: 16.813275168865953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a framework for hypothesis testing on conditional probability distributions, which we then use to construct statistical tests of functionals of conditional distributions. These tests identify the inputs where the functionals differ with high probability, and include tests of conditional moments or two-sample tests. Our key idea is to transform confidence bounds of a learning method into a test of conditional expectations. We instantiate this principle for kernel ridge regression (KRR) with subgaussian noise. An intermediate data embedding then enables more general tests -- including conditional two-sample tests -- via kernel mean embeddings of distributions. To have guarantees in this setting, we generalize existing pointwise-in-time or time-uniform confidence bounds for KRR to previously-inaccessible yet essential cases such as infinite-dimensional outputs with non-trace-class kernels. These bounds also circumvent the need for independent data, allowing for instance online sampling. To make our tests readily applicable in practice, we introduce bootstrapping schemes leveraging the parametric form of testing thresholds identified in theory to avoid tuning inaccessible parameters. We illustrate the tests on examples, including one in process monitoring and comparison of dynamical systems. Overall, our results establish a comprehensive foundation for conditional testing on functionals, from theoretical guarantees to an algorithmic implementation, and advance the state of the art on confidence bounds for vector-valued least squares estimation.
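The key idea in the abstract, turning confidence bounds of kernel ridge regression into a test of conditional expectations, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the Gaussian kernel, the GP-style width sigma(x), the constant `beta`, and all function names are assumptions made for illustration. In particular, `beta` is a hand-picked placeholder, whereas the paper derives its thresholds from theory and calibrates them by bootstrapping. The sketch flags the query inputs where two KRR estimates differ by more than the sum of their band widths.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Gaussian kernel matrix between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def krr_fit_predict(x_train, y_train, x_query, lam=0.1):
    """Return the KRR mean and a GP-style uncertainty width at x_query."""
    K = rbf_kernel(x_train, x_train)
    k_q = rbf_kernel(x_train, x_query)            # shape (n_train, n_query)
    A = K + lam * np.eye(len(x_train))
    mean = k_q.T @ np.linalg.solve(A, y_train)
    # sigma^2(x) = k(x, x) - k_x^T (K + lam I)^{-1} k_x, with k(x, x) = 1
    var = 1.0 - np.sum(k_q * np.linalg.solve(A, k_q), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def conditional_mean_test(data_1, data_2, x_query, beta=2.0):
    """Flag query inputs where the two confidence bands do not overlap.

    beta is a hypothetical scaling constant chosen by hand for this sketch.
    """
    m1, s1 = krr_fit_predict(*data_1, x_query)
    m2, s2 = krr_fit_predict(*data_2, x_query)
    return np.abs(m1 - m2) > beta * (s1 + s2)

# Synthetic example: the conditional means differ only near x = 0.6.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
bump = 1.5 * np.exp(-(((x - 0.6) / 0.3) ** 2))
y1 = np.sin(3 * x) + 0.1 * rng.standard_normal(x.size)
y2 = np.sin(3 * x) + bump + 0.1 * rng.standard_normal(x.size)

x_query = np.array([-0.6, 0.6])
flags = conditional_mean_test((x, y1), (x, y2), x_query)
print(dict(zip(x_query.tolist(), flags.tolist())))
```

The test is pointwise in the input: it reports where the conditional means appear to differ, rather than returning a single global decision, which matches the abstract's description of tests that identify the inputs where the functionals differ.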
Related papers
- A Sample Efficient Conditional Independence Test in the Presence of Discretization [54.047334792855345]
Applying Conditional Independence (CI) tests directly to discretized data can lead to incorrect conclusions. Recent advancements have sought to infer the correct CI relationship between the latent variables through binarizing observed data. Motivated by this, the paper introduces a sample-efficient CI test that does not rely on the binarization process.
arXiv Detail & Related papers (2025-06-10T12:41:26Z)
- Internal Incoherency Scores for Constraint-based Causal Discovery Algorithms [12.524536193679124]
We propose internal coherency scores that allow testing for assumption violations and finite sample errors. We illustrate our coherency scores on the PC algorithm with simulated and real-world datasets.
arXiv Detail & Related papers (2025-02-20T16:44:54Z)
- General Frameworks for Conditional Two-Sample Testing [3.3317825075368908]
We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors.
This problem commonly arises in various applications, such as domain adaptation and algorithmic fairness.
We introduce two general frameworks that implicitly or explicitly target specific classes of distributions for their validity and power.
arXiv Detail & Related papers (2024-10-22T02:27:32Z)
- Conditional Testing based on Localized Conformal p-values [5.6779147365057305]
We define the localized conformal p-values by inverting prediction intervals and prove their theoretical properties.
These defined p-values are then applied to several conditional testing problems to illustrate their practicality.
arXiv Detail & Related papers (2024-09-25T11:30:14Z)
- A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference) [3.622435665395788]
We introduce a kernel-based measure for detecting differences between two conditional distributions.
When the two conditional distributions are the same, the estimate has a Gaussian limit and its variance has a simple form that can be easily estimated from the data.
We also provide a resampling based test using our estimate that applies to the conditional goodness-of-fit problem.
arXiv Detail & Related papers (2024-07-23T15:04:38Z)
- Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- Robust Kernel Hypothesis Testing under Data Corruption [6.430258446597413]
We propose a general method for constructing robust permutation tests under data corruption. We prove their consistency in power under minimal conditions. This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks.
arXiv Detail & Related papers (2024-05-30T10:23:16Z)
- Precise Error Rates for Computationally Efficient Testing [67.30044609837749]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity. An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
arXiv Detail & Related papers (2023-10-30T09:46:19Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Sequential Kernelized Independence Testing [77.237958592189]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Model-Free Sequential Testing for Conditional Independence via Testing by Betting [8.293345261434943]
The proposed test allows researchers to analyze an incoming i.i.d. data stream with any arbitrary dependency structure.
We allow the processing of data points online as soon as they arrive and stop data acquisition once significant results are detected.
arXiv Detail & Related papers (2022-10-01T20:05:33Z)
- Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN) that corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS) that simulates i.i.d. data stream from non-i.i.d. stream in a class-balanced manner.
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
- Sequential Permutation Testing of Random Forest Variable Importance Measures [68.8204255655161]
It is proposed here to use sequential permutation tests and sequential p-value estimation to reduce the high computational costs associated with conventional permutation tests.
The results of simulation studies confirm that the theoretical properties of the sequential tests apply.
The numerical stability of the methods is investigated in two additional application studies.
arXiv Detail & Related papers (2022-06-02T20:16:50Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Double Generative Adversarial Networks for Conditional Independence Testing [8.359770027722275]
High-dimensional conditional independence testing is a key building block in statistics and machine learning.
We propose an inferential procedure based on double generative adversarial networks (GANs).
arXiv Detail & Related papers (2020-06-03T16:14:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.