Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy
- URL: http://arxiv.org/abs/2404.08278v2
- Date: Tue, 21 May 2024 00:42:23 GMT
- Title: Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy
- Authors: Omar Hagrass, Bharath Sriperumbudur, Krishnakumar Balasubramanian
- Abstract summary: We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD).
The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions.
We introduce an adaptive test capable of achieving minimax optimality up to a logarithmic factor by adapting to unknown parameters.
- Score: 13.429541377715298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD). The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions, accommodating diverse data structures beyond Euclidean spaces, and relying only on partial knowledge of the reference distribution, while maintaining computational efficiency. We establish a general framework and an operator-theoretic representation of the KSD, encompassing many existing KSD tests in the literature, which vary depending on the domain. We reveal the characteristics and limitations of KSD and demonstrate its non-optimality under a certain alternative space, defined over general domains when considering $\chi^2$-divergence as the separation metric. To address this issue of non-optimality, we propose a modified, minimax optimal test by incorporating a spectral regularizer, thereby overcoming the shortcomings of standard KSD tests. Our results are established under a weak moment condition on the Stein kernel, which relaxes the bounded kernel assumption required by prior work in the analysis of kernel-based hypothesis testing. Additionally, we introduce an adaptive test capable of achieving minimax optimality up to a logarithmic factor by adapting to unknown parameters. Through numerical experiments, we illustrate the superior performance of our proposed tests across various domains compared to their unregularized counterparts.
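To make the test statistic concrete, here is a minimal sketch (an illustration under common assumptions, not the authors' regularized test): it computes a plug-in estimate of the squared KSD using the classical Langevin Stein kernel with a Gaussian base kernel. The helper `score_p` (the score function of the reference density) and the bandwidth choice are assumptions for illustration.
```python
import numpy as np

def stein_kernel_gram(X, score_p, bandwidth=1.0):
    """Gram matrix of the Langevin Stein kernel on a sample X of shape (n, d)."""
    d = X.shape[1]
    S = score_p(X)                                   # (n, d): score grad log p at each sample point
    diff = X[:, None, :] - X[None, :, :]             # (n, n, d): pairwise differences x_i - x_j
    sqd = np.sum(diff ** 2, axis=-1)                 # (n, n): squared Euclidean distances
    K = np.exp(-sqd / (2 * bandwidth ** 2))          # Gaussian base kernel k(x, y)
    h2 = bandwidth ** 2
    term1 = (S @ S.T) * K                                # s(x)^T s(y) k(x, y)
    term2 = np.einsum('id,ijd->ij', S, diff) / h2 * K    # s(x)^T grad_y k(x, y)
    term3 = -np.einsum('jd,ijd->ij', S, diff) / h2 * K   # s(y)^T grad_x k(x, y)
    term4 = (d / h2 - sqd / h2 ** 2) * K                 # trace(grad_x grad_y k(x, y))
    return term1 + term2 + term3 + term4

def ksd_vstat(X, score_p, bandwidth=1.0):
    """Plug-in (V-statistic) estimate of the squared KSD of the sample X against p."""
    return stein_kernel_gram(X, score_p, bandwidth).mean()

# Example: testing a sample against a standard normal reference, whose score is -x.
X = np.random.default_rng(0).normal(size=(200, 2))
print(ksd_vstat(X, lambda x: -x))
```
A small value is expected when the sample actually comes from the reference distribution; in practice the null threshold is calibrated, for example by a wild bootstrap.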
Related papers
- Robust Kernel Hypothesis Testing under Data Corruption [6.430258446597413]
We propose two general methods for constructing robust permutation tests under data corruption.
We prove their consistency in power under minimal conditions.
This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks.
arXiv Detail & Related papers (2024-05-30T10:23:16Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Spectral Regularized Kernel Two-Sample Tests [7.915420897195129]
We show that the popular MMD (maximum mean discrepancy) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance.
We propose a modification to the MMD test based on spectral regularization and prove the proposed test to be minimax optimal with a smaller separation boundary than that achieved by the MMD test.
Our results hold for the permutation variant of the test, where the test threshold is chosen through permutation of the samples.
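As a rough sketch of that permutation step (an assumption about the generic recipe, not the paper's own code), the threshold of any two-sample statistic can be taken as a quantile of the statistic recomputed over random relabelings of the pooled sample:
```python
import numpy as np

def permutation_threshold(X, Y, statistic, alpha=0.05, n_perm=200, seed=0):
    """(1 - alpha)-quantile of `statistic` under random relabelings of the pooled sample."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([X, Y], axis=0)
    n = len(X)
    null_stats = []
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))                              # shuffle the labels
        null_stats.append(statistic(pooled[idx[:n]], pooled[idx[n:]]))  # recompute under H0
    return np.quantile(null_stats, 1 - alpha)

# The test rejects equality of distributions when statistic(X, Y) exceeds this threshold.
```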
arXiv Detail & Related papers (2022-12-19T00:42:21Z)
- Controlling Moments with Kernel Stein Discrepancies [74.82363458321939]
Kernel Stein discrepancies (KSDs) measure the quality of a distributional approximation.
We first show that standard KSDs used for weak convergence control fail to control moment convergence.
We then provide sufficient conditions under which alternative diffusion KSDs control both moment and weak convergence.
arXiv Detail & Related papers (2022-11-10T08:24:52Z)
- Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P.
In this article we derive new sufficient and necessary conditions to ensure (i) and (ii).
For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z)
- A Fourier representation of kernel Stein discrepancy with application to Goodness-of-Fit tests for measures on infinite dimensional Hilbert spaces [6.437931786032493]
Kernel Stein discrepancy (KSD) is a kernel-based measure of discrepancy between probability measures.
We provide the first analysis of KSD in the generality of data lying in a separable Hilbert space.
This allows us to prove that KSD can separate measures and thus is valid to use in practice.
arXiv Detail & Related papers (2022-06-09T15:04:18Z)
- Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z)
- KSD Aggregated Goodness-of-fit Test [38.45086141837479]
We introduce a strategy to construct a test, called KSDAgg, which aggregates multiple tests with different kernels.
We provide non-asymptotic guarantees on the power of KSDAgg.
We find that KSDAgg outperforms other state-of-the-art adaptive KSD-based goodness-of-fit testing procedures.
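The aggregation idea can be sketched with a simple Bonferroni-style correction over a grid of kernel bandwidths (an illustrative simplification: the actual KSDAgg procedure calibrates per-kernel levels with a bootstrap rather than a plain union bound). The callable `single_test_pvalue` and the bandwidth grid are hypothetical placeholders.
```python
import numpy as np

def aggregated_ksd_test(X, single_test_pvalue, bandwidths, alpha=0.05):
    """Reject H0 if any single-kernel test rejects at the corrected level alpha / len(bandwidths).

    `single_test_pvalue(X, bw)` is a user-supplied callable returning a p-value for a
    KSD goodness-of-fit test with kernel bandwidth `bw` (e.g., bootstrap-calibrated).
    """
    corrected_level = alpha / len(bandwidths)     # union bound over the kernel collection
    pvals = np.array([single_test_pvalue(X, bw) for bw in bandwidths])
    return bool(np.any(pvals <= corrected_level)), pvals
```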
arXiv Detail & Related papers (2022-02-02T00:33:09Z)
- Generalised Kernel Stein Discrepancy (GKSD): A Unifying Approach for Non-parametric Goodness-of-fit Testing [5.885020100736158]
Non-parametric goodness-of-fit testing procedures based on kernel Stein discrepancies (KSD) are promising approaches to validate general unnormalised distributions.
We propose a unifying framework, the generalised kernel Stein discrepancy (GKSD), to theoretically compare and interpret different Stein operators in performing the KSD-based goodness-of-fit tests.
arXiv Detail & Related papers (2021-06-23T00:44:31Z)
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
- A Kernel Stein Test for Comparing Latent Variable Models [48.32146056855925]
We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables.
We show that our test significantly outperforms the relative Maximum Mean Discrepancy test, which is based on samples from the models and does not exploit the latent structure.
arXiv Detail & Related papers (2019-07-01T07:46:16Z)