Generalised Kernel Stein Discrepancy (GKSD): A Unifying Approach for
Non-parametric Goodness-of-fit Testing
- URL: http://arxiv.org/abs/2106.12105v1
- Date: Wed, 23 Jun 2021 00:44:31 GMT
- Title: Generalised Kernel Stein Discrepancy (GKSD): A Unifying Approach for
Non-parametric Goodness-of-fit Testing
- Authors: Wenkai Xu
- Abstract summary: Non-parametric goodness-of-fit testing procedures based on kernel Stein discrepancies (KSD) are promising approaches to validate general unnormalised distributions.
We propose a unifying framework, the generalised kernel Stein discrepancy (GKSD), to theoretically compare and interpret different Stein operators in performing the KSD-based goodness-of-fit tests.
- Score: 5.885020100736158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-parametric goodness-of-fit testing procedures based on kernel Stein
discrepancies (KSD) are promising approaches to validate general unnormalised
distributions in various scenarios. Existing works have focused on studying
optimal kernel choices to boost test performance. However, Stein operators are
generally non-unique, and different choices of Stein operator can also have a
considerable effect on test performance. In this work, we propose a
unifying framework, the generalised kernel Stein discrepancy (GKSD), to
theoretically compare and interpret different Stein operators in performing the
KSD-based goodness-of-fit tests. We derive explicitly how the proposed GKSD
framework generalises existing Stein operators and their corresponding tests.
In addition, we show that the GKSD framework can be used as a guide to
develop kernel-based non-parametric goodness-of-fit tests for complex new data
scenarios, e.g. truncated distributions or compositional data. Experimental
results demonstrate that the proposed tests control type-I error well and
achieve higher test power than existing approaches, including the test based on
the maximum mean discrepancy (MMD).
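As a concrete reference point for the discrepancy the abstract builds on, the sketch below is a minimal, illustrative Python/NumPy implementation of the classical KSD V-statistic obtained from the Langevin Stein operator with an RBF kernel and a standard-normal target. It is not the paper's GKSD code, and all function names are my own; in the GKSD view described above, the Langevin operator used here is just one admissible choice of Stein operator.

```python
# Minimal, illustrative NumPy sketch (not the paper's GKSD implementation):
# KSD V-statistic from the Langevin Stein operator with an RBF kernel,
# for a standard-normal target whose score function is s_p(x) = -x.
import numpy as np

def rbf_stein_kernel(x, y, score_x, score_y, sigma=1.0):
    """Stein kernel u_p(x, y) for an RBF kernel k and score vectors s_p."""
    d = x.shape[0]
    diff = x - y
    sq_dist = diff @ diff
    k = np.exp(-sq_dist / (2 * sigma**2))
    grad_x_k = -diff / sigma**2 * k                       # gradient of k in x
    grad_y_k = diff / sigma**2 * k                        # gradient of k in y
    trace_term = (d / sigma**2 - sq_dist / sigma**4) * k  # trace of mixed second derivative
    return (score_x @ score_y * k
            + score_x @ grad_y_k
            + score_y @ grad_x_k
            + trace_term)

def ksd_vstat(samples, score_fn, sigma=1.0):
    """V-statistic estimate of KSD^2 between the samples and the model score."""
    n = len(samples)
    scores = [score_fn(x) for x in samples]
    total = sum(rbf_stein_kernel(samples[i], samples[j], scores[i], scores[j], sigma)
                for i in range(n) for j in range(n))
    return total / n**2

# Toy usage: samples drawn from the target itself give a small statistic.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
print(ksd_vstat(X, score_fn=lambda x: -x))
```

In an actual goodness-of-fit test, this statistic would be calibrated under the null, for example with a wild bootstrap over the Stein kernel matrix, before being compared against the value computed on the data.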
Related papers
- Minimax Optimal Goodness-of-Fit Testing with Kernel Stein Discrepancy [13.429541377715298]
We explore the minimax optimality of goodness-of-fit tests on general domains using the kernelized Stein discrepancy (KSD).
The KSD framework offers a flexible approach for goodness-of-fit testing, avoiding strong distributional assumptions.
We introduce an adaptive test capable of achieving minimax optimality up to a logarithmic factor by adapting to unknown parameters.
arXiv Detail & Related papers (2024-04-12T07:06:12Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy [3.78967502155084]
Kernelized Stein discrepancy (KSD) is a score-based discrepancy widely used in goodness-of-fit tests.
We show theoretically and empirically that the KSD test can suffer from low power when the target and the alternative distributions have the same well-separated modes but differ in mixing proportions.
arXiv Detail & Related papers (2023-04-28T11:13:18Z)
- A kernel Stein test of goodness of fit for sequential models [19.8408003104988]
The proposed measure is an instance of the kernel Stein discrepancy (KSD), which has been used to construct goodness-of-fit tests for unnormalized densities.
We extend the KSD to the variable-dimension setting by identifying appropriate Stein operators, and propose a novel KSD goodness-of-fit test.
Our test is shown to perform well in practice on discrete sequential data benchmarks.
arXiv Detail & Related papers (2022-10-19T17:30:15Z)
- On RKHS Choices for Assessing Graph Generators via Kernel Stein Statistics [8.987015146366216]
We assess the effect of RKHS choice for KSD tests of random network models.
We investigate the power performance and the computational runtime of the test in different scenarios.
arXiv Detail & Related papers (2022-10-11T19:23:33Z)
- A Fourier representation of kernel Stein discrepancy with application to Goodness-of-Fit tests for measures on infinite dimensional Hilbert spaces [6.437931786032493]
Kernel Stein discrepancy (KSD) is a kernel-based measure of discrepancy between probability measures.
We provide the first analysis of KSD in the generality of data lying in a separable Hilbert space.
This allows us to prove that KSD can separate measures and thus is valid to use in practice.
arXiv Detail & Related papers (2022-06-09T15:04:18Z)
- Are Missing Links Predictable? An Inferential Benchmark for Knowledge Graph Completion [79.07695173192472]
InferWiki improves upon existing benchmarks in inferential ability, assumptions, and patterns.
Each testing sample is predictable with supportive data in the training set.
In experiments, we curate two settings of InferWiki varying in sizes and structures, and apply the construction process on CoDEx as comparative datasets.
arXiv Detail & Related papers (2021-08-03T09:51:15Z)
- Comprehensible Counterfactual Explanation on Kolmogorov-Smirnov Test [56.5373227424117]
We tackle the problem of producing counterfactual explanations for test data failing the Kolmogorov-Smirnov (KS) test.
We develop an efficient algorithm MOCHE that avoids enumerating and checking an exponential number of subsets of the test set failing the KS test.
arXiv Detail & Related papers (2020-11-01T06:46:01Z)
- Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power; a minimal sketch of the underlying MMD estimator appears after this list.
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
- A Kernel Stein Test for Comparing Latent Variable Models [48.32146056855925]
We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables.
We show that our test significantly outperforms the relative Maximum Mean Discrepancy test, which is based on samples from the models and does not exploit the latent structure.
arXiv Detail & Related papers (2019-07-01T07:46:16Z)
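Both the abstract's MMD baseline and the deep-kernel two-sample test entry above rest on the maximum mean discrepancy. The following minimal NumPy sketch (my own illustration, not code from either paper) computes the standard unbiased MMD^2 estimator; in the deep-kernel setting the RBF kernel would act on features from a trained network, for which the featurize placeholder below is a hypothetical stand-in.

```python
# Minimal NumPy sketch of the unbiased MMD^2 estimator behind kernel
# two-sample tests. `featurize` is a hypothetical placeholder for the
# learned embedding used in deep-kernel variants; here it is the identity.
import numpy as np

def featurize(z):
    return z  # stand-in for a trained feature extractor (illustrative only)

def rbf_gram(a, b, sigma=1.0):
    """Gram matrix k(a_i, b_j) for an RBF kernel with bandwidth sigma."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of MMD^2 between samples x ~ P and y ~ Q."""
    fx, fy = featurize(x), featurize(y)
    kxx, kyy = rbf_gram(fx, fx, sigma), rbf_gram(fy, fy, sigma)
    kxy = rbf_gram(fx, fy, sigma)
    n, m = len(x), len(y)
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))  # off-diagonal mean
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2 * kxy.mean()

# Toy usage: samples from two Gaussians with shifted means give a positive value.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(300, 2))
y = rng.normal(0.5, 1.0, size=(300, 2))
print(mmd2_unbiased(x, y))
```

A two-sample test built on this statistic would typically compare it against a permutation-based null distribution.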