Learning Deep Kernels for Non-Parametric Two-Sample Tests
- URL: http://arxiv.org/abs/2002.09116v3
- Date: Thu, 14 Jan 2021 05:29:18 GMT
- Title: Learning Deep Kernels for Non-Parametric Two-Sample Tests
- Authors: Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland
- Abstract summary: We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
- Score: 50.92621794426821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a class of kernel-based two-sample tests, which aim to determine
whether two sets of samples are drawn from the same distribution. Our tests are
constructed from kernels parameterized by deep neural nets, trained to maximize
test power. These tests adapt to variations in distribution smoothness and
shape over space, and are especially suited to high dimensions and complex
data. By contrast, the simpler kernels used in prior kernel testing work are
spatially homogeneous, and adaptive only in lengthscale. We explain how this
scheme includes popular classifier-based two-sample tests as a special case,
but improves on them in general. We provide the first proof of consistency for
the proposed adaptation method, which applies both to kernels on deep features
and to simpler radial basis kernels or multiple kernel learning. In
experiments, we establish the superior performance of our deep kernels in
hypothesis testing on benchmark and real-world data. The code of our
deep-kernel-based two-sample tests is available at
https://github.com/fengliu90/DK-for-TST.
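Below is a minimal PyTorch sketch of the approach the abstract describes: a Gaussian kernel on learned deep features (mixed with a kernel on the raw inputs so it stays characteristic), trained to maximize the estimated-power proxy MMD^2 / sigma, then used in a permutation test on held-out data. The network architecture, bandwidths, epsilon, training split, and toy data are illustrative assumptions, not the authors' settings; the reference implementation is in the linked repository.

```python
import torch
import torch.nn as nn

def sq_dists(a, b):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    return torch.cdist(a, b) ** 2

def deep_kernel(x, y, phi, eps=0.1, sigma_feat=1.0, sigma_raw=10.0):
    # k(x,y) = [(1-eps) * Gaussian on deep features + eps] * Gaussian on raw inputs,
    # so the kernel remains characteristic even if the features collapse.
    k_feat = torch.exp(-sq_dists(phi(x), phi(y)) / (2 * sigma_feat ** 2))
    q_raw = torch.exp(-sq_dists(x, y) / (2 * sigma_raw ** 2))
    return ((1 - eps) * k_feat + eps) * q_raw

def mmd2_and_power_proxy(x, y, kernel, reg=1e-8):
    # Unbiased MMD^2 estimate and the power proxy MMD^2 / sigma_H1 used as the
    # training objective (a larger proxy corresponds to higher asymptotic power).
    n = x.shape[0]
    kxx, kyy, kxy = kernel(x, x), kernel(y, y), kernel(x, y)
    h = kxx + kyy - kxy - kxy.t()
    h = h - torch.diag(torch.diag(h))        # drop i = j terms for unbiasedness
    mmd2 = h.sum() / (n * (n - 1))
    var = 4.0 / n**3 * (h.sum(1) ** 2).sum() - 4.0 / n**4 * h.sum() ** 2
    return mmd2, mmd2 / torch.sqrt(var.clamp_min(reg))

# Toy training data: P = N(0, I), Q = N(0.5, I) in 2 dimensions (assumption).
x_tr, y_tr = torch.randn(200, 2), torch.randn(200, 2) + 0.5
phi = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 32))  # feature net
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)

for _ in range(200):                          # maximize the power proxy
    opt.zero_grad()
    _, power = mmd2_and_power_proxy(x_tr, y_tr, lambda a, b: deep_kernel(a, b, phi))
    (-power).backward()
    opt.step()

# Test phase: permutation test on held-out data with the learned kernel fixed.
x_te, y_te = torch.randn(200, 2), torch.randn(200, 2) + 0.5
with torch.no_grad():
    kern = lambda a, b: deep_kernel(a, b, phi)
    stat, _ = mmd2_and_power_proxy(x_te, y_te, kern)
    pooled, null = torch.cat([x_te, y_te]), []
    for _ in range(200):
        idx = torch.randperm(len(pooled))
        s, _ = mmd2_and_power_proxy(pooled[idx[:200]], pooled[idx[200:]], kern)
        null.append(s.item())
    p_value = (1 + sum(s >= stat.item() for s in null)) / (1 + len(null))
print(f"MMD^2 = {stat.item():.4f}, permutation p-value = {p_value:.3f}")
```

Note that the kernel is learned on a training split and the permutation test is run on a disjoint held-out split, so the learned kernel does not bias the null distribution of the test statistic.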
Related papers
- Collaborative non-parametric two-sample testing [55.98760097296213]
The goal is to identify graph nodes $v$ at which the null hypothesis $p_v = q_v$ should be rejected.
We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure.
Our methodology integrates elements from f-divergence estimation, kernel methods, and multitask learning.
arXiv Detail & Related papers (2024-02-08T14:43:56Z)
- MMD-FUSE: Learning and Combining Kernels for Two-Sample Testing Without Data Splitting [28.59390881834003]
We propose novel statistics which maximise the power of a two-sample test based on the Maximum Mean Discrepancy (MMD).
We show how these kernels can be chosen in a data-dependent but permutation-independent way, in a well-calibrated test, avoiding data splitting.
We highlight the applicability of our MMD-FUSE test on both synthetic low-dimensional and real-world high-dimensional data, and compare its performance in terms of power against current state-of-the-art kernel tests.
arXiv Detail & Related papers (2023-06-14T23:13:03Z)
- Boosting the Power of Kernel Two-Sample Tests [4.07125466598411]
A kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces.
We propose a method to boost the power of the kernel test by combining MMD estimates over multiple kernels using their Mahalanobis distance.
arXiv Detail & Related papers (2023-02-21T14:14:30Z)
- Variational Autoencoder Kernel Interpretation and Selection for Classification [59.30734371401315]
This work proposes kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable is sampled from the distribution associated with a single kernel of the encoder's last convolutional layer, since an individual distribution is created for each kernel.
Choosing relevant features on the sampled latent variables then makes it possible to perform kernel selection, filtering out uninformative features and kernels.
arXiv Detail & Related papers (2022-09-10T17:22:53Z)
- MMD Aggregated Two-Sample Test [31.116276769013204]
We propose two novel non-parametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD).
First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to determine the test threshold (a small wild-bootstrap sketch appears after this list).
We prove that this test controls the level non-asymptotically, and achieves the minimax rate over Sobolev balls, up to an iterated logarithmic term.
arXiv Detail & Related papers (2021-10-28T12:47:49Z)
- Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z)
- Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data [21.596650236820377]
We introduce the problem of meta two-sample testing (M2ST).
M2ST aims to exploit (abundant) auxiliary data on related tasks to find an algorithm that can quickly identify a powerful test on new target tasks.
We provide both theoretical justification and empirical evidence that our proposed meta-testing schemes outperform learning kernel-based tests directly from scarce observations.
arXiv Detail & Related papers (2021-06-14T17:52:50Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than in other baseline feature map constructions that achieve comparable error bounds, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- An Optimal Witness Function for Two-Sample Testing [13.159512679346685]
We propose data-dependent test statistics based on a one-dimensional witness function, which we call witness two-sample tests (WiTS).
We show that the WiTS test based on a characteristic kernel is consistent against any fixed alternative.
arXiv Detail & Related papers (2021-02-10T17:13:21Z)
- Isolation Distributional Kernel: A New Tool for Point & Group Anomaly Detection [76.1522587605852]
Isolation Distributional Kernel (IDK) is a new way to measure the similarity between two distributions.
We demonstrate IDK's efficacy and efficiency as a new tool for kernel-based anomaly detection for both point and group anomalies.
arXiv Detail & Related papers (2020-09-24T12:25:43Z)
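As referenced in the MMD Aggregated Two-Sample Test entry above, the threshold of a fixed-kernel MMD test can be obtained with a wild bootstrap. The following is a minimal NumPy sketch of that generic procedure, not the MMDAgg implementation (which additionally aggregates over a collection of kernels with adjusted levels); the Gaussian bandwidth, sample sizes, and toy data are assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Gaussian kernel matrix between rows of a and rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_wild_bootstrap(x, y, kernel, n_boot=500, seed=0):
    # Fixed-kernel MMD test (equal sample sizes) whose threshold comes from a
    # wild bootstrap: the off-diagonal H matrix is reweighted by Rademacher signs.
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    kxx, kyy, kxy = kernel(x, x), kernel(y, y), kernel(x, y)
    h = kxx + kyy - kxy - kxy.T
    np.fill_diagonal(h, 0.0)                      # unbiased estimate drops i = j
    stat = h.sum() / (n * (n - 1))
    null = np.empty(n_boot)
    for b in range(n_boot):
        eps = rng.choice([-1.0, 1.0], size=n)     # Rademacher weights
        null[b] = (eps @ h @ eps) / (n * (n - 1))
    p_value = (1 + (null >= stat).sum()) / (1 + n_boot)
    return stat, p_value

# Toy example: same mean, different scale (assumption).
x = np.random.default_rng(1).normal(size=(150, 5))
y = np.random.default_rng(2).normal(size=(150, 5)) * 1.5
stat, p = mmd2_wild_bootstrap(x, y, lambda a, b: gaussian_kernel(a, b, sigma=2.0))
print(f"MMD^2 = {stat:.4f}, wild-bootstrap p-value = {p:.3f}")
```

Under the null hypothesis the Rademacher-reweighted statistics mimic the null distribution of the MMD estimate, so the empirical quantile of the bootstrap sample serves as the test threshold.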