MMD-FUSE: Learning and Combining Kernels for Two-Sample Testing Without
Data Splitting
- URL: http://arxiv.org/abs/2306.08777v2
- Date: Sat, 28 Oct 2023 11:53:33 GMT
- Authors: Felix Biggs, Antonin Schrab, Arthur Gretton
- Score: 28.59390881834003
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose novel statistics which maximise the power of a two-sample test
based on the Maximum Mean Discrepancy (MMD), by adapting over the set of
kernels used in defining it. For finite sets, this reduces to combining
(normalised) MMD values under each of these kernels via a weighted soft
maximum. Exponential concentration bounds are proved for our proposed
statistics under the null and alternative. We further show how these kernels
can be chosen in a data-dependent but permutation-independent way, in a
well-calibrated test, avoiding data splitting. This technique applies more
broadly to general permutation-based MMD testing, and includes the use of deep
kernels with features learnt using unsupervised models such as auto-encoders.
We highlight the applicability of our MMD-FUSE test on both synthetic
low-dimensional and real-world high-dimensional data, and compare its
performance in terms of power against current state-of-the-art kernel tests.
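To make the construction concrete, here is a minimal NumPy sketch of the fused statistic and its permutation test. It is not the authors' implementation: the Gaussian bandwidth grid, uniform weights, and the normalisation inside the soft maximum are stand-in assumptions, but the bandwidths are computed from the pooled sample, so the kernel choice is data-dependent yet permutation-independent, as the abstract describes.

```python
import numpy as np
from scipy.special import logsumexp

def gaussian_gram(A, B, bw):
    """Gram matrix of the Gaussian kernel exp(-||a - b||^2 / (2 bw^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bw ** 2))

def pooled_bandwidths(X, Y, num=5):
    """Bandwidth grid from the pooled sample: data-dependent, but invariant
    to relabelling the pooled points, so it can be fixed before permuting."""
    Z = np.vstack([X, Y])
    dists = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1))
    med = np.median(dists[dists > 0])
    return med * np.logspace(-1, 1, num)

def fuse_statistic(X, Y, bandwidths):
    """Soft maximum (log-sum-exp) of normalised MMD^2 estimates with uniform
    weights; the normalisation below is a simple stand-in, not the paper's."""
    n, m, stats = len(X), len(Y), []
    for bw in bandwidths:
        Kxx, Kyy = gaussian_gram(X, X, bw), gaussian_gram(Y, Y, bw)
        Kxy = gaussian_gram(X, Y, bw)
        np.fill_diagonal(Kxx, 0.0)
        np.fill_diagonal(Kyy, 0.0)
        mmd2 = (Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1))
                - 2 * Kxy.mean())
        stats.append(n * mmd2 / np.sqrt((Kxy ** 2).mean()))
    return logsumexp(stats) - np.log(len(stats))

def mmd_fuse_test(X, Y, alpha=0.05, num_perms=500, seed=0):
    """Permutation test: the bandwidths are chosen once from the pooled data,
    so recomputing the statistic under permutations yields a calibrated test."""
    rng = np.random.default_rng(seed)
    bws = pooled_bandwidths(X, Y)
    Z, n = np.vstack([X, Y]), len(X)
    obs = fuse_statistic(X, Y, bws)
    null = [fuse_statistic(Z[p[:n]], Z[p[n:]], bws)
            for p in (rng.permutation(len(Z)) for _ in range(num_perms))]
    return obs > np.quantile(np.append(null, obs), 1 - alpha)
```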
Related papers
- Robust Kernel Hypothesis Testing under Data Corruption [6.430258446597413] (2024-05-30)
We propose two general methods for constructing robust permutation tests under data corruption.
We prove their consistency in power under minimal conditions.
This contributes to the practical deployment of hypothesis tests for real-world applications with potential adversarial attacks.
- Boosting the Power of Kernel Two-Sample Tests [4.07125466598411] (2023-02-21)
A kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces.
We propose a method to boost the power of the kernel test by combining MMD estimates over multiple kernels using their Mahalanobis distance.
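A hedged sketch of that combination, assuming the null covariance of the per-kernel MMD vector is estimated from permutations (the paper's construction differs in its details):

```python
import numpy as np

def unbiased_mmd2(X, Y, kernel):
    """Unbiased estimate of MMD^2 for one kernel."""
    Kxx, Kyy, Kxy = kernel(X, X), kernel(Y, Y), kernel(X, Y)
    np.fill_diagonal(Kxx, 0.0)
    np.fill_diagonal(Kyy, 0.0)
    n, m = len(X), len(Y)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2 * Kxy.mean()

def mahalanobis_mmd_test(X, Y, kernels, alpha=0.05, num_perms=300, seed=0):
    """Whiten the vector of per-kernel MMD estimates with a permutation
    estimate of its null covariance, then compare Mahalanobis norms."""
    rng = np.random.default_rng(seed)
    mmd_vec = lambda A, B: np.array([unbiased_mmd2(A, B, k) for k in kernels])
    Z, n = np.vstack([X, Y]), len(X)
    perm_vecs = np.array([mmd_vec(Z[p[:n]], Z[p[n:]])
                          for p in (rng.permutation(len(Z))
                                    for _ in range(num_perms))])
    cov = np.cov(perm_vecs, rowvar=False) + 1e-10 * np.eye(len(kernels))
    maha = lambda v: v @ np.linalg.solve(cov, v)
    obs, null = maha(mmd_vec(X, Y)), np.array([maha(v) for v in perm_vecs])
    return obs > np.quantile(np.append(null, obs), 1 - alpha)

# Example with a grid of Gaussian kernels (bandwidths are illustrative):
kernels = [lambda A, B, s=s: np.exp(-((A[:, None] - B[None, :]) ** 2)
                                    .sum(-1) / (2 * s ** 2))
           for s in (0.5, 1.0, 2.0)]
```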
- Spectral Regularized Kernel Two-Sample Tests [7.915420897195129] (2022-12-19)
We show that the popular MMD (maximum mean discrepancy) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance.
We propose a modification of the MMD test based on spectral regularization and prove that it is minimax optimal, with a smaller separation boundary than that achieved by the MMD test.
Our results also hold for the permutation variant of the test, in which the test threshold is chosen via permutation of the samples.
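One way to see the role of the regularization is a Tikhonov variant, shown below; this is an illustrative finite-sample form, not the paper's minimax-optimal estimator. The plain MMD weights all eigendirections of the covariance operator equally, while the regularized statistic rescales each by the inverse of its eigenvalue plus lambda.

```python
import numpy as np

def regularised_mmd2(X, Y, kernel, lam=1e-2):
    """Tikhonov-regularised MMD^2: a^T (K/N + lam I)^{-1} K a, which equals
    ||Sigma_lam^{-1/2} (mu_P - mu_Q)||^2 with Sigma the (uncentred) pooled
    covariance operator; a^T K a alone is the plain (biased) MMD^2."""
    n, m = len(X), len(Y)
    Z = np.vstack([X, Y])
    K = kernel(Z, Z)                       # pooled Gram matrix
    a = np.concatenate([np.full(n, 1 / n), np.full(m, -1 / m)])
    N = n + m
    return a @ np.linalg.solve(K / N + lam * np.eye(N), K @ a)

# The threshold is again obtained by permutation, as in the other sketches.
gauss = lambda A, B: np.exp(-((A[:, None] - B[None, :]) ** 2).sum(-1) / 2)
rng = np.random.default_rng(0)
stat = regularised_mmd2(rng.normal(size=(50, 2)),
                        rng.normal(1.0, 1.0, size=(50, 2)), gauss)
```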
- FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586] (2022-10-10)
This work offers an efficient solution to temporal point process inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimulus-induced patterns in brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach estimates pattern latency more accurately than the state of the art.
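The underlying discretization idea can be sketched as follows; the grid step, kernel parametrization, and example values are illustrative assumptions, not FaDIn's exact objective. Events are binned on a regular grid, and the excitation term of the Hawkes intensity becomes a discrete convolution with a finite-support kernel.

```python
import numpy as np

def discretised_intensity(event_times, T, delta, mu, kernel_vals):
    """Hawkes intensity on a grid of step delta:
    lambda[t] = mu + (event_counts * phi)[t], with phi of finite support."""
    num_bins = int(np.ceil(T / delta))
    counts = np.bincount((np.asarray(event_times) / delta).astype(int),
                         minlength=num_bins)[:num_bins]
    excitation = np.convolve(counts, kernel_vals)[:num_bins]
    return mu + excitation

# Example: truncated-exponential kernel phi(t) = 0.8 exp(-2 t) on (0, 1]
delta, support = 0.01, 1.0
grid = np.arange(delta, support + delta, delta)
phi = 0.8 * np.exp(-2.0 * grid) * delta      # kernel mass per bin
lam = discretised_intensity([0.3, 0.7, 2.1], T=5.0, delta=delta,
                            mu=0.5, kernel_vals=phi)
```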
- Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254] (2022-09-26)
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P.
In this article, we derive new sufficient and necessary conditions to ensure (i) and (ii).
For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
- MMD Aggregated Two-Sample Test [31.116276769013204] (2021-10-28)
We propose two novel non-parametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD).
First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to determine the test threshold.
We prove that this test controls the level non-asymptotically, and achieves the minimax rate over Sobolev balls, up to an iterated logarithmic term.
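For intuition, a minimal sketch of the wild-bootstrap threshold for a fixed kernel with equal sample sizes (the permutation variant works analogously; the kernel and constants are illustrative):

```python
import numpy as np

def mmd_wild_bootstrap_test(X, Y, kernel, alpha=0.05, num_boot=500, seed=0):
    """Fixed-kernel MMD test thresholded with a Rademacher wild bootstrap."""
    n = len(X)
    assert len(Y) == n, "this sketch assumes equal sample sizes"
    H = kernel(X, X) + kernel(Y, Y) - kernel(X, Y) - kernel(Y, X)
    np.fill_diagonal(H, 0.0)
    obs = H.sum() / (n * (n - 1))            # unbiased MMD^2 estimate
    rng = np.random.default_rng(seed)
    boots = [eps @ H @ eps / (n * (n - 1))
             for eps in (rng.choice([-1.0, 1.0], size=n)
                         for _ in range(num_boot))]
    return obs > np.quantile(np.append(boots, obs), 1 - alpha)

gauss = lambda A, B: np.exp(-((A[:, None] - B[None, :]) ** 2).sum(-1) / 2)
rng = np.random.default_rng(1)
reject = mmd_wild_bootstrap_test(rng.normal(size=(100, 2)),
                                 rng.normal(0.5, 1.0, size=(100, 2)), gauss)
```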
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257] (2021-06-18)
Kernel mean embeddings represent probability measures as infinite-dimensional mean elements of a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
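A kernel sum-of-squares density can be written down in a few lines; the landmarks, bandwidth, and lack of normalisation below are illustrative assumptions. With feature vector k(x) = (k(x, z_1), ..., k(x, z_d)) and a positive semi-definite matrix A, the value k(x)^T A k(x) is non-negative by construction:

```python
import numpy as np

def sos_density_unnormalised(x, landmarks, A, bw=1.0):
    """Evaluate k(x)^T A k(x); non-negative whenever A is PSD."""
    kx = np.exp(-((landmarks - x) ** 2).sum(-1) / (2 * bw ** 2))
    return kx @ A @ kx

rng = np.random.default_rng(0)
landmarks = rng.normal(size=(10, 2))
B = rng.normal(size=(10, 10))
A = B @ B.T                                  # any PSD matrix gives p(x) >= 0
value = sos_density_unnormalised(np.zeros(2), landmarks, A)
```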
- Maximum Mean Discrepancy Test is Aware of Adversarial Attacks [122.51040127438324] (2020-10-22)
The maximum mean discrepancy (MMD) test could in principle detect any distributional discrepancy between two datasets.
Previous work has reported that the MMD test is unaware of adversarial attacks, failing to detect the discrepancy between natural and adversarial data; this paper argues that such failures stem from how the test was applied rather than from the MMD itself.
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161] (2020-09-27)
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
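That observation is easy to check directly: each MCMC draw contributes an indicator matrix that is a sum of outer products, so the average over draws is positive semi-definite and can be used wherever a kernel matrix is expected. A small sketch with illustrative variable names:

```python
import numpy as np

def posterior_similarity_matrix(assignments):
    """PSM[i, j] = fraction of MCMC draws in which items i and j share a
    cluster; assignments is a (num_draws, num_items) array of labels."""
    draws, n = assignments.shape
    psm = np.zeros((n, n))
    for labels in assignments:
        psm += (labels[:, None] == labels[None, :])
    return psm / draws

draws = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]])
psm = posterior_similarity_matrix(draws)
assert np.all(np.linalg.eigvalsh(psm) >= -1e-12)  # PSD, hence a valid kernel
```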
- Learning Kernel Tests Without Data Splitting [18.603394415852765] (2020-06-03)
We propose an approach that enables learning the hyperparameters of a kernel test and testing on the full sample, without data splitting.
Empirically, the test power of our approach is larger than that of the data-splitting approach, regardless of the split proportion.
- Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821] (2020-02-21)
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
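A hedged sketch of that training objective; the feature network, kernel form, and constants are illustrative assumptions rather than the paper's exact architecture. The criterion divides the MMD estimate by an estimate of its standard deviation under the alternative, a standard proxy for asymptotic test power:

```python
import torch

def deep_kernel(phi, A, B, bw=1.0):
    """Gaussian kernel on learned features phi(.)."""
    sq = torch.cdist(phi(A), phi(B)) ** 2
    return torch.exp(-sq / (2 * bw ** 2))

def power_criterion(phi, X, Y, eps=1e-8):
    """MMD^2 divided by an estimate of its standard deviation under H1
    (U-statistic variance estimate); larger values mean higher power."""
    n = X.shape[0]
    H = (deep_kernel(phi, X, X) + deep_kernel(phi, Y, Y)
         - deep_kernel(phi, X, Y) - deep_kernel(phi, Y, X))
    H = H - torch.diag(torch.diagonal(H))
    mmd2 = H.sum() / (n * (n - 1))
    row = H.sum(dim=1)
    var = 4 * (row @ row) / n**3 - 4 * H.sum()**2 / n**4
    return mmd2 / torch.sqrt(var.clamp_min(eps))

phi = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 32))
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
X, Y = torch.randn(128, 2), torch.randn(128, 2) + 0.5
for _ in range(100):                         # gradient ascent on the proxy
    opt.zero_grad()
    (-power_criterion(phi, X, Y)).backward()
    opt.step()
# The trained kernel must then be calibrated on held-out data: the paper
# splits the sample, which is the step MMD-FUSE above is designed to avoid.
```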