Targeted Separation and Convergence with Kernel Discrepancies
- URL: http://arxiv.org/abs/2209.12835v4
- Date: Tue, 22 Oct 2024 12:38:35 GMT
- Title: Targeted Separation and Convergence with Kernel Discrepancies
- Authors: Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey
- Abstract summary: Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P.
In this article we derive new sufficient and necessary conditions to ensure (i) and (ii).
For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
- Score: 61.973643031360254
- License:
- Abstract: Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each setting, these kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or even (ii) control weak convergence to P. In this article we derive new sufficient and necessary conditions to ensure (i) and (ii). For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels and for controlling convergence with bounded kernels. We use these results on $\mathbb{R}^d$ to substantially broaden the known conditions for KSD separation and convergence control and to develop the first KSDs known to exactly metrize weak convergence to P. Along the way, we highlight the implications of our results for hypothesis testing, measuring and improving sample quality, and sampling with Stein variational gradient descent.
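To make the two discrepancies in the abstract concrete, the sketch below computes an unbiased MMD estimate between two samples and a kernel Stein discrepancy (KSD) estimate against a known target score. It is a minimal NumPy illustration, not the paper's code: the Gaussian and inverse-multiquadric kernels and their parameters are illustrative defaults, not the kernels whose separation and convergence-control conditions the paper characterizes.

```python
# Minimal NumPy sketch (illustrative, not the paper's code) of the two
# discrepancies discussed in the abstract: an unbiased MMD estimate between
# two samples and a V-statistic KSD estimate against a known target score.
# Kernel choices (Gaussian for MMD, inverse multiquadric for the KSD) and
# their parameters are assumptions made for illustration only.
import numpy as np


def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased estimate of MMD^2(P, Q) from samples X ~ P, Y ~ Q using a
    Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    def gram(A, B):
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-sq / (2 * bandwidth**2))

    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    n, m = len(X), len(Y)
    # Drop diagonal terms so the within-sample averages are unbiased.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2 * Kxy.mean()


def ksd2_imq(X, score, c=1.0, beta=0.5):
    """V-statistic estimate of KSD^2 for a target with score s(x) = grad log p(x),
    built from the Langevin Stein kernel with an inverse-multiquadric base
    kernel k(x, y) = (c^2 + ||x - y||^2)^(-beta)."""
    n, d = X.shape
    S = score(X)                          # (n, d) array of scores s(x_i)
    diff = X[:, None, :] - X[None, :, :]  # pairwise differences x_i - x_j
    r2 = np.sum(diff**2, axis=-1)         # squared pairwise distances
    u = c**2 + r2
    k = u**(-beta)
    grad_x_k = -2 * beta * diff * u[..., None]**(-beta - 1)   # d/dx k(x, y)
    div_xy_k = (2 * beta * d * u**(-beta - 1)
                - 4 * beta * (beta + 1) * r2 * u**(-beta - 2))  # trace of d2k/dxdy
    # Langevin Stein kernel k_p(x_i, x_j); note d/dy k(x, y) = -d/dx k(x, y) here.
    kp = (div_xy_k
          + np.einsum('ijd,jd->ij', grad_x_k, S)   # s(x_j) . d/dx k
          - np.einsum('ijd,id->ij', grad_x_k, S)   # s(x_i) . d/dy k
          + k * (S @ S.T))                          # k(x_i, x_j) s(x_i).s(x_j)
    return kp.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))            # sample from the target N(0, I)
    Y = rng.normal(loc=0.5, size=(500, 2))   # shifted sample
    print("MMD^2(X, Y) ~", mmd2_unbiased(X, Y))
    print("KSD^2 of X  ~", ksd2_imq(X, score=lambda Z: -Z))  # score of N(0, I)
    print("KSD^2 of Y  ~", ksd2_imq(Y, score=lambda Z: -Z))
```

Note that the KSD estimator only needs the target's score function grad log p, which is why KSDs are convenient when P is known only up to normalization.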
Related papers
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variants of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z)
- Controlling Moments with Kernel Stein Discrepancies [74.82363458321939]
Kernel Stein discrepancies (KSDs) measure the quality of a distributional approximation.
We first show that standard KSDs used for weak convergence control fail to control moment convergence.
We then provide sufficient conditions under which alternative diffusion KSDs control both moment and weak convergence.
arXiv Detail & Related papers (2022-11-10T08:24:52Z)
- Variance-Aware Estimation of Kernel Mean Embedding [8.277998582564784]
We show how to speed up convergence by leveraging variance information in the reproducing kernel Hilbert space.
We show that even when such information is a priori unknown, we can efficiently estimate it from the data.
arXiv Detail & Related papers (2022-10-13T01:58:06Z)
- Kernel Two-Sample Tests in High Dimension: Interplay Between Moment Discrepancy and Dimension-and-Sample Orders [1.9303929635966661]
We study the behavior of kernel two-sample tests when the dimension and sample sizes both diverge to infinity.
We establish the central limit theorem (CLT) under both the null hypothesis and the local and fixed alternatives.
The new non-null CLT results allow us to perform exact power analysis, which reveals a delicate interplay between the moment discrepancy that can be detected and the dimension-and-sample orders.
arXiv Detail & Related papers (2021-12-31T23:12:44Z)
- Cycle Consistent Probability Divergences Across Different Spaces [38.43511529063335]
Discrepancy measures between probability distributions are at the core of statistical inference and machine learning.
This work proposes a novel unbalanced Monge optimal transport formulation for matching, up to isometries, distributions on different spaces.
arXiv Detail & Related papers (2021-11-22T16:35:58Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize.
From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence that benefits classification.
We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
arXiv Detail & Related papers (2021-01-25T09:46:14Z)
- Generalized Sliced Distances for Probability Distributions [47.543990188697734]
We introduce a broad family of probability metrics, coined Generalized Sliced Probability Metrics (GSPMs).
GSPMs are rooted in the generalized Radon transform and come with a unique geometric interpretation.
We consider GSPM-based gradient flows for generative modeling applications and show that under mild assumptions, the gradient flow converges to the global optimum.
arXiv Detail & Related papers (2020-02-28T04:18:00Z)
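For context on the sliced construction above, the sketch below estimates the classical sliced Wasserstein distance, built from linear projections and the one-dimensional Wasserstein distance between the slices; the GSPM construction generalizes this recipe by replacing linear projections with a generalized Radon transform. This is a hedged NumPy sketch, not the GSPM paper's implementation; the projection count and the order p are illustrative choices.

```python
# Minimal NumPy sketch (illustrative, not the GSPM paper's code) of the
# classical sliced Wasserstein distance with linear projections, the
# construction that GSPMs generalize via the generalized Radon transform.
import numpy as np


def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of the sliced p-Wasserstein distance between the
    empirical distributions of X and Y (equal sample sizes assumed)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each direction (the "slices").
    projX = X @ theta.T   # shape (n, n_projections)
    projY = Y @ theta.T
    # 1D p-Wasserstein distance between equal-size samples: sort and compare.
    projX.sort(axis=0)
    projY.sort(axis=0)
    return np.mean(np.abs(projX - projY) ** p) ** (1.0 / p)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 3))
    Y = rng.normal(loc=1.0, size=(1000, 3))
    print("sliced W_2 ~", sliced_wasserstein(X, Y, seed=0))
```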