Universal consistency of Wasserstein $k$-NN classifier: Negative and Positive Results
- URL: http://arxiv.org/abs/2009.04651v4
- Date: Sun, 26 Jun 2022 18:02:46 GMT
- Title: Universal consistency of Wasserstein $k$-NN classifier: Negative and Positive Results
- Authors: Donlapark Ponnoprat
- Abstract summary: The Wasserstein distance provides a notion of dissimilarity between probability measures.
We show that the $k$-NN classifier is not universally consistent on the space of measures supported in $(0,1)$.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Wasserstein distance provides a notion of dissimilarity between
probability measures, which has recently found applications in learning from
structured data of varying size, such as images and text documents. In this
work, we
study the $k$-nearest neighbor classifier ($k$-NN) of probability measures
under the Wasserstein distance. We show that the $k$-NN classifier is not
universally consistent on the space of measures supported in $(0,1)$. As any
Euclidean ball contains a copy of $(0,1)$, one should not expect to obtain
universal consistency without some restriction on the base metric space, or the
Wasserstein space itself. To this end, via the notion of $\sigma$-finite metric
dimension, we show that the $k$-NN classifier is universally consistent on
spaces of measures supported in a $\sigma$-uniformly discrete set. In addition,
by studying the geodesic structures of the Wasserstein spaces for $p=1$ and
$p=2$, we show that the $k$-NN classifier is universally consistent on the
space of measures supported on a finite set, the space of Gaussian measures,
and the space of measures with densities expressed as finite wavelet series.
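To make the object of study concrete, here is a minimal sketch of a $k$-NN classifier on empirical measures supported in $(0,1)$ under the Wasserstein-1 distance. The Beta-distributed classes, sample sizes, and the use of `scipy.stats.wasserstein_distance` are illustrative assumptions, not the paper's construction.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def knn_predict(train_samples, train_labels, query, k=3):
    """Classify `query` (an empirical sample in (0,1)) by majority vote
    among the k training measures closest in Wasserstein-1 distance."""
    dists = np.array([wasserstein_distance(query, s) for s in train_samples])
    nearest = np.argsort(dists)[:k]
    votes = np.array(train_labels)[nearest]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]

rng = np.random.default_rng(0)
# Class 0: Beta(2, 5) samples; class 1: Beta(5, 2) samples -- both supported in (0, 1).
train = [rng.beta(2, 5, 200) for _ in range(10)] + [rng.beta(5, 2, 200) for _ in range(10)]
labels = [0] * 10 + [1] * 10
query = rng.beta(5, 2, 200)
print(knn_predict(train, labels, query, k=3))  # classifies the query measure
```

On well-separated classes like these the vote is essentially unanimous; the paper's negative result concerns worst-case distributions on $(0,1)$, not benign examples of this kind.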
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Testing Identity of Distributions under Kolmogorov Distance in Polylogarithmic Space [1.2277343096128712]
In this paper, we show that much less space suffices -- we give an algorithm that uses space $O(\log^4 \varepsilon^{-1})$ in the streaming setting.
We also state 9 related open problems that we hope will spark interest in this and related problems.
arXiv Detail & Related papers (2024-10-29T15:24:27Z)
- Rényi divergence-based uniformity guarantees for $k$-universal hash functions [59.90381090395222]
Universal hash functions map the output of a source to random strings over a finite alphabet.
We show that it is possible to distill random bits that are nearly uniform, as measured by min-entropy.
arXiv Detail & Related papers (2024-10-21T19:37:35Z)
- Wasserstein Gradient Flows of MMD Functionals with Distance Kernel and Cauchy Problems on Quantile Functions [2.3301643766310374]
We describe Wasserstein gradient flows of maximum mean discrepancy functionals $\mathcal{F}_\nu := \mathrm{MMD}_K^2(\cdot, \nu)$ towards given target measures $\nu$ on the real line.
For certain $mathcal F_nu$-flows this implies that initial point measures instantly become absolutely continuous, and stay so over time.
arXiv Detail & Related papers (2024-08-14T12:28:21Z)
- Fourier Sliced-Wasserstein Embedding for Multisets and Measures [3.396731589928944]
We present a novel method to embed multisets and measures over $\mathbb{R}^d$ into Euclidean space.
We show that our method yields superior representations of input multisets and offers practical advantage for learning on multiset data.
arXiv Detail & Related papers (2024-05-26T11:04:41Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning [25.25903127886586]
We build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$.
In particular, we show that the required number of Dirac measures is determined by the structure of $\mathcal{X}$ and $\mathcal{Y}$.
arXiv Detail & Related papers (2023-04-24T16:18:22Z)
- Parameterized Approximation Schemes for Clustering with General Norm Objectives [0.6956677722498498]
This paper considers the well-studied regime of designing a $(k,\epsilon)$-approximation algorithm for a $k$-clustering problem.
Our main contribution is a clean and simple EPAS that settles more than ten clustering problems.
arXiv Detail & Related papers (2023-04-06T15:31:37Z)
- A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime.
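The existence statement above is a Carathéodory/Tchakaloff-type result: $n$ linear moment constraints plus total mass admit a feasible measure on at most $n+1$ atoms. A hedged sketch of one way to realize it, via an LP vertex solution with scipy (not the paper's greedy geometric sampling; the test functions and atom set below are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def reduce_support(atoms, weights, functions):
    """Return new weights, nonzero on at most n+1 atoms, preserving the
    integrals of the n given functions and the total mass of 1."""
    F = np.array([[f(x) for x in atoms] for f in functions])  # n x N moment matrix
    A_eq = np.vstack([F, np.ones(len(atoms))])                # add mass constraint
    b_eq = np.append(F @ weights, 1.0)
    # A simplex solver returns a basic feasible solution, which has at most
    # rank(A_eq) = n + 1 nonzero entries.
    res = linprog(c=np.zeros(len(atoms)), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs-ds")
    return res.x

rng = np.random.default_rng(1)
atoms = rng.uniform(0, 1, 50)                # N = 50 original atoms
weights = np.full(50, 1 / 50)                # uniform discrete measure
fns = [lambda x: x, lambda x: x**2]          # n = 2 moments to preserve
w = reduce_support(atoms, weights, fns)
print(np.count_nonzero(w > 1e-12))           # at most n + 1 = 3 atoms survive
```

This realizes the existence result directly; the paper's contribution is a faster randomized procedure for the same reduction in the $N \gg n$ regime.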
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
- A diffusion approach to Stein's method on Riemannian manifolds [65.36007959755302]
We exploit the relationship between the generator of a diffusion on $\mathbf{M}$ with target invariant measure and its characterising Stein operator.
We derive Stein factors, which bound the solution to the Stein equation and its derivatives.
This implies that the bounds for $\mathbb{R}^m$ remain valid when $\mathbf{M}$ is a flat manifold.
arXiv Detail & Related papers (2020-03-25T17:03:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.