Related papers: A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions

A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions

URL: http://arxiv.org/abs/2405.03664v2
Date: Mon, 3 Jun 2024 02:50:35 GMT
Title: A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions
Authors: Sharath Raghvendra, Pouyan Shirzadian, Kaiyi Zhang,
Abstract summary: We introduce a new family of distances parameterized by $k ge 0$, called $k$-RPW that is based on computing the partial $2$-Wasserstein distance. Our distance function achieves higher accuracy in comparison to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.
Score: 8.176521989807748
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The $2$-Wasserstein distance is sensitive to minor geometric differences between distributions, making it a very powerful dissimilarity metric. However, due to this sensitivity, a small outlier mass can also cause a significant increase in the $2$-Wasserstein distance between two similar distributions. Similarly, sampling discrepancy can cause the empirical $2$-Wasserstein distance on $n$ samples in $\mathbb{R}^2$ to converge to the true distance at a rate of $n^{-1/4}$, which is significantly slower than the rate of $n^{-1/2}$ for $1$-Wasserstein distance. We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW that is based on computing the partial $2$-Wasserstein distance. We show that (1) $k$-RPW satisfies the metric properties, (2) $k$-RPW is robust to small outlier mass while retaining the sensitivity of $2$-Wasserstein distance to minor geometric differences, and (3) when $k$ is a constant, $k$-RPW distance between empirical distributions on $n$ samples in $\mathbb{R}^2$ converges to the true distance at a rate of $n^{-1/3}$, which is faster than the convergence rate of $n^{-1/4}$ for the $2$-Wasserstein distance. Using the partial $p$-Wasserstein distance, we extend our distance to any $p \in [1,\infty]$. By setting parameters $k$ or $p$ appropriately, we can reduce our distance to the total variation, $p$-Wasserstein, and the L\'evy-Prokhorov distances. Experiments show that our distance function achieves higher accuracy in comparison to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.

Related papers

Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior [28.71736321665378]
We study the nonparametric maximum likelihood estimator $widehatpi$ for Gaussian location mixtures in one dimension. We provide an algorithm which for small enough $varepsilon>0$ computes an $varepsilon$-approximation of $widehatpi$ in Wasserstein distance in time. We also show the distribution of $widehatpi$ conditioned to be $k$-atomic admits a density on the associated $2k-1$ dimensional parameter space.
arXiv Detail & Related papers (2025-03-26T03:36:36Z)
Mean and Variance Estimation Complexity in Arbitrary Distributions via Wasserstein Minimization [0.0]
This paper focuses on the complexity of estimating translation translation $boldsymbolmu in mathbbRl$ and shrinkage $sigma in mathbbR_++$ parameters. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $varepsilon$-approxs for arbitrary $varepsilon > 0$ within $textpoly left( frac1varepsilon )$ time using the
arXiv Detail & Related papers (2025-01-17T13:07:52Z)
Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality. We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$) We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation [56.403488578703865]
We show that by augmenting the $ell_p$ sensitivities by $ell$ sensitivities, we obtain better bounds on optimal linear $tilde O(varepsilon-2mu2 d)$ sampling complexity. We also obtain an $tilde O(varepsilon-2mu2 d)$ sensitivity sampling bound for logistic regression.
arXiv Detail & Related papers (2024-06-01T07:03:40Z)
$\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits. For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z)
Beyond Moments: Robustly Learning Affine Transformations with Asymptotically Optimal Error [8.615625517708324]
We present a-time algorithm for learning an unknown affine transformation of the standard hypercube from samples. Our algorithm is based on a new method that iteratively improves an estimate of the unknown affine transformation whenever the requirements of the certificate are not met.
arXiv Detail & Related papers (2023-02-23T19:13:30Z)
The quantum low-rank approximation problem [0.0]
We consider the distance $D(rho,sigma)$ between two normalized quantum states, $rho$ and $sigma$. We analytically solve for the optimal state $sigma$ that minimizes this distance.
arXiv Detail & Related papers (2022-03-02T01:05:01Z)
TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $ell$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d_$. We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
Sampling from Log-Concave Distributions with Infinity-Distance Guarantees and Applications to Differentially Private Optimization [33.38289436686841]
We present an algorithm that outputs a point from a distributionO(varepsilon)$close to $$ in infinity-distance. We also present a "soft-pi" version of the Dikin walk which may be independent interest.
arXiv Detail & Related papers (2021-11-07T13:44:50Z)
How many moments does MMD compare? [7.919213739992465]
mathcalF-1$ acts on smooth functions in the same way as an integral operator associated with $K$. We show that kernels defined by pseudo-differential operators are able to approximate uniformly any continuous Mercer kernel on a compact set.
arXiv Detail & Related papers (2021-06-27T16:44:17Z)
Few-Shot Learning via Learning the Representation, Provably [115.7367053639605]
This paper studies few-shot learning via representation learning. One uses $T$ source tasks with $n_1$ data per task to learn a representation in order to reduce the sample complexity of a target task.
arXiv Detail & Related papers (2020-02-21T17:30:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.